diff --git a/docs/_scenarios/section2.md b/docs/_scenarios/section2.md
index 8d71b26..c9b131b 100644
--- a/docs/_scenarios/section2.md
+++ b/docs/_scenarios/section2.md
@@ -121,10 +121,10 @@ Here are some short descriptions of each stage:
* Required inputs: Template Raster File, Field Data, Covariate Data.
* Outputs: Updated Field Data and Site Data.
-4. Background Data Generation:
- * Generates background sites (pseudoabsences), extracts covariate data for the pseudoabsence sites and updates the Field Data and Site Data to include pseudoabsence data.
- * Required inputs: Template Raster File, Field Data, Covariate Data, Site Data.
- * Outputs: Updated Field Data and updated Site Data.
+4. Background Data Generation:
+ * Generates background sites (pseudoabsences), extracts covariate data for the pseudoabsence sites and updates the Field Data and Site Data to include pseudoabsence data.
+ * Required inputs: Template Raster File, Field Data, Covariate Data, Site Data.
+ * Outputs: Updated Field Data and updated Site Data.
5. Prepare Training/Testing Data:
* Divides Field Data into training and testing sets and/or cross validation folds based on the provided Validation Options arguments.
diff --git a/docs/_scenarios/section3.md b/docs/_scenarios/section3.md
index 39765e0..e8fc3a9 100644
--- a/docs/_scenarios/section3.md
+++ b/docs/_scenarios/section3.md
@@ -151,6 +151,8 @@ The **Field Data Options** *Datasheet* can be found under the **Field Data** tab
The **Background Data Options** *Datasheet* controls some of the *Scenario's* settings relating to the **Field Data**.
+> If the *Background Data Generation* stage is re-run (e.g., when running a dependent scenario or re-running an existing scenario), any background sites already present in the **Field Data** are removed and regenerated from the current **Background Data Options**. To keep the existing background data instead of regenerating it, set *Generate background sites* to "No".
+
### **Generate background sites**
Defines whether background sites should be generated for the *Scenario*. Background sites are also referred to as pseudo-absences and represent absences of a species. This information allows the models to compare environmental spaces where a species can and cannot be found. Background sites are often used when true absence data are not available for a species.
diff --git a/docs/_scenarios/section6.md b/docs/_scenarios/section6.md
index 8fa3862..25d77b5 100644
--- a/docs/_scenarios/section6.md
+++ b/docs/_scenarios/section6.md
@@ -115,6 +115,8 @@ The **Probability Map** is the main output of the fitted model and shows probabi
### **MESS Map**
The **MESS Map** is the Multivariate Environmental Similarity Surface, which represents values as positive, negative, or zero. This map shows how well locations on the **Template Raster** fit into the range of covariate data to which the training data were fit. Positive areas on this map represent areas where the covariate ranges are more similar to those to which the training data of the model were fit. Negative areas on this map represent areas where the covariate ranges are not similar to those to which the training data of the model were fit. Values of zero on this map represent areas where ranges of covariate data at these locations and ranges of covariate data to which the training data of the model were fit are marginally similar [(Elith et al., 2010)](https://doi.org/10.1111/j.2041-210X.2010.00036.x).
+> MESS and MoD maps are computed on continuous variables only. If the model includes categorical variables, those variables are excluded from the calculation and a warning is issued. If all model variables are categorical, MESS and MoD maps cannot be generated and will be skipped.
+
### **MoD Map**
The **MoD Map** is a map of the most dissimilar variable. This map is similar to the **MESS Map** in that it shows regions where covariate ranges were most dissimilar from those used to fit the training data. However, this map shows which covariates used in the model was furthest from the range of the observations used for model training and where.
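Reviewer note: the MESS and MoD descriptions above can be made concrete with a small sketch. It follows the per-variable similarity definition in Elith et al. (2010) for continuous covariates only; the function names `variable_similarity` and `mess_and_mod` and the sample data are hypothetical, and this is not WISDM's implementation.

```python
from bisect import bisect_left


def variable_similarity(value, reference):
    """Similarity of one value to the reference sample of one continuous covariate."""
    ref = sorted(reference)
    low, high = ref[0], ref[-1]
    if high == low:
        raise ValueError("reference sample has no range")
    # f: percentage of reference values strictly smaller than the value
    f = 100.0 * bisect_left(ref, value) / len(ref)
    if f == 0:
        return (value - low) / (high - low) * 100.0  # at or below the reference minimum
    if f <= 50:
        return 2.0 * f
    if f < 100:
        return 2.0 * (100.0 - f)
    return (high - value) / (high - low) * 100.0  # above the reference maximum


def mess_and_mod(point, references):
    """Return (MESS value, name of the most dissimilar variable) for one location."""
    scores = {name: variable_similarity(point[name], references[name]) for name in references}
    mod = min(scores, key=scores.get)
    return scores[mod], mod
```

Under these assumptions, a location whose covariate values all fall inside the training ranges gets a positive MESS value, a location outside at least one range gets a negative value, and the variable that produced the minimum is the one a MoD map would report.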
diff --git a/docs/index.md b/docs/index.md
index 2c43665..4bcd84c 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -15,13 +15,13 @@ permalink: /
### WISDM is an open-source SyncroSim package for developing and visualizing species distribution models.
-WISDM was designed to update and replace VisTrails SAHM, a software application originally developed in 2013 by the U.S. Geological Survey (Morisette et al. 2013). The WISDM package streamlines your species distribution modeling (SDM) workflows by preparing data, fitting model ensembles, and visualizing model outputs. WISDM maintains records of the various data inputs, processing steps, and modeling options used during SDM construction, and allows users to customize and run existing SDM methods, such as Generalized Linear Models, Random Forest, and Maxent, without having to interact with different software platforms. WISDM also allows users to easily visualize and compare model scenarios from within the SyncroSim user interface.
+WISDM was designed to update and replace VisTrails SAHM, a software application originally developed in 2013 by the U.S. Geological Survey (Morisette et al. 2013). The WISDM package streamlines your species distribution modeling (SDM) workflows by preparing data, fitting model ensembles, and visualizing model outputs. WISDM maintains records of the various data inputs, processing steps, and modeling options used during SDM construction, and allows users to customize and run existing SDM methods, such as Boosted Regression Trees, Generalized Additive Models, Generalized Linear Models, Maxent, and Random Forest, without having to interact with different software platforms. WISDM also allows users to easily visualize and compare model scenarios from within the SyncroSim user interface.
## Requirements
-This package requires SyncroSim version 2.5.5 or higher.
+This package requires SyncroSim version 3.1.27 or higher.
Java is also required if you choose to run Maxent models within the WISDM package.
diff --git a/src/00-constants.R b/src/00-constants.R
index 0a5f3f8..5680b09 100644
--- a/src/00-constants.R
+++ b/src/00-constants.R
@@ -3,6 +3,11 @@
## ApexRMS, October 2025
## -------------------------
+# Use Cairo backend for PNG rendering on Unix-alike systems (e.g. Linux) to avoid X11 display dependency
+if (.Platform$OS.type == "unix" && capabilities("cairo")) {
+ options(bitmapType = "cairo")
+}
+
# Raster nodata sentinel value — must match nodataValue in setup_functions.py
nodataValue <- -9999
diff --git a/src/00-helper-functions.R b/src/00-helper-functions.R
index b99d635..9a31136 100644
--- a/src/00-helper-functions.R
+++ b/src/00-helper-functions.R
@@ -555,7 +555,7 @@ safe_rbind <- function(df, row) {
# Launch a Shiny app with browser detection ------------------------------------
-launchShinyApp <- function(appPath) {
+launchShinyApp <- function(appPath, appName = "Viewer") {
# Search PATH first — works on all platforms (Linux, macOS, Windows)
chrome.names <- c(
"google-chrome",
@@ -617,8 +617,32 @@ launchShinyApp <- function(appPath) {
}
}
- # Final fallback: let Shiny use the OS default (xdg-open / open / shell.exec)
- if (is.null(browser.path)) {
+ # On headless Unix (Docker or bare Linux server with no DISPLAY), bind to
+ # 0.0.0.0:3838 so the app is reachable via port forwarding from the host.
+ # No browser is launched in that case; with X11 forwarding DISPLAY is set and the usual launch path runs.
+ # Chrome/Firefox require --no-sandbox when running as root.
+ # On non-headless Unix or Windows, behaviour is unchanged.
+ headless_unix <- .Platform$OS.type == "unix" &&
+ nchar(Sys.getenv("DISPLAY")) == 0
+
+ if (headless_unix) {
+ port <- 3838
+ progressBar(
+ type = "message",
+ message = paste0(
+ "ACTION REQUIRED: Open http://localhost:",
+ port,
+ " in your browser to use the Interactive ",
+ appName
+ )
+ )
+ shiny::runApp(
+ appDir = appPath,
+ host = "0.0.0.0",
+ port = port,
+ launch.browser = FALSE
+ )
+ } else if (is.null(browser.path)) {
shiny::runApp(appDir = appPath, launch.browser = TRUE)
} else {
shiny::runApp(appDir = appPath, launch.browser = function(shinyurl) {
diff --git a/src/04-background-data-functions.R b/src/04-background-data-functions.R
index 1e296ef..8e5ba95 100644
--- a/src/04-background-data-functions.R
+++ b/src/04-background-data-functions.R
@@ -26,7 +26,8 @@ backgroundSurfaceGeneration <- function(sp, # species
if ('kde' %in% method) {
if(tolower(method$surface) == "continuous"){
- kde_bg_out <- gsub('/', '\\\\', paste0(outputDir, '/', sp, '_kde_bg_surface.tif'))
+ kde_bg_out <- paste0(outputDir, '/', sp, '_kde_bg_surface.tif')
+ if (.Platform$OS.type == "windows") kde_bg_out <- gsub('/', '\\\\', kde_bg_out)
kde.mat <- matrix(m, nrow = length(unique(ud@coords[, 1])))
t <- terra::rast(raster::raster(ud))
diff --git a/src/08-fit-model-functions.R b/src/08-fit-model-functions.R
index d929547..e71ee46 100644
--- a/src/08-fit-model-functions.R
+++ b/src/08-fit-model-functions.R
@@ -9,7 +9,7 @@ library(ROCR)
library(ggplot2)
library(splines)
-# MODEL FIT FUNCTION -----------------------------------------------------------
+# Model Fit Functions ----------------------------------------------------------
## Fit model -------------------------------------------------------------------
@@ -52,7 +52,7 @@ fitModel <- function(
#================================================================
# GLM
- #=================================================================
+ #================================================================
if (out$modType == "glm") {
if (out$pseudoAbs) {
@@ -190,7 +190,7 @@ fitModel <- function(
#================================================================
# RF
- #=================================================================
+ #================================================================
if (out$modType == "rf") {
# set defaults
n.trees = out$modOptions$NumberOfTrees
@@ -286,294 +286,34 @@ fitModel <- function(
#================================================================
# MAXENT
- #=================================================================
+ #================================================================
if (out$modType == "maxent") {
- # If there are parentheses in the working folder name, then the jar file will not run properly
- validTempPath <- sum(grepl("\\(", strsplit(out$tempDir, "")[[1]])) +
- sum(grepl("\\)", strsplit(out$tempDir, "")[[1]])) ==
- 0
-
- if (!validTempPath) {
- stop(paste0(
- "Maxent model will not run if there are parentheses in the",
- " temporary directory path. Please set a different ",
- "temporary folder before continuing."
- ))
- }
-
if (fullFit) {
- # Prepare batch file ----
- capture.output(
- cat("java -mx", out$modOptions$MemoryLimit, "m", sep = ""),
- file = out$batchPath
- )
-
- # core executable
- cat(
- " -jar",
- paste0(
- '"',
- file.path(
- ssimEnvironment()$PackageDirectory,
- "maxent.jar",
- fsep = "\\"
- ),
- '"'
- ),
- file = out$batchPath,
- append = T
- )
- # optional visible interface
- if (!out$modOptions$VisibleInterface) {
- cat(" -z", file = out$batchPath, append = T)
- }
-
- # input files
- cat(
- paste0(' samplesfile="', out$swdPath, '"'),
- file = out$batchPath,
- append = T
+ runMaxent(
+ out = out,
+ samplesfile = out$swdPath,
+ envlayers = out$backgroundPath,
+ outputdir = file.path(out$tempDir, "Outputs"),
+ testsamplesfile = out$testDataPath,
+ fullFit = TRUE
)
- cat(
- paste0(' environmentallayers="', out$backgroundPath, '"'),
- file = out$batchPath,
- append = T
- )
- # test data (if provided)
- if (!is.null(out$testDataPath)) {
- cat(
- paste0(' testsamplesfile="', out$testDataPath, '"'),
- file = out$batchPath,
- append = T
- )
- }
- # factor (categorical) layers
- if (length(out$factorInputVars) > 0) {
- for (i in out$factorInputVars) {
- cat(paste0(" togglelayertype=", i), file = out$batchPath, append = T)
- }
- }
-
- # output directory
- cat(
- paste0(
- ' outputdirectory="',
- file.path(out$tempDir, "Outputs", fsep = "\\"),
- '"'
- ),
- file = out$batchPath,
- append = T
- )
- # performance settings
- cat(
- " threads=",
- out$modOptions$MultiprocessingThreads,
- sep = "",
- file = out$batchPath,
- append = T
- )
- # model complexity settings
- cat(
- " autofeature=",
- tolower(out$modOptions$AutoFeatureSelection),
- sep = "",
- file = out$batchPath,
- append = TRUE
- )
- cat(
- " betamultiplier=",
- out$modOptions$RegularizationMultiplier,
- sep = "",
- file = out$batchPath,
- append = TRUE
- )
- cat(
- " doclamp=",
- tolower(out$modOptions$EnableClamping),
- sep = "",
- file = out$batchPath,
- append = TRUE
- )
-
- # Explicit feature toggles (if auto feature selection is off)
- if (!out$modOptions$AutoFeatureSelection) {
- cat(
- paste0(
- " linear=",
- tolower(out$modOptions$UseLinear),
- " quadratic=",
- tolower(out$modOptions$UseQuadratic),
- " product=",
- tolower(out$modOptions$UseProduct),
- " hinge=",
- tolower(out$modOptions$UseHinge),
- " threshold=",
- tolower(out$modOptions$UseThreshold)
- ),
- file = out$batchPath,
- append = TRUE
- )
- }
-
- # output and diagnostics
- cat(
- " responsecurves jackknife writeclampgrid writemess warnings prefixes",
- file = out$batchPath,
- append = T
- )
-
- # execution controls
- cat(" redoifexists autorun", file = out$batchPath, append = T)
-
- # Note than maxent can't handle spaces in the batch file path
- # - if there are spaces in tempDir, copy the batch file to a system temp file
- # - also update batchPath location in the local scope
- if (str_detect(out$tempDir, " ")) {
- batchTempFile <- tempfile(pattern = "runMaxent", fileext = ".bat")
- file.copy(out$batchPath, batchTempFile, overwrite = T)
- out$batchPath <- batchTempFile
- }
- # run maxent
- shell(out$batchPath)
-
- # read lambdas output
modelMaxent <- read.maxent(file.path(
out$tempDir,
"Outputs",
- "species.lambdas",
- fsep = "\\"
+ "species.lambdas"
))
} else {
- # prepare batch file
- capture.output(
- cat("java -mx", out$modOptions$MemoryLimit, "m", sep = ""),
- file = out$batchPath
- )
- cat(
- " -jar",
- paste0(
- '"',
- file.path(
- ssimEnvironment()$PackageDirectory,
- "maxent.jar",
- fsep = "\\"
- ),
- '"'
- ),
- file = out$batchPath,
- append = T
+ runMaxent(
+ out = out,
+ samplesfile = file.path(out$tempDir, "CVsplits", "training-swd.csv"),
+ envlayers = file.path(out$tempDir, "CVsplits", "background-swd.csv"),
+ outputdir = file.path(out$tempDir, "CVsplits"),
+ fullFit = FALSE
)
- if (!out$modOptions$VisibleInterface) {
- cat(" -z", file = out$batchPath, append = T)
- }
- cat(
- paste0(
- ' samplesfile="',
- file.path(out$tempDir, "CVsplits", "training-swd.csv", fsep = "\\"),
- '"'
- ),
- file = out$batchPath,
- append = T
- )
- cat(
- paste0(
- ' environmentallayers="',
- file.path(out$tempDir, "CVsplits", "background-swd.csv", fsep = "\\"),
- '"'
- ),
- file = out$batchPath,
- append = T
- )
- if (length(out$factorInputVars) > 0) {
- for (i in out$factorInputVars) {
- cat(paste0(" togglelayertype=", i), file = out$batchPath, append = T)
- }
- }
- cat(
- paste0(
- ' outputdirectory="',
- file.path(out$tempDir, "CVsplits", fsep = "\\"),
- '"'
- ),
- file = out$batchPath,
- append = T
- )
- cat(
- " threads=",
- out$modOptions$MultiprocessingThreads,
- sep = "",
- file = out$batchPath,
- append = T
- )
-
- # model complexity settings
- cat(
- " autofeature=",
- tolower(out$modOptions$AutoFeatureSelection),
- sep = "",
- file = out$batchPath,
- append = TRUE
- )
- cat(
- " betamultiplier=",
- out$modOptions$RegularizationMultiplier,
- sep = "",
- file = out$batchPath,
- append = TRUE
- )
- cat(
- " doclamp=",
- tolower(out$modOptions$EnableClamping),
- sep = "",
- file = out$batchPath,
- append = TRUE
- )
-
- # Explicit feature toggles (if auto feature selection is off)
- if (!out$modOptions$AutoFeatureSelection) {
- cat(
- paste0(
- " linear=",
- tolower(out$modOptions$UseLinear),
- " quadratic=",
- tolower(out$modOptions$UseQuadratic),
- " product=",
- tolower(out$modOptions$UseProduct),
- " hinge=",
- tolower(out$modOptions$UseHinge),
- " threshold=",
- tolower(out$modOptions$UseThreshold)
- ),
- file = out$batchPath,
- append = TRUE
- )
- }
-
- cat(
- " writeclampgrid writemess warnings prefixes",
- file = out$batchPath,
- append = T
- ) # reverse these default settings
- cat(" redoifexists autorun", file = out$batchPath, append = T)
-
- # Note than maxent can't handle spaces in the batch file path
- # - if there are spaces in tempDir, copy the batch file to a system temp file
- # - also update batchPath location in the local scope
- if (str_detect(out$tempDir, " ")) {
- batchTempFile <- tempfile(pattern = "runMaxent", fileext = ".bat")
- file.copy(out$batchPath, batchTempFile, overwrite = T)
- out$batchPath <- batchTempFile
- }
-
- # run maxent
- shell(out$batchPath)
-
- # read lambdas output
modelMaxent <- read.maxent(file.path(
out$tempDir,
"CVsplits",
- "species.lambdas",
- fsep = "\\"
+ "species.lambdas"
))
}
return(modelMaxent)
@@ -581,7 +321,7 @@ fitModel <- function(
#================================================================
# BRT
- #=================================================================
+ #================================================================
if (out$modType == "brt") {
# set n-folds
if (out$validationOptions$CrossValidate) {
@@ -646,7 +386,7 @@ fitModel <- function(
#================================================================
# GAM
- #=================================================================
+ #================================================================
if (out$modType == "gam") {
# calculating the case weights
@@ -774,23 +514,90 @@ fitModel <- function(
}
}
-### check java installation ----------------------------------------------------
-# function to check if java is installed and available on system path
+## Model Fitting Helper Functions ----------------------------------------------
+
+### Check Java Installation ----------------------------------------------------
+# SyncroSim 3.1.28+ sets a minimal PATH (System32 + conda dirs only) to prevent
+# competing GDAL/GEOS/PROJ DLL conflicts, which can strip Java from PATH. If
+# java is not found on PATH, fall back to JAVA_HOME and patch the R session
+# PATH so all subsequent calls (including system2() for MaxEnt) work.
checkJava <- function() {
os <- Sys.info()[["sysname"]]
- # set system call
- cmd <- "java -version"
+ if (nchar(Sys.which("java")) == 0) {
+ javaBinDir <- NULL
- # run the command and capture exit code
- status <- tryCatch(
- {
+ # fallback 1: JAVA_HOME environment variable
+ javaHome <- Sys.getenv("JAVA_HOME")
+ if (nchar(javaHome) > 0) {
+ javaBin <- file.path(
+ javaHome,
+ "bin",
+ if (os == "Windows") "java.exe" else "java"
+ )
+ if (file.exists(javaBin)) javaBinDir <- file.path(javaHome, "bin")
+ }
+
+ # fallback 2: search common installation directories
+ if (is.null(javaBinDir)) {
if (os == "Windows") {
- shell(cmd, intern = FALSE, ignore.stdout = TRUE, ignore.stderr = TRUE)
+ commonRoots <- Filter(
+ nchar,
+ c(Sys.getenv("ProgramFiles"), Sys.getenv("ProgramFiles(x86)"))
+ )
+ commonVendors <- c(
+ "Java",
+ "Eclipse Adoptium",
+ "Microsoft",
+ "BellSoft",
+ "Zulu",
+ "Amazon Corretto"
+ )
+ for (root in commonRoots) {
+ for (vendor in commonVendors) {
+ for (jdk in list.dirs(file.path(root, vendor), recursive = FALSE)) {
+ if (file.exists(file.path(jdk, "bin", "java.exe"))) {
+ javaBinDir <- file.path(jdk, "bin")
+ break
+ }
+ }
+ if (!is.null(javaBinDir)) break
+ }
+ if (!is.null(javaBinDir)) break
+ }
} else {
- system(cmd, ignore.stdout = TRUE, ignore.stderr = TRUE)
+ for (d in c(
+ "/usr/bin",
+ "/usr/local/bin",
+ "/usr/lib/jvm/default/bin",
+ "/usr/lib/jvm/java/bin"
+ )) {
+ if (file.exists(file.path(d, "java"))) {
+ javaBinDir <- d
+ break
+ }
+ }
}
+ }
+
+ if (!is.null(javaBinDir)) {
+ javaBinDir <- normalizePath(javaBinDir, winslash = "/", mustWork = FALSE)
+ Sys.setenv(
+ PATH = paste(javaBinDir, Sys.getenv("PATH"), sep = .Platform$path.sep)
+ )
+ updateRunLog(paste0(
+ "\nJava found at '",
+ javaBinDir,
+ "' and added to session PATH."
+ ))
+ }
+ }
+
+ # run the command and capture exit code
+ status <- tryCatch(
+ {
+ system2("java", "-version", stdout = FALSE, stderr = FALSE)
},
error = function(e) 1L
)
@@ -804,6 +611,86 @@ checkJava <- function() {
FALSE
}
}
+
+### Run Maxent -----------------------------------------------------------------
+# Invokes MaxEnt directly via system2("java"): cross-platform, no batch file.
+# system2() does not quote args itself, so paths with spaces need shQuote().
+
+runMaxent <- function(
+ out,
+ samplesfile,
+ envlayers,
+ outputdir,
+ testsamplesfile = NULL,
+ fullFit = TRUE
+) {
+ jarPath <- file.path(ssimEnvironment()$PackageDirectory, "maxent.jar")
+
+ args <- c(paste0("-mx", out$modOptions$MemoryLimit, "m"), "-jar", jarPath)
+
+ if (!out$modOptions$VisibleInterface) {
+ args <- c(args, "-z")
+ }
+
+ args <- c(
+ args,
+ paste0("samplesfile=", samplesfile),
+ paste0("environmentallayers=", envlayers)
+ )
+
+ if (!is.null(testsamplesfile)) {
+ args <- c(args, paste0("testsamplesfile=", testsamplesfile))
+ }
+
+ if (length(out$factorInputVars) > 0) {
+ args <- c(args, paste0("togglelayertype=", out$factorInputVars))
+ }
+
+ args <- c(
+ args,
+ paste0("outputdirectory=", outputdir),
+ paste0("threads=", out$modOptions$MultiprocessingThreads),
+ paste0("autofeature=", tolower(out$modOptions$AutoFeatureSelection)),
+ paste0("betamultiplier=", out$modOptions$RegularizationMultiplier),
+ paste0("doclamp=", tolower(out$modOptions$EnableClamping))
+ )
+
+ if (!out$modOptions$AutoFeatureSelection) {
+ args <- c(
+ args,
+ paste0("linear=", tolower(out$modOptions$UseLinear)),
+ paste0("quadratic=", tolower(out$modOptions$UseQuadratic)),
+ paste0("product=", tolower(out$modOptions$UseProduct)),
+ paste0("hinge=", tolower(out$modOptions$UseHinge)),
+ paste0("threshold=", tolower(out$modOptions$UseThreshold))
+ )
+ }
+
+ if (fullFit) {
+ args <- c(
+ args,
+ "responsecurves",
+ "jackknife",
+ "writeclampgrid",
+ "writemess",
+ "warnings",
+ "prefixes"
+ )
+ } else {
+ args <- c(args, "writeclampgrid", "writemess", "warnings", "prefixes")
+ }
+
+ args <- c(args, "redoifexists", "autorun")
+
+ exitCode <- system2("java", args = shQuote(args))
+ if (exitCode != 0) {
+ stop(paste0(
+ "MaxEnt execution failed with exit code ", exitCode,
+ ". Check Java installation and MaxEnt configuration."
+ ))
+ }
+}
+
### Read Maxent ----------------------------------------------------------------
# function to read in maxent lambdas file and extract coefficients for each feature type
@@ -1003,7 +890,7 @@ est.lr <- function(dat, out) {
}
}
-# MODEL SELECTION AND VALIDATION FUNCTIONS -------------------------------------
+# Model Selection and Validation Functions -------------------------------------
## Run Cross Validation --------------------------------------------------------
@@ -1124,7 +1011,9 @@ cv.fct <- function(
if (is.null(cv.final.mod)) {
stop(paste0(
- "CV fold ", i, " model fitting failed. ",
+ "CV fold ",
+ i,
+ " model fitting failed. ",
"Consider removing or reclassifying rare factor levels, reducing the number of CV folds, ",
"or reviewing the data for outliers or class imbalance."
))
@@ -1166,7 +1055,10 @@ cv.fct <- function(
valid_i <- !is.na(u_i)
if (any(!valid_i)) {
updateRunLog(paste0(
- "\nWarning: ", sum(!valid_i), " site(s) in CV fold ", i,
+ "\nWarning: ",
+ sum(!valid_i),
+ " site(s) in CV fold ",
+ i,
" could not be predicted and will be excluded from fold evaluation.",
" This is likely caused by a factor level absent from this fold's training data.\n"
))
@@ -1177,7 +1069,9 @@ cv.fct <- function(
if (all(is.na(u_i))) {
stop(paste0(
- "CV fold ", i, " produced no valid predictions. This is likely caused by a categorical ",
+ "CV fold ",
+ i,
+ " produced no valid predictions. This is likely caused by a categorical ",
"variable with a factor level that is absent from this fold's training data. ",
"Consider removing or reclassifying rare factor levels, or reducing the number of CV folds."
))
@@ -1186,7 +1080,9 @@ cv.fct <- function(
if (family == "binomial" | family == "bernoulli") {
if (length(unique(y_i)) < 2) {
stop(paste0(
- "CV fold ", i, " contains only one response class after excluding unpredictable sites. ",
+ "CV fold ",
+ i,
+ " contains only one response class after excluding unpredictable sites. ",
"Consider removing or reclassifying rare factor levels, or reducing the number of CV folds."
))
}
@@ -1263,7 +1159,7 @@ cv.fct <- function(
}
-### Calculate Deviance function -----------[see helper functions]---------------
+## Model Evaluation Helper Functions -------------------------------------------
### ROC Function ---------------------------------------------------------------
@@ -1300,7 +1196,7 @@ roc <- function(
return(round(wilc, 4))
}
-### Calibration function -------------------------------------------------------
+### Calibration Function -------------------------------------------------------
calibration <- function(
obs, # observed response
@@ -1343,7 +1239,7 @@ calibration <- function(
return(calibration.result)
}
-### Permute Predict function ---------------------------------------------------
+### Permute Predict Function ---------------------------------------------------
permute.predict <- function(
inputVars, # input variables for model fitting
@@ -1372,7 +1268,7 @@ permute.predict <- function(
return(AUC)
}
-# MODEL OUTPUT FUNCTIONS -------------------------------------------------------
+# Model Output Functions -------------------------------------------------------
## Make Model Evaluation Plots -------------------------------------------------
@@ -2070,7 +1966,7 @@ makeModelEvalPlots <- function(out = out) {
return(out)
}
-### Calculate statistics function ----------------------------------------------
+### Calculate Statistics Function ----------------------------------------------
calcStat <- function(
x, # x <- out$data[[i]]
@@ -2209,7 +2105,7 @@ calcStat <- function(
}
}
-### Variable Importance function -----------------------------------------------
+### Variable Importance Function -----------------------------------------------
VariableImportance <- function(
out, # out list
@@ -2494,7 +2390,7 @@ VariableImportance <- function(
title(ylab = "Variables", line = 14, cex.lab = 3, font.lab = 2)
}
-### Confusion Matrix function --------------------------------------------------
+### Confusion Matrix Function --------------------------------------------------
confusion.matrix <- function(
Stats, # output from calcStat function
@@ -2646,7 +2542,7 @@ confusion.matrix <- function(
)
}
-### Residual Image function ----------------------------------------------------
+### Residual Image Function ----------------------------------------------------
resid.image <- function(dev.contrib, dat, file.name, label, create.image = T) {
#produces a map of deviance residuals unless we're using independent evaluation data in which case
@@ -2807,7 +2703,7 @@ beachcolours <- function(
}
-### Test/Train ROC Plot function -----------------------------------------------
+### Test/Train ROC Plot Function -----------------------------------------------
TestTrainRocPlot <- function(
dat, # Stats$train$auc.data
@@ -3352,7 +3248,7 @@ TestTrainRocPlot <- function(
par(op)
}
-### Presence-Only Smoothed Calibration Plot function ---------------------------
+### Presence-Only Smoothed Calibration Plot Function ---------------------------
pocplot <- function(pred, back, linearize = TRUE, ...) {
ispresence <- c(rep(1, length(pred)), rep(0, length(back)))
@@ -3380,7 +3276,7 @@ pocplot <- function(pred, back, linearize = TRUE, ...) {
predd
}
-### Presence-Absence Smoothed Calibration Plot function ------------------------
+### Presence-Absence Smoothed Calibration Plot Function ------------------------
pacplot <- function(pred, pa, ...) {
predd <- smoothdist(preds = pred, obs = pa)
@@ -3394,7 +3290,7 @@ pacplot <- function(pred, pa, ...) {
)
}
-#### Plotting function for Calibration plots [nested in pocplot/pacplot] -------
+#### Plotting Function for Calibration Plots [nested in pocplot/pacplot] -------
calibplot <- function(
pred,
@@ -3443,7 +3339,7 @@ calibplot <- function(
}
}
-#### Smoothing function for Calibration plots [nested in pocplot/pacplot] ------
+#### Smoothing Function for Calibration Plots [nested in pocplot/pacplot] ------
smoothingdf <- 6
smoothdist <- function(preds, obs) {
@@ -3484,7 +3380,7 @@ smoothdist <- function(preds, obs) {
data.frame(x = x, y = y$fit, se = y$se.fit)
}
-### Capture Statistics function ------------------------------------------------
+### Capture Statistics Function ------------------------------------------------
capture.stats <- function(
Stats.lst, # stats or lst output from calcStat function
@@ -3904,7 +3800,7 @@ capture.stats <- function(
}
}
-## response curves function -----------------------------------------------------
+## Response Curves Function -----------------------------------------------------
response.curves <- function(out) {
# Desanitize output variable names
diff --git a/src/10-ensemble-model.py b/src/10-ensemble-model.py
index f3878fe..1eba855 100644
--- a/src/10-ensemble-model.py
+++ b/src/10-ensemble-model.py
@@ -54,7 +54,7 @@ def run():
# Get path to scenario inputs
ssimInputDir = myScenario.library.location + \
- ".input\\Scenario-" + str(myScenario.sid)
+ ".input/Scenario-" + str(myScenario.sid)
# Load datasheets
# inputs
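Reviewer note: the hunk above replaces a hard-coded backslash with a forward slash. A minimal sketch of the same path built with `os.path.join`, which picks the separator for the running platform; the helper name `scenario_input_dir` and the example values are hypothetical and only illustrate the pattern.

```python
import os


def scenario_input_dir(library_location, scenario_id):
    """Build '<library>.input/Scenario-<id>' without hard-coding a separator."""
    # The input folder is assumed to sit beside the library file, named '<library>.input'
    return os.path.join(library_location + ".input", "Scenario-" + str(scenario_id))


print(scenario_input_dir("/data/wisdm.ssim", 12))
```

The forward slash chosen in the diff also works on Windows for most Python file APIs, so either form should be portable.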
diff --git a/src/2-spatial-data-preparation.py b/src/2-spatial-data-preparation.py
index b71ad30..aa7a6f7 100644
--- a/src/2-spatial-data-preparation.py
+++ b/src/2-spatial-data-preparation.py
@@ -180,7 +180,7 @@ def check_raster_range(filepath, min_allowed=0.0, max_allowed=1.0, epsilon=1e-8)
tiledTemplatePath = os.path.join(
ssimTempDir, base + "_tiled" + ext)
profile = src.profile.copy()
- profile.update(tiled=True, compress=rasterCompression)
+ profile.update(tiled=True, compress=rasterCompression, blockxsize=256, blockysize=256)
with rasterio.open(tiledTemplatePath, 'w', **profile) as dst:
for _, window in src.block_windows(1):
dst.write(src.read(window=window), window=window)
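Reviewer note: the hunk above pins the tile size to 256 x 256 (GeoTIFF tile dimensions must be multiples of 16). A rough, library-free sketch of how fixed-size blocks partition a raster, with edge tiles clipped to the raster extent; `block_windows` here is a hypothetical helper, not the rasterio method of the same name.

```python
def block_windows(width, height, block=256):
    """Yield (col_off, row_off, win_width, win_height) tiles covering the raster."""
    for row_off in range(0, height, block):
        for col_off in range(0, width, block):
            # Tiles on the right and bottom edges are clipped to the raster size
            yield (
                col_off,
                row_off,
                min(block, width - col_off),
                min(block, height - row_off),
            )


windows = list(block_windows(600, 300))
```

For a 600 x 300 raster this gives six windows, and their areas sum to the full raster, which is the property the windowed read/write loop relies on.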
diff --git a/src/6-variable-reduction.R b/src/6-variable-reduction.R
index abe4061..5174383 100644
--- a/src/6-variable-reduction.R
+++ b/src/6-variable-reduction.R
@@ -253,7 +253,6 @@ if (
names(covsDE) <- devInfo$covDE
# run pairs explore with all variables -----------------------------------------
-
options <- covariateSelectionSheet
options$NumberOfPlots <- ncol(select(
siteData,
@@ -278,7 +277,7 @@ if (
# covsDE
options <- covariateSelectionSheet
- launchShinyApp(file.path(packageDir, "06-covariate-correlation-app.R"))
+ launchShinyApp(file.path(packageDir, "06-covariate-correlation-app.R"), appName = "Correlation Viewer")
# save image files
covariateCorrelationSheet <- safe_rbind(
diff --git a/src/7-hyperparameter-tuning.R b/src/7-hyperparameter-tuning.R
index 9e3ad42..bd32099 100644
--- a/src/7-hyperparameter-tuning.R
+++ b/src/7-hyperparameter-tuning.R
@@ -490,7 +490,7 @@ combineTxtFiles(filePaths = txtFilePaths, outputPath = file.path(ssimTempDir, "O
# Shiny App --------------------------------------------------------------------
-launchShinyApp(file.path(packageDir, "07-hyperparameter-tuning-app.R"))
+launchShinyApp(file.path(packageDir, "07-hyperparameter-tuning-app.R"), appName = "Hyperparameter Tuning App")
selectedComboOutputs <- comboImgs[comboImgs$displayName == comboOut,]
progressBar()
diff --git a/src/8-fit-glm.R b/src/8-fit-glm.R
index dfced59..061f552 100644
--- a/src/8-fit-glm.R
+++ b/src/8-fit-glm.R
@@ -261,7 +261,7 @@ progressBar()
finalMod <- fitModel(dat = trainingData, out = out)
# save model to temp storage
-# saveRDS(finalMod, file = paste0(ssimTempDir,"\\Data\\", modType, "_model.rds"))
+# saveRDS(finalMod, file = file.path(ssimTempDir, "Data", paste0(modType, "_model.rds")))
# add relevant model details to out
out$finalMod <- finalMod
diff --git a/src/8-fit-maxent.R b/src/8-fit-maxent.R
index 9b1ef8b..f7e9b4c 100644
--- a/src/8-fit-maxent.R
+++ b/src/8-fit-maxent.R
@@ -305,12 +305,7 @@ dir.create(file.path(ssimTempDir, "Outputs"))
## training data
-out$swdPath <- swdPath <- file.path(
- ssimTempDir,
- "Inputs",
- "training-swd.csv",
- fsep = "\\"
-)
+out$swdPath <- swdPath <- file.path(ssimTempDir, "Inputs", "training-swd.csv")
trainingData %>%
mutate(Species = case_when(Response == 1 ~ "species")) %>%
@@ -328,12 +323,7 @@ trainingData %>%
out$data$train <- trainingData
if (pseudoAbs) {
- out$backgroundPath <- backgroundPath <- file.path(
- ssimTempDir,
- "Inputs",
- "background-swd.csv",
- fsep = "\\"
- )
+ out$backgroundPath <- backgroundPath <- file.path(ssimTempDir, "Inputs", "background-swd.csv")
trainingData %>%
mutate(Species = case_when(Response != 1 ~ "background")) %>%
@@ -352,12 +342,7 @@ if (pseudoAbs) {
## testing data
if (!is.null(testingData)) {
- out$testDataPath <- testDataPath <- file.path(
- ssimTempDir,
- "Inputs",
- "testing-swd.csv",
- fsep = "\\"
- )
+ out$testDataPath <- testDataPath <- file.path(ssimTempDir, "Inputs", "testing-swd.csv")
testingData %>%
mutate(Species = case_when(Response == 1 ~ "species")) %>%
drop_na(Species) %>%
@@ -376,7 +361,6 @@ if (!is.null(testingData)) {
gc()
}
-out$batchPath <- file.path(ssimTempDir, "Inputs", "runMaxent.bat", fsep = "\\")
# out$maxJobs <- mulitprocessingSheet$MaximumJobs
# Create output text file ------------------------------------------------------
@@ -416,7 +400,7 @@ finalMod <- fitModel(
# finalMod$trainingData <- trainingData
# save model to temp storage
-# saveRDS(finalMod, file = paste0(ssimTempDir,"\\Data\\", modType, "_model.rds"))
+# saveRDS(finalMod, file = file.path(ssimTempDir, "Data", paste0(modType, "_model.rds")))
# finalMod$trainingData <- NULL
# add relevant model details to out
diff --git a/src/9-apply-model.R b/src/9-apply-model.R
index 4fcc4d1..d88a87b 100644
--- a/src/9-apply-model.R
+++ b/src/9-apply-model.R
@@ -176,7 +176,8 @@ if (name(myLibrary) == "Partial") {
sessionDetails <- setup_session(
ssim_temp_dir = ssimTempDir,
concurrent_sessions = maxJobs,
- total_ram_gb = totalMem
+ total_ram_gb = totalMem,
+ desync_max_sec = 10
)
progressBar()
diff --git a/src/package.xml b/src/package.xml
index c15a41b..24cc6f6 100644
--- a/src/package.xml
+++ b/src/package.xml
@@ -1,5 +1,5 @@
-
+
@@ -253,8 +253,8 @@
displayName="1 - Prepare Multiprocessing"
programArguments="1-prep-multiprocessing.py"
isMultiprocessing="False"
- condaEnv="wisdm-conda.yml"
- condaEnvVersion="5">
+ condaEnv="wisdm-py-conda.yml"
+ condaEnvVersion="6">
@@ -265,8 +265,8 @@
displayName="2 - Spatial Data Preparation"
programArguments="2-spatial-data-preparation.py"
isMultiprocessing="False"
- condaEnv="wisdm-conda.yml"
- condaEnvVersion="5">
+ condaEnv="wisdm-py-conda.yml"
+ condaEnvVersion="6">
@@ -278,8 +278,8 @@
displayName="3 - Site Data Preparation"
programArguments="3-site-data-preparation.py"
isMultiprocessing="False"
- condaEnv="wisdm-conda.yml"
- condaEnvVersion="5">
+ condaEnv="wisdm-py-conda.yml"
+ condaEnvVersion="6">
@@ -293,8 +293,8 @@
displayName="4 - Background Data Generation"
programArguments="4-background-data-generation.R"
isMultiprocessing="False"
- condaEnv="wisdm-conda.yml"
- condaEnvVersion="5">
+ condaEnv="wisdm-r-conda.yml"
+ condaEnvVersion="6">
@@ -308,8 +308,8 @@
displayName="5 - Prepare Training/Testing Data"
programArguments="5-prepare-training-testing-data.R"
isMultiprocessing="False"
- condaEnv="wisdm-conda.yml"
- condaEnvVersion="5">
+ condaEnv="wisdm-r-conda.yml"
+ condaEnvVersion="6">
@@ -321,8 +321,8 @@
displayName ="6 - Variable Reduction"
programArguments="6-variable-reduction.R"
isMultiprocessing="False"
- condaEnv="wisdm-conda.yml"
- condaEnvVersion="5">
+ condaEnv="wisdm-r-conda.yml"
+ condaEnvVersion="6">
@@ -336,8 +336,8 @@
displayName="7 - Hyperparameter Tuning"
programArguments="7-hyperparameter-tuning.R"
isMultiprocessing="False"
- condaEnv="wisdm-conda.yml"
- condaEnvVersion="5">
+ condaEnv="wisdm-r-conda.yml"
+ condaEnvVersion="6">
@@ -357,8 +357,8 @@
displayName="8 - Generalized Linear Model"
programArguments="8-fit-glm.R"
isMultiprocessing="False"
- condaEnv="wisdm-conda.yml"
- condaEnvVersion="5">
+ condaEnv="wisdm-r-conda.yml"
+ condaEnvVersion="6">
@@ -373,8 +373,8 @@
displayName="8 - Generalized Additive Model"
programArguments="8-fit-gam.R"
isMultiprocessing="False"
- condaEnv="wisdm-conda.yml"
- condaEnvVersion="5">
+ condaEnv="wisdm-r-conda.yml"
+ condaEnvVersion="6">
@@ -389,8 +389,8 @@
displayName="8 - Random Forest"
programArguments="8-fit-rf.R"
isMultiprocessing="False"
- condaEnv="wisdm-conda.yml"
- condaEnvVersion="5">
+ condaEnv="wisdm-r-conda.yml"
+ condaEnvVersion="6">
@@ -405,8 +405,8 @@
displayName="8 - Maxent"
programArguments="8-fit-maxent.R"
isMultiprocessing="False"
- condaEnv="wisdm-conda.yml"
- condaEnvVersion="5">
+ condaEnv="wisdm-r-conda.yml"
+ condaEnvVersion="6">
@@ -422,8 +422,8 @@
displayName="8 - Boosted Regression Trees"
programArguments="8-fit-brt.R"
isMultiprocessing="False"
- condaEnv="wisdm-conda.yml"
- condaEnvVersion="5">
+ condaEnv="wisdm-r-conda.yml"
+ condaEnvVersion="6">
@@ -438,8 +438,8 @@
name="ApplyModel"
displayName="9 - Apply Model"
programArguments="9-apply-model.R"
- condaEnv="wisdm-conda.yml"
- condaEnvVersion="5">
+ condaEnv="wisdm-r-conda.yml"
+ condaEnvVersion="6">
@@ -454,8 +454,8 @@
displayName="10 - Ensemble Model"
programArguments="10-ensemble-model.py"
isMultiprocessing="False"
- condaEnv="wisdm-conda.yml"
- condaEnvVersion="5">
+ condaEnv="wisdm-py-conda.yml"
+ condaEnvVersion="6">
diff --git a/src/wisdm-conda.yml b/src/wisdm-conda.yml
deleted file mode 100644
index 30c6956..0000000
--- a/src/wisdm-conda.yml
+++ /dev/null
@@ -1,37 +0,0 @@
-name: wisdm-conda
-channels:
- - conda-forge
-dependencies:
- - r-base=4.1.3
- - r-adehabitathr=0.4.21
- - r-data.table=1.15.4
- - r-dismo=1.3_5
- - r-dplyr=1.1.4
- - r-gbm=2.1.9
- - r-ggplot2=3.4.2
- - r-glmnet=4.1_2
- - r-gridextra=2.3
- - r-mgcv=1.9_1
- - r-pander=0.6.5
- - r-png=0.1_8
- - r-presenceabsence=1.1.11
- - r-prroc=1.3.1
- - r-randomforest=4.7_1.1
- - r-rocr=1.0_11
- - r-rsyncrosim=2.1.3
- - r-sf=1.0_7
- - r-shiny=1.7.4
- - r-sp=2.1_4
- - r-spatstat.geom=3.2_9
- - r-terra=1.5_21
- - r-tidyr=1.3.1
- - r-tidyverse=2.0.0
- - r-xml2=1.3.3
- - r-zip=2.3.1
- - python=3.12.4
- - dask=2024.7.0
- - geopandas=1.0.1
- - pysyncrosim=2.1.0
- - pywin32=306
- - rasterio=1.3.10
- - rioxarray=0.16.0
diff --git a/src/wisdm-py-conda.yml b/src/wisdm-py-conda.yml
new file mode 100644
index 0000000..b8b8382
--- /dev/null
+++ b/src/wisdm-py-conda.yml
@@ -0,0 +1,12 @@
+name: wisdm-py-conda
+channels:
+ - conda-forge
+dependencies:
+ # Python — no R spatial stack in this env so there are no GDAL/GEOS/PROJ
+ # cross-language constraints. Packages are pinned to tested versions.
+ - python=3.12.4
+ - dask=2024.7.0
+ - geopandas=1.0.1
+ - pysyncrosim=2.2.0
+ - rasterio=1.3.10
+ - rioxarray=0.16.0
diff --git a/src/wisdm-r-conda.yml b/src/wisdm-r-conda.yml
new file mode 100644
index 0000000..e2a47df
--- /dev/null
+++ b/src/wisdm-r-conda.yml
@@ -0,0 +1,38 @@
+name: wisdm-r-conda
+channels:
+ - conda-forge
+dependencies:
+ # R base — pinned to 4.1.x; r-rsyncrosim only has r41 builds on Windows,
+ # and r-terra/r-dismo/r-sf require the legacy msys2 (m2w64) toolchain on
+ # Windows which is incompatible with newer compiled R packages. Update when
+ # r-rsyncrosim and r-terra publish cross-platform builds for a newer R version.
+ - r-base=4.1.3
+ # R packages — all versions confirmed to have r41 builds on both linux-64 and
+ # win-64. Some versions differ from the original Windows-only pins because the
+ # newer Windows versions (e.g. r-data.table=1.15.4, r-sp=2.1_4) have no r41
+ # linux-64 build; these are pinned to the latest version that does.
+ - r-adehabitathr=0.4.21
+ - r-data.table=1.14.8
+ - r-dismo=1.3_5
+ - r-dplyr=1.1.2
+ - r-gbm=2.1.8.1
+ - r-ggplot2=3.4.2
+ - r-glmnet=4.1_2
+ - r-gridextra=2.3
+ - r-mgcv=1.8_42
+ - r-pander=0.6.5
+ - r-png=0.1_8
+ - r-presenceabsence=1.1.11
+ - r-prroc=1.3.1
+ - r-randomforest=4.7_1.1
+ - r-rocr=1.0_11
+ - r-rsyncrosim=2.1.13
+ - r-sf=1.0_7
+ - r-shiny=1.7.4
+ - r-sp=1.6_1
+ - r-spatstat.geom=3.2_1
+ - r-terra=1.5_21
+ - r-tidyr=1.3.0
+ - r-tidyverse=2.0.0
+ - r-xml2=1.3.3
+ - r-zip=2.3.0