GemPy Engine is a Python library for implicit 3D geological modeling using cokriging interpolation. It generates scalar fields from surface point and orientation data, refines them over an octree, and optionally extracts surfaces via dual contouring.
Key dependencies: NumPy, PyTorch (optional), PyKeOps (optional), FastAPI (server), subsurface (output export).
The virtual environment for this project is located at /home/leguark/.venv/2025. To avoid package collision or import errors, always run commands using the virtualenv's direct executables rather than global ones.
# Install in dev mode
/home/leguark/.venv/2025/bin/pip install -e ".[dev]"
# Run tests with the default/numpy backend
DEFAULT_BACKEND=numpy /home/leguark/.venv/2025/bin/pytest
# Run tests and skip approval tests
DEFAULT_BACKEND=numpy /home/leguark/.venv/2025/bin/pytest -k "not approval"
# Run a specific test file
DEFAULT_BACKEND=numpy /home/leguark/.venv/2025/bin/pytest tests/test_common/test_api/test_public_interface.py
# Run the FastAPI server
/home/leguark/.venv/2025/bin/uvicorn gempy_engine.API.server.main_server_pro:gempy_engine_AppNote: If you run a generic pytest command, you may hit ImportPathMismatchError due to leftovers in build directories (e.g., build/lib/tests/conftest.py vs tests/conftest.py). If this occurs, clean up the build artifacts first via rm -rf build/.
API/ ← Public entry points
model/model_api.py → compute_model() — the one main entry point
interp_single/ → Octree-level interpolation loop + stack management
dual_contouring/ → Multi-scalar dual contouring mesh extraction
server/ → FastAPI endpoint wrapping compute_model()
core/ ← Data structures, backend abstraction, utilities
backend_tensor.py → BackendTensor singleton (NumPy vs PyTorch, CPU/GPU, PyKeOps)
data/ → All domain data classes (grids, inputs, options, outputs)
config.py → Environment-based configuration (AvailableBackends, feature flags)
exceptions.py → GemPyEngineInputError
modules/ ← Computational modules (no direct outside deps)
kernel_constructor/ → Builds covariance matrices and kernel evaluation
solver/ → Linear system solvers (NumPy, PyTorch, PyKeOps CG, GMRES)
activator/ → Soft-sigmoid segmentation of scalar fields into unit blocks
dual_contouring/ → Mesh generation from scalar field on octree corners
evaluator/ → Evaluates weights against kernel to produce scalar fields
octrees_topology/ → Octree refinement based on scalar field curvature
data_preprocess/ → Prepares raw input for interpolation
faults/ → Finite fault handling
geophysics/ → Forward gravity and magnetic computation
weights_cache/ → Disk/memory cache for interpolation weights
topology/ → Topology edge extraction from octree
compute_model(InterpolationInput, InterpolationOptions, InputDataDescriptor)
│
├─ _check_input_validity() ← Pre-computation validity check
│
├─ interpolate_n_octree_levels()
│ └─ for each octree level:
│ └─ interpolate_on_octree()
│ └─ interpolate_all_fields()
│ └─ _interpolate_stack() (or _interpolate_stack_flat)
│ └─ for each stack (geological series):
│ ├─ input_preprocess() → SolverInput
│ ├─ compute_weights() → solve Ax=b → weights
│ │ └─ solver_interface.kernel_reduction()
│ ├─ _evaluate_sys_eq() → evaluate kernel → ExportedFields
│ └─ _segment() → sigmoid activation → values block
│ └─ combine_scalar_fields() → erosion/onlap masking
│ └─ get_next_octree_grid() → refine octree for next level
│
├─ dual_contouring_multi_scalar() (if mesh_extraction=True)
│
└─ Solutions(octrees_output, dc_meshes, gravity, magnetics)
BackendTensor is a class with mutable class-level state (not instantiated). It controls:
- Engine backend:
AvailableBackends.numpyorAvailableBackends.PYTORCH. - GPU:
use_gpuflag (PyTorch only). - PyKeOps:
pykeops_enabled/use_pykeopsfor GPU-accelerated kernel ops.
The class provides aliases for backend operations: BackendTensor.t (the active backend module, e.g. NumPy or PyTorch), BackendTensor.tfnp (same, primarily for numpy-like syntax). All computation-agnostic code uses BackendTensor.t.array(...), BackendTensor.t.zeros(...), etc.
Critical: pykeops_enabled is dynamically toggled at runtime by compute_weights() and _evaluate_sys_eq() in _interp_scalar_field.py — they set and restore it per-call. Do not rely on the class-level value staying constant during a compute_model() run.
Test backend is configured via tests/conftest.py:
BackendTensor._change_backend(engine_backend=backend, use_gpu=use_gpu)Default test backend is AvailableBackends.numpy, GPU off.
The modeling is organized into stacks (geological series/sequences) each containing multiple surfaces (geological boundaries).
StacksStructure: Describes the organization — number of surfaces, points, orientations per stack, masking relations (erosion, onlap, fault), and fault dependency matrix.TensorsStructure: Describes number of surface points per surface (lower-level split).InputDataDescriptor: PairsTensorsStructure+StacksStructure. Frozen dataclass.
Stacks are interpolated independently, then combined via masking:
- Erosion: Upper stack's scalar field truncates lower stacks.
- Onlap: Lower stack's scalar field truncates upper stacks.
- Fault: Output modifies the combined block with fault offsets.
Before interpolation starts, the compute_model entry point runs a comprehensive check _check_input_validity(...). It raises a GemPyEngineInputError if it detects inconsistencies:
- Mismatches in orientation dip positions vs gradients lengths.
- Inconsistencies between
InterpolationInputsize and expectations set inTensorsStructureorStacksStructure(e.g. point/surface counts). - Stacks or surfaces containing zero points or zero orientations.
InterpolationOptions: PydanticBaseModel(withConfigDict). Holdskernel_options,evaluation_options,cache_mode,sigmoid_slope, etc.KernelOptions: Standard@dataclass(not Pydantic). Holdsrange,c_o,uni_degree,kernel_function,kernel_solver,compute_condition_number.EvaluationOptions:@dataclass. Holds octree level counts, mesh extraction flags, gradient computation.
Implicit Deprecation Gotcha:
Several high-level attributes on InterpolationOptions are deprecated (such as compute_scalar_gradient, number_octree_levels, number_octree_levels_surface, and mesh_extraction). When modifying or creating configuration, access/set these on options.evaluation_options instead (e.g., options.evaluation_options.number_octree_levels).
EngineGrid is a composite that holds multiple grid types: octree_grid, dense_grid, custom_grid, topography, sections, geophysics_grid, corners_grid. Its .values property concatenates all non-None grids in a fixed order. Slices (octree_grid_slice, dense_grid_slice, etc.) partition the concatenated array to extract per-grid sub-arrays from final blocks.
The engine iterates over number_octree_levels levels. At each level:
- Evaluate scalar field at current grid points
- If not the last level, compute next octree grid from scalar field curvature
- Set the new octree grid on
interpolation_inputviaset_temp_grid()
The final octree output level's InterpOutput.combined_scalar_field.final_block is the geological model result.
For NumPy backend (not PyTorch), compute_model() does copy.deepcopy(interpolation_input) to avoid side effects. Controlled by NOT_MAKE_INPUT_DEEP_COPY env var. The deep copy is not done for PyTorch backend — be aware of this when debugging PyTorch vs NumPy behavior differences.
Configuration lives in gempy_engine/config.py, loaded from .env or ~/.env_gempy_engine. The local .env is prioritized if both exist.
| Variable | Default | Purpose |
|---|---|---|
DEBUG_MODE |
False |
Extra assertions and debug output |
DEFAULT_BACKEND |
numpy |
numpy or PYTORCH |
DEFAULT_PYKEOPS |
False |
Enable PyKeOps GPU kernels |
DEFAULT_TENSOR_DTYPE |
float64 |
Tensor data type |
OPTIMIZE_MEMORY |
True |
Memory optimization flag |
SET_RAW_ARRAYS_IN_SOLUTION |
True |
Populate Solutions.raw_arrays |
NOT_MAKE_INPUT_DEEP_COPY |
False |
Skip input deep copy |
GEMPY_FLAT_STACKS |
False |
Use parallel chunk-based stack processing |
GEMPY_MAX_CHUNK_SIZE |
32000000000 |
Max "cost" per chunk for flat stacks |
ONLY_LITH_SOLUTION |
False |
Skip non-lithology blocks in output |
PYKEOPS_SOLVER |
False |
Use PyKeOps for solver (not just kernel) |
- Bold separator comments:
# region ... # endregionused for logical code grouping. - Switch comments:
# @off/# @onto disable/enable code sections (e.g. around specific math blocks). - TODO tags:
# TODO,# ?,# *prefixed comments for notes/questions. - Underbar prefix: Private module-internal functions use
_prefix (e.g.,_solve_interpolation). - Type annotations: Heavy use of type hints throughout, but some are approximate (e.g.,
UniononBackendTensor.tensor_types). - Backend-agnostic code uses
BackendTensor.tfor array ops; direct NumPy/PyTorch calls must be avoided in core computational paths.
Tests are organized in tests/:
test_core/— Data structure tests.test_modules/— Per-module tests (activator, dual contouring, kernel constructor, solvers, geophysics, etc.).test_common/test_api/— Integration tests using the publiccompute_model()API.test_integrations/— Multi-field, multi-grid, options integration.test_server/— Server endpoint tests.test_pytorch/— PyTorch backend gradient tests.fixtures/— Reusable pytest fixtures for models, grids, geometries.test_dependencies/— Checks for optional deps (PyKeOps compiler).
Test speed/requirement levels are set in tests/conftest.py:
TEST_SPEED = TestSpeed.MINUTES
REQUIREMENT_LEVEL = Requirements.COREApproval tests (via pytest-approvaltests) store expected output in .approved.txt files. They use a PyCharm diff reporter configured in conftest.
-
BackendTensorstate leaks: The singleton's class-level state persists across test runs. If a test changes the backend, subsequent tests may run with the wrong backend. Test conftest sets it at import time, but individual tests can still mutate it. -
pykeops_enabledvsuse_pykeops:pykeops_enabledis the runtime toggle,use_pykeopsis the user-configured preference. The runtime code setspykeops_enabled = use_pykeopsat key points and may temporarily disable it. -
Deep copy in compute_model: The deep copy can mask mutation bugs. If a test modifies
InterpolationInputafter callingcompute_model(), it may or may not affect the result depending on the backend. -
Unit values default:
InterpolationInput.unit_valuesreturnsnp.arange(1000)if never set — a non-obvious fallback. -
KernelOptionsis a dataclass,InterpolationOptionsis Pydantic: They use different patterns for validation, serialization, and defaults. -
Grid slices must stay in sync: The
EngineGrid.valuesconcatenation order must match the slice properties. Adding a new grid type requires adding it to both. -
StacksStructure.stack_numbermutation: Thestack_numberfield is mutated during the stack loop — it's used as a cursor for active stack lookup via properties likeactive_masking_descriptor. -
Solutions._raw_arraysis only populated ifSET_RAW_ARRAYS_IN_SOLUTION=TrueAND the first octree level has an octree grid. Dense grid models skip raw arrays. -
Server mode uses hardcoded default options in
main_server_pro.py:29-36— changingInterpolationOptionsdefaults won't affect server behavior. -
The
compute_modelexception handler catches and re-raises, but thefinallyblock always clears the weight cache and GPU memory. Ifcompute_modelis called in a loop, each call starts with a clean cache. -
PyKeOps Permission Errors: When running tests, some PyKeOps test scripts attempt to write cache files to hardcoded paths such as
/home/miguel(e.g. intests/test_dependencies/test_pykeops.py). This leads to aPermissionError: [Errno 13] Permission denied. This test can be ignored or skipped in environments where/home/miguelis inaccessible. -
PyKeOps LazyTensor Operations: PyKeOps lazy evaluation has strict constraints on tensor inputs. Performing standard numpy functions or PyTorch ufuncs (like
sqrt) on a rawLazyTensorcan raiseTypeError: operand 'LazyTensor' does not support ufuncsorValueErrordue to axis expectations. Operations involving PyKeOps must strictly follow the wrapper functions or explicit lazy math structures. -
Global Pytest Path Mismatches: When calling
pytestwithout path restrictions, Python may attempt to collect from thebuild/directory if a previous compilation was performed. This causes anImportPathMismatchErrorontests/conftest.py. Always target tests explicitly or clearbuild/before running tests.