[Engine & Testing]: Reduced memory consumption to fit large sims into H100s and reduce sim compile time, slim down codebase, and avoid running complex tests #135

Merged: QuentinWach merged 25 commits into main from mem on Jun 12, 2026

@QuentinWach QuentinWach commented Jun 12, 2026

What Changed

  • The main runtime change is a compiled 3D update refactor. The branch moves away from the prior sparse-shell compiled memory mode and toward direct material/primitive-profile updates. The key pieces are explicit field-shape tracking, material conductivity/inverse-permittivity inputs, primitive CPML profile support, and dense fallback when CPML profiles are not separable.
  • 3D lossy material updates now derive coefficients in the fused loop from conductivity and inverse permittivity rather than carrying large precomputed dense source grids. That is primarily a memory reduction and consistency change.
  • 3D ModeSource compilation now builds compact local residual slabs instead of larger full-field delta arrays. The purpose is to keep compiled source specs smaller while preserving the exact full-minus-masked incident update behavior.
  • The compiled engine was made dtype-stable after PDK/mode-solver paths enable or expose x64 values. `run_compiled` now initializes fields to the compiled program precision, source patches and scan carries are cast back to the carry dtype, CPML profiles preserve their dtype, and snapshot indices avoid int32/int64 mismatches.
  • Tests were expanded around compiled/material behavior: CPML primitive profile fallback, 3D metallic edge zeroing, material coefficient equivalence, compact ModeSource residual compilation, memory reporting, and compiled/step equivalence. The expensive integration/characterization tests remain in the tree but are excluded from normal test collection.
  • Examples and old generated artifacts were aggressively pruned. Many old examples, benchmark scripts, and docs/architecture/index.html were deleted; several remaining examples were renamed to drop numeric prefixes.
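
To make the lossy-material bullet concrete, here is a minimal NumPy sketch of deriving update coefficients on the fly from conductivity and inverse permittivity instead of storing precomputed coefficient grids. It assumes the standard semi-implicit FDTD discretization and absolute permittivity. The function names `lossy_coefficients` and `update_e_lossy` are hypothetical and are not taken from this PR, whose fused loop may differ in detail.

```python
import numpy as np


def lossy_coefficients(sigma, inv_eps, dt):
    """Return (ca, cb) such that E_new = ca * E_old + cb * curl(H).

    Assumption: standard semi-implicit treatment of the conduction term,
    with inv_eps = 1 / eps (absolute permittivity).
    """
    half_loss = 0.5 * dt * sigma * inv_eps  # sigma * dt / (2 * eps)
    ca = (1.0 - half_loss) / (1.0 + half_loss)
    cb = dt * inv_eps / (1.0 + half_loss)
    return ca, cb


def update_e_lossy(e, curl_h, sigma, inv_eps, dt):
    """Hypothetical update: coefficients are computed inside the update,
    so only sigma and inv_eps need to be kept in memory."""
    ca, cb = lossy_coefficients(sigma, inv_eps, dt)
    return ca * e + cb * curl_h
```

With `sigma == 0` this reduces to the lossless update `e + dt * inv_eps * curl_h`, which is the kind of property a material coefficient equivalence test can check.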

Why

The branch is mainly about reducing memory pressure and stabilizing compiled 3D FDTD execution: smaller CPML representation where possible, no sparse-shell special mode, less dense precomputation, compact ModeSource residuals, and fewer dtype/order-sensitive failures. The later test and example changes are about keeping CI/runtime costs down and removing old or expensive artifacts.
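
The "smaller CPML representation where possible" idea can be illustrated with a short NumPy sketch. It assumes that "separable" means a 3D profile varies along a single axis only, so it can be stored as a 1D vector and broadcast when needed, while anything else stays dense. The helpers `compress_profile` and `expand_profile` are hypothetical names for illustration, not the PR's API.

```python
import numpy as np


def compress_profile(dense, axis):
    """Return a 1D profile if `dense` varies only along `axis`,
    otherwise return `dense` unchanged (dense fallback)."""
    # Take the line through index 0 of every other axis.
    index = [0] * dense.ndim
    index[axis] = slice(None)
    line = dense[tuple(index)]

    # Broadcast the line back to the full shape and compare exactly.
    shape = [1] * dense.ndim
    shape[axis] = dense.shape[axis]
    if np.array_equal(np.broadcast_to(line.reshape(shape), dense.shape), dense):
        return line.copy()  # copy keeps the original dtype
    return dense


def expand_profile(profile, axis, ndim=3):
    """Reshape a 1D profile so it broadcasts against an ndim-D field;
    dense profiles pass through untouched."""
    if profile.ndim == ndim:
        return profile
    shape = [1] * ndim
    shape[axis] = profile.shape[0]
    return profile.reshape(shape)
```

For a grid of shape `(nx, ny, nz)`, the compact form stores `ny` values instead of `nx * ny * nz`, and the dtype of the original profile is preserved in both branches.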

…est case

- Added source permittivity parameters to `apply_lossy_shell_from_lossless_3d` for improved handling of lossy shells in 3D simulations.
- Introduced a new test to validate the use of permittivity in sparse 3D sponge PML configurations.
- Ensured that the simulation correctly handles empty source conditions with the updated permittivity parameters.

…nce tests

- Simplified `fused_update_e_lossless_3d` so that its handling of the lossless source case is conditional on permittivity.
- Added a new function, `fused_update_e_lossless_3d_permittivity`, for electric-field updates that take permittivity into account.
- Added tests to validate the correctness of electric field updates with permittivity in both sparse and dense configurations.
- Improved the `fused_update_h_lossless_3d` function for clarity and performance.
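
For readers unfamiliar with permittivity-aware lossless updates, the following NumPy sketch shows the general shape of such an update for a single field component on a Yee grid. It is an illustration under stated assumptions, not the body of `fused_update_e_lossless_3d_permittivity`: the function name `update_ez_lossless`, the collocated array shapes, and the untouched boundary cells are all simplifications chosen for this example.

```python
import numpy as np


def update_ez_lossless(ez, hx, hy, inv_eps, dt, dx, dy):
    """Hypothetical lossless Ez update: Ez += dt * inv_eps * (curl H)_z.

    All arrays share one shape; cells at index 0 along x or y are left
    unchanged because their backward differences are undefined.
    `inv_eps` may be a scalar or an array broadcastable to `ez`.
    """
    curl_z = np.zeros_like(ez)
    curl_z[1:, 1:, :] = (
        (hy[1:, 1:, :] - hy[:-1, 1:, :]) / dx    # dHy/dx
        - (hx[1:, 1:, :] - hx[1:, :-1, :]) / dy  # dHx/dy
    )
    return ez + dt * inv_eps * curl_z
```

Passing `inv_eps` directly, rather than a precomputed `dt / eps` grid per component, matches the direction described in this PR of keeping material inputs compact and deriving coefficients inside the update.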
@QuentinWach QuentinWach merged commit c928980 into main Jun 12, 2026
1 check passed
@QuentinWach QuentinWach self-assigned this Jun 12, 2026