Improve GPU performance of fluid-* interact #1116
efaulhaber wants to merge 10 commits into trixi-framework:main
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #1116 +/- ##
===========================================
- Coverage 89.06% 67.15% -21.91%
===========================================
Files 128 128
Lines 9868 9866 -2
===========================================
- Hits 8789 6626 -2163
- Misses 1079 3240 +2161
Pull request overview
Refactors the fluid-* interact! implementation to improve GPU performance by unrolling the point-neighbor iteration into a per-particle threaded loop and reducing repeated loads/writes.
Changes:
- Rewrites WCSPH `interact!` to thread over particles, use `foreach_neighbor`, and accumulate per-particle RHS contributions before writing.
- Introduces `foreach_neighbor` wrapper with a GPU-unsafe fast path (bounds-check-free) for neighbor traversal.
- Adds a 3D WCSPH `ContinuityDensity` fast path that combines velocity+density loads using SIMD wide loads; updates correction and pressure helper APIs accordingly.
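The combined velocity+density load can be illustrated with a small plain-Julia sketch. It assumes a 4×N array whose columns hold `(vx, vy, vz, rho)` per particle, and it reads the four entries with ordinary indexing as a stand-in for a 4-wide SIMD load. The function body and the layout are assumptions for illustration, not the code from this PR.

```julia
# Assumed layout (hypothetical): column `i` of `v` stores (vx, vy, vz, rho)
# for particle `i`, so the four values are contiguous in memory.

# Separate helpers: velocity and density are fetched in two independent steps.
velocity(v, i) = (v[1, i], v[2, i], v[3, i])
density(v, i) = v[4, i]

# Combined helper: fetch all four contiguous entries of column `i` together.
# This is a plain-Julia stand-in for a wide SIMD load, not an actual SIMD call.
# `@inbounds` assumes the caller guarantees that `i` is a valid column index.
function velocity_and_density_sketch(v, i)
    vals = ntuple(d -> @inbounds(v[d, i]), Val(4))
    return (vals[1], vals[2], vals[3]), vals[4]
end

v = Float32[1 4; 2 5; 3 6; 1000 998]
vel, rho = velocity_and_density_sketch(v, 2)
```

Both variants return the same values; the combined form only changes how the data is fetched, which is where a real wide load would save memory transactions.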
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/schemes/fluid/weakly_compressible_sph/rhs.jl | Main WCSPH interact! refactor; adds neighbor_pressure and velocity_and_density helpers + SIMD load fast path. |
| src/schemes/fluid/implicit_incompressible_sph/rhs.jl | Switches to neighbor_pressure helper to avoid redundant pressure loads / enable mirroring behavior. |
| src/general/neighborhood_search.jl | Adds foreach_neighbor wrapper with a GPU specialization calling PointNeighbors.foreach_neighbor_unsafe. |
| src/general/corrections.jl | Adjusts free_surface_correction API to accept (rho_a, rho_b) and compute rho_mean internally. |
| src/TrixiParticles.jl | Adds SIMD import required by the new 3D fast path. |
| Project.toml | Adds SIMD dependency and bumps PointNeighbors compat to 0.6.6. |
/run-gpu-tests
This PR rewrites the `interact!` function for fluid-* interaction and is part of #1131.
- The `foreach_point_neighbor` loop is now unrolled into an `@threaded for particle` and `foreach_neighbor`.
- Load the data of particle `a` once instead of loading it again for each neighbor.
- Accumulate `dv` values over all neighbors and write to `dv` only once per particle.
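The restructuring described above can be sketched in plain Julia. This is a minimal illustration under stated assumptions: `neighbors` is a hypothetical precomputed neighbor list, `pair_force` is a toy interaction rather than the WCSPH right-hand side, and `Threads.@threads` stands in for the package's `@threaded` macro.

```julia
# Toy pairwise interaction (hypothetical), standing in for the real RHS terms.
pair_force(xa, xb) = xb - xa

# Before: a flat loop over (particle, neighbor) pairs. `x[a]` is re-read and
# `dv[a]` is updated in memory once for every pair.
function interact_pairwise!(dv, x, neighbors)
    for a in eachindex(x), b in neighbors[a]
        dv[a] += pair_force(x[a], x[b])
    end
    return dv
end

# After: one threaded iteration per particle with an inner neighbor loop.
function interact_per_particle!(dv, x, neighbors)
    Threads.@threads for a in eachindex(x)
        xa = x[a]                  # load the data of particle `a` once
        dv_a = zero(eltype(dv))    # accumulate in a local variable
        for b in neighbors[a]
            dv_a += pair_force(xa, x[b])
        end
        dv[a] += dv_a              # write to `dv` once per particle
    end
    return dv
end

x = [0.0, 1.0, 3.0]
neighbors = [[2, 3], [1, 3], [1, 2]]   # hypothetical neighbor lists
dv_pairwise = interact_pairwise!(zeros(3), x, neighbors)
dv_particle = interact_per_particle!(zeros(3), x, neighbors)
```

Because each iteration writes only to its own `dv[a]`, the per-particle loop has no write conflicts between threads, and the local accumulator replaces repeated global-memory updates with a single store.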