
JOSS paper #369

Draft

glwagner wants to merge 23 commits into main from paper


Conversation

@glwagner (Member) commented Dec 31, 2025

This PR builds a draft of a JOSS paper. I think we should target submitting this by the end of January, or thereabouts. I am starting the paper now so that we can create a list of TODOs that we would like to finish before submitting the paper.

Here is a preliminary TODO list for publication:

@navidcy
@mmr0
@bischtob
@kaiyuan-cheng
@giordano
@danny-rosenfeld

Note: anyone is welcome to be a JOSS author if they want to contribute to this package! We want collaboration to be as wide-ranging as possible. Please speak up if you would like to be added. I am happy to accept help pushing this over the finish line ;-)

@glwagner glwagner marked this pull request as draft December 31, 2025 18:27
@giordano (Member) commented Jan 1, 2026

Just a minor organisational suggestion: don't merge the PR into main; keep the paper in a separate branch. But it's good to keep the PR open until the paper is submitted/accepted, to follow the development.

Comment thread: paper/paper.bib (outdated)
Comment thread: paper/paper.md (outdated)

An atmosphere model built on Oceananigans is ideally suited to this approach. First, Oceananigans employs relatively simple C-grid numerical methods and sits within a well-established Julia ecosystem that supports powerful usability patterns for model configuration, simulation, visualization, and post-processing [@Oceananigans]. Second, Oceananigans has demonstrated world-leading performance for ocean simulations, including recent GPU-based mesoscale-resolving climate simulations, with reported speedups of \(O(10\text{--}100\times)\) relative to existing Fortran-based codes in comparable regimes [@OceananigansArxiv; @Silvestri2025]. As a result, an atmosphere model based on Oceananigans also has the potential to become a world-class code for regional weather forecasting applications.

Existing atmospheric models face several challenges. Many legacy codes are written in Fortran and can be difficult to extend, modify, or learn for new users. While these codes achieve excellent performance, their complexity often creates barriers to entry for students and researchers from adjacent fields. Modern codes may offer improved usability but sometimes lack either the physical fidelity or computational performance required for production simulations.

Worth mentioning difficulty with porting to accelerators, especially when portability across vendors is needed? Nowadays there are various tools for running Fortran code on GPU (like OpenMP GPU offloading), but vendor support is patchy. Although this topic could be slightly controversial.


If we need references for this, I found

Comment thread: paper/paper.md (outdated)

Breeze.jl addresses these challenges by combining high performance with accessibility. Key design principles include:

1. **GPU-first architecture**: Breeze.jl is designed from the ground up for GPU computing. Leveraging KernelAbstractions.jl [@Churavy_KernelAbstractions_jl], the same code runs efficiently on both CPUs and GPUs, enabling researchers to utilize modern accelerated hardware without code modifications. This approach follows the successful model demonstrated by Oceananigans.jl [@OceananigansArxiv], which showed that high-level Julia code can achieve excellent performance across heterogeneous architectures.
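The backend-agnostic pattern described in this design principle can be sketched with a minimal KernelAbstractions.jl kernel. This is an illustrative example only, not Breeze.jl's actual source; the kernel name `scale!` and the array shapes are assumptions made for the sketch.

```julia
using KernelAbstractions

# One kernel definition runs unmodified on the CPU and on any supported
# GPU backend (CUDA, ROCm, Metal, oneAPI).
@kernel function scale!(a, @Const(b), s)
    i = @index(Global)
    @inbounds a[i] = s * b[i]
end

backend = CPU()  # swap for e.g. CUDABackend() or ROCBackend() on a GPU
b = KernelAbstractions.allocate(backend, Float64, 1024)
fill!(b, 1.0)
a = similar(b)

# Instantiate the kernel for the chosen backend, then launch it over
# the whole array.
scale!(backend)(a, b, 2.0; ndrange = length(a))
KernelAbstractions.synchronize(backend)
```

Switching hardware amounts to changing the `backend` object; the kernel body and the launch call stay the same, which is the portability property discussed in the comments above.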

Should we mention that KernelAbstractions.jl enables us to run code on multiple GPU backends? This would match the comment above about difficulties with porting legacy Fortran code to multiple GPU backends. But we should also do some tests with the other GPUs, I can help with that.


Uhm, the FFT is a problem, trying to run https://numericalearth.github.io/BreezeDocumentation/dev/literated/bomex/ on an AMD GPU I get:

julia> model = AtmosphereModel(grid; dynamics, coriolis, microphysics, advection, forcing,
                               boundary_conditions = (ρθ=ρθ_bcs, ρqᵗ=ρqᵗ_bcs, ρu=ρu_bcs, ρv=ρv_bcs))
ERROR: UndefVarError: `plan_ifft!` not defined in `AMDGPU.rocFFT`

😢


Uhm, I think this is an error on the Oceananigans side; plan_ifft! seems to be tested in AMDGPU, it's just not defined in AMDGPU.rocFFT: https://github.com/JuliaGPU/AMDGPU.jl/blob/75e4a05364720762d3a5c2d69bdbcec5d9afb5a8/test/rocarray/fft.jl#L44

@giordano commented Jan 1, 2026:

Yeah, this should work when calling existing functions:

julia> using AMDGPU, AbstractFFTs

julia> M = ROCArray(randn(ComplexF32, 4, 4))
4×4 ROCArray{ComplexF32, 2, AMDGPU.Runtime.Mem.HIPBuffer}:
  0.232252+0.28769im    0.169463-0.74924im   -0.702146-0.158628im    0.284935-0.462465im
 -0.589165+1.20839im     1.02383-0.123087im   0.124982-0.598176im   -0.507629+0.883225im
  -0.72208+0.740543im  -0.320967-0.052425im  -0.285868+1.71853im    -0.207032+0.888445im
 -0.849192-0.83669im   0.0440104-0.541001im  0.0156304-0.0877196im    1.11872+1.34555im

julia> plan = plan_ifft!(M)
0.0625 * rocFFT in-place complex inverse plan for 4×4 ROCArray of ComplexF32

julia> plan * M
4×4 ROCArray{ComplexF32, 2, AMDGPU.Runtime.Mem.HIPBuffer}:
 -0.0731404+0.216434im    0.189983+0.0470791im  -0.273808+0.067809im   -0.325081+0.0186613im
 0.00188965-0.290931im   -0.277825+0.0433258im  -0.126461+0.0458511im   0.129709+0.153547im
   -0.12079+0.0601228im  0.0257463-0.142214im   0.0982778+0.302669im    0.233898-0.0564445im
   0.188167-0.256287im    0.367389+0.134521im   0.0709181-0.0811373im   0.123379+0.0246834im


For the record, this was fixed upstream in Oceananigans by CliMA/Oceananigans.jl#4593.

@glwagner (author):

We should definitely include AMD benchmarks!


I added a sentence to highlight support for multiple GPU families, but also added a footnote to point out limitations.

@navidcy (Member) commented Jan 1, 2026

Just a minor organisational suggestion: don't merge the PR into main; keep the paper in a separate branch. But it's good to keep the PR open until the paper is submitted/accepted, to follow the development.

Definitely agree

Comment thread: paper/paper.md
@giordano (Member) commented Jan 5, 2026

I'll also mention here that JOSS announced today that they have updated their submission scope requirements, so we need to make sure we meet the new ones.

@glwagner (Member, author) commented:
So, this work has fallen behind schedule (or rather, by the wayside). We are currently working on maturing acoustic substepping. Should we reset our publication goals until we have reasonably complete acoustic substepping functionality that can be benchmarked? I think that would push submission to the end of May, or June.

4 participants