Using dask-backed arrays when the particle number is large

1. When the particle number is large, say 10000, the total number of particle pairs is 10000 * (10000 - 1) / 2 = 49995000.  This is so large that the pair-information dataset alone could be 3GB:

<img width="885" height="280" alt="Image" src="https://github.com/user-attachments/assets/091eb57f-31fb-4633-a7b1-bb5eae2b3e14" />

This is large enough that we need to chunk along the pair dimension.  For 10000 (ntraj) particles, we can use a chunk size of 100000 (10*ntraj) pairs, which keeps memory usage low without producing too many chunks (about 500 in this case).
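As a rough check of the numbers above, here is a minimal sketch of the arithmetic. It assumes one float64 value (8 bytes) per pair for a single variable; the actual dtype and number of variables in the pair-information dataset are not stated here, so the byte counts are only illustrative.

```python
import math

ntraj = 10_000                       # number of particles (value from the text)
n_pairs = ntraj * (ntraj - 1) // 2   # unique unordered particle pairs
chunk = 10 * ntraj                   # proposed chunk length along the pair dimension
n_chunks = math.ceil(n_pairs / chunk)

# Assumption: a single variable stores one float64 (8 bytes) per pair.
bytes_per_var = n_pairs * 8
bytes_per_chunk = chunk * 8

print("pairs:", n_pairs)
print("chunks:", n_chunks)
print("MB per float64 variable:", bytes_per_var / 1e6)
print("MB per chunk of one variable:", bytes_per_chunk / 1e6)
```

Under that assumption, each chunk of a single variable is well under 1 MB, while the chunk count stays in the hundreds rather than the thousands.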

2. When the number of pairs is large, we need to chunk again when loading variables into memory.  This chunk size should be set when initializing the `RelativeDispersion` class.
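One way to picture the second level of chunking is to walk the pair dimension in fixed-size slices and load only one slice at a time. The sketch below is illustrative only: `iter_pair_slices` and its `chunk` argument are hypothetical names, not part of the package, and how the chunk size would actually be passed to `RelativeDispersion` is left open.

```python
from typing import Iterator


def iter_pair_slices(n_pairs: int, chunk: int) -> Iterator[slice]:
    """Yield consecutive slices that cover range(n_pairs) in blocks of `chunk`.

    Hypothetical helper: each slice could be used to load one block of
    pair variables into memory at a time.
    """
    if chunk <= 0:
        raise ValueError("chunk must be a positive integer")
    for start in range(0, n_pairs, chunk):
        # The last slice is shortened so it never runs past n_pairs.
        yield slice(start, min(start + chunk, n_pairs))


# Small example: 10 pairs processed in blocks of 4.
slices = list(iter_pair_slices(10, 4))
print(slices)
```

Each yielded slice covers a contiguous block of pairs, so only one block of any variable needs to be in memory at a time.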

The above two points may require a heavy refactor of the package, as dask-backed arrays cannot be used in methods such as `isel`.
