using dask-backed array when particle number is large #1

@miniufo

Description

  1. When the number of particles is large, say 10000, the total number of particle pairs is 10000 * (10000 - 1) / 2 = 49995000. This is so large that the pair-information dataset alone can reach 3GB.

A dataset this large needs to be chunked along the pair dimension. For 10000 (ntraj) particles, we can use a chunk size of 100000 (10*ntraj) pairs, which keeps memory usage low without producing too many chunks (about 500 in this case).

  2. When the number of pairs is large, we also need to chunk when loading variables into memory. This chunk size should be set when initializing the RelativeDispersion class.

The two points above may require a heavy refactor of the package, since dask-backed arrays cannot be used in isel and similar methods.
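
To make the numbers above concrete, here is a minimal sketch of the chunking arithmetic and of iterating over particle pairs one chunk at a time. It is not the package's API: `pair_chunk_plan`, `iter_pair_chunks`, and the `chunk_factor` parameter are hypothetical names used only for illustration, and plain Python iterators stand in for dask chunks.

```python
import math
from itertools import combinations, islice


def pair_chunk_plan(ntraj, chunk_factor=10):
    """Return (npair, chunk_size, nchunk) for ntraj particles.

    chunk_factor is a hypothetical knob: chunk_size = chunk_factor * ntraj,
    following the 10*ntraj suggestion in the issue.
    """
    npair = ntraj * (ntraj - 1) // 2           # number of unique pairs
    chunk_size = chunk_factor * ntraj          # pairs per chunk
    nchunk = math.ceil(npair / chunk_size)     # last chunk may be partial
    return npair, chunk_size, nchunk


def iter_pair_chunks(ntraj, chunk_size):
    """Yield lists of (i, j) index pairs, at most chunk_size pairs at a time.

    Only one chunk is held in memory, instead of all ntraj*(ntraj-1)/2 pairs.
    """
    pairs = combinations(range(ntraj), 2)      # lazy generator of (i, j), i < j
    while True:
        chunk = list(islice(pairs, chunk_size))
        if not chunk:
            return
        yield chunk


npair, chunk_size, nchunk = pair_chunk_plan(10_000)
print(npair, chunk_size, nchunk)               # 49995000 100000 500

# Small example: 5 particles give 10 pairs, split into chunks of 4, 4 and 2.
print([len(c) for c in iter_pair_chunks(5, 4)])
```

In a dask-backed version, the same `chunk_size` would presumably be passed as the chunk length along the pair dimension, so each task only touches one block of pairs.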

Labels

enhancement: New feature or request
