A container of pickle slices.
Serialize Python objects to disk the same way you'd use pickle — but instead
of one monolithic file, pickle-jar splits the output into small, numbered
chunks inside a directory. This makes it easy to commit large serialized
objects (ML model weights, embeddings, datasets) to Git repositories that
enforce per-file size limits.
```bash
pip install pickle-jar
```

```python
import jar

# Save any picklable object
jar.dump(my_model.state_dict(), "model_weights")

# Load it back
weights = jar.load("model_weights")
```

The call above creates a directory called `model_weights/` containing numbered
chunk files (`0.pkl`, `1.pkl`, …). Each chunk defaults to 5 MB, small
enough to clear GitHub's file-size limits.
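To see what a jar looks like on disk, you can list the chunks directly. This is a small sketch, assuming the `model_weights/` directory from the quickstart above and the default `0.pkl`, `1.pkl`, … layout just described:

```python
from pathlib import Path

# Walk the numbered chunk files in order and report their sizes
for chunk in sorted(Path("model_weights").glob("*.pkl"), key=lambda p: int(p.stem)):
    print(f"{chunk.name}: {chunk.stat().st_size:,} bytes")
```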
`jar.dump(obj, path, chunk_size=5_000_000)`: serialize `obj` and write it as chunked `.pkl` files inside `path`.
| Parameter | Type | Description |
|---|---|---|
| `obj` | `Any` | Any picklable Python object. |
| `path` | `str \| Path` | Directory to create (overwritten if it exists). |
| `chunk_size` | `int` | Max bytes per chunk file. Default `5_000_000` (5 MB). |
Returns the number of chunk files written.
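The return value makes a handy sanity check. A minimal sketch, using only the API documented above:

```python
import jar

data = list(range(2_000_000))  # any picklable object will do

# dump returns the number of chunk files it wrote
n_chunks = jar.dump(data, "data_jar", chunk_size=1_000_000)
print(f"wrote {n_chunks} chunk(s) to data_jar/")
```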
`jar.load(path)`: reassemble and deserialize an object from a jar directory.
| Parameter | Type | Description |
|---|---|---|
| `path` | `str \| Path` | Directory previously created by `jar.dump`. |
Returns the deserialized Python object.
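A quick round trip shows both halves of the API together; a short sketch using only the calls documented above:

```python
import jar

original = {"weights": [0.1, 0.2, 0.3], "epoch": 7}
jar.dump(original, "checkpoint")

restored = jar.load("checkpoint")
assert restored == original  # the round trip preserves the object
```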
```python
# Smaller chunks for strict hosting limits
jar.dump(obj, "output", chunk_size=1_000_000)  # 1 MB per file

# Larger chunks when size limits aren't a concern
jar.dump(obj, "output", chunk_size=50_000_000)  # 50 MB per file
```

pickle-jar uses Python's `pickle` module under the hood. `pickle.loads()` can execute arbitrary code. Never load jar directories from untrusted sources. This is the same caveat that applies to `pickle`, `torch.load`, and similar serialization tools.
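One mitigation, purely illustrative and not part of pickle-jar's API, is to fingerprint the chunk files and only call `jar.load` when the digest matches one recorded from a copy you trust (`EXPECTED_DIGEST` and `jar_digest` below are hypothetical names for this sketch):

```python
import hashlib
from pathlib import Path

import jar

EXPECTED_DIGEST = "..."  # hypothetical: recorded from a trusted copy

def jar_digest(path: str) -> str:
    """Hash all chunk files in a stable, numbered order."""
    h = hashlib.sha256()
    for chunk in sorted(Path(path).glob("*.pkl"), key=lambda p: int(p.stem)):
        h.update(chunk.read_bytes())
    return h.hexdigest()

if jar_digest("model_weights") == EXPECTED_DIGEST:
    weights = jar.load("model_weights")
else:
    raise RuntimeError("model_weights/ does not match the trusted digest")
```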
```bash
# Clone and set up
git clone https://github.com/jkvc/pickle-jar.git
cd pickle-jar
uv venv .venv && source .venv/bin/activate
uv pip install -e ".[dev]"

# Run tests & lint
pytest tests/ -v
ruff check .
```