Releases: CompSynthBio/pyChimera
v2.0.0
What's Changed
- MScMap — generate multiple synonymous coding sequences. Useful for producing coding-sequence variants for expression modeling and sequence-level analyses. (PR #1 by @moritzburghardt)
- Make pip installable — packaging and entry point added so pyChimera can be installed via pip. (PR #2 by @moritzburghardt)
- Standalone CLI — new command-line tools for computing cARS and optimizing coding sequences (cMap / MScMap). (PR #3 by @moritzburghardt)
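To make "multiple synonymous coding sequences" concrete, here is a toy sketch of recoding a sequence while keeping its protein unchanged. It is not how MScMap picks codons (per the notes below, MScMap samples from blocks collected from reference sequences); the partial codon table and the names `sample_synonymous` and `translate` are assumptions made for illustration only.

```python
import random

# Subset of the standard genetic code, just enough for the toy example below.
SYNONYMOUS_CODONS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "L": ["CTT", "CTC", "CTA", "CTG", "TTA", "TTG"],
}
CODON_TO_AA = {c: aa for aa, codons in SYNONYMOUS_CODONS.items() for c in codons}


def translate(cds):
    """Translate a coding sequence using the partial table above."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def sample_synonymous(cds, n_seqs, seed=0):
    """Return n_seqs random recodings of cds that encode the same protein."""
    rng = random.Random(seed)
    protein = translate(cds)
    return [
        "".join(rng.choice(SYNONYMOUS_CODONS[aa]) for aa in protein)
        for _ in range(n_seqs)
    ]


# ATG AAA TTT CTG encodes M-K-F-L; every variant must translate identically.
variants = sample_synonymous("ATGAAATTTCTG", n_seqs=5)
```

Uniform random codon choice is used here only because it is the simplest thing that preserves the protein; a real optimizer would weight the choice by evidence from the reference set.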
High-level, notable changes (by file)
- README.md (modified)
  - Adds a Multi-sequence ChimeraMap (MScMap) entry and examples.
  - Shows pip install from GitHub.
  - Adds CLI usage examples and brief tutorial updates.
- chimera/chimera.py (modified — substantial)
  - calc_cARS: signature extended (added return_vec); can return the cARS vector; dtype fix np.int -> int.
  - calc_cMap: major redesign to support multiple output sequences (n_seqs).
    - New parameters: block_select ('most_freq'|'all'), min_blocks, return_vec, n_jobs.
    - Internal algorithm reworked to collect all_blocks and sample/generate variants.
    - Now raises ValueError for invalid input.
    - Returns either a single optimized sequence or a list of variants when n_seqs > 1.
  - Helper imports adjusted (most_freq_nt_prefix -> get_all_nt_blocks) and helpers renamed.
- chimera/suffix_array.py (modified)
  - Refactored; removed CR characters and reorganized helpers.
  - Added get_all_nt_blocks (replacing most_freq_nt_prefix) and adjusted NT-block collection/masking to support new MScMap flow.
- chimera/utils.py (modified)
  - Typing change to typing.Iterable.
  - Adds sample_seqs_from_blocks helper to sample multiple synonymous sequences from block lists.
- chimera/cli.py (new)
  - New CLI with two subcommands:
    - cars: compute cARS scores from FASTA reference/target -> CSV
    - cmap: optimize coding sequences (cMap / MScMap) -> FASTA
  - Supports windowing, parallel jobs (n_jobs), block selection, n_seqs, min_blocks, and other flags.
  - File I/O helpers and wiring to build_suffix_array + chimera functions.
- setup.py (new)
  - Packaging metadata and entry point: console script "chimera=chimera.cli:main".
  - install_requires (numpy, scipy) included.
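The console script "chimera=chimera.cli:main" with its two subcommands could be wired roughly as follows. This is a minimal argparse sketch, not the actual chimera/cli.py: the flag names are taken from the Quick examples in these notes (which themselves say to adjust to the actual arguments), and the defaults, required flags, and the body of `main` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser with 'cars' and 'cmap' subcommands (hypothetical flags)."""
    parser = argparse.ArgumentParser(prog="chimera")
    sub = parser.add_subparsers(dest="command", required=True)

    cars = sub.add_parser("cars", help="compute cARS scores -> CSV")
    cars.add_argument("--ref", required=True, help="reference FASTA")
    cars.add_argument("--target", required=True, help="target FASTA")
    cars.add_argument("--out", required=True, help="output CSV path")

    cmap = sub.add_parser("cmap", help="optimize coding sequences -> FASTA")
    cmap.add_argument("--input", required=True, help="target FASTA")
    cmap.add_argument("--ref", required=True, help="reference FASTA")
    cmap.add_argument("--n-seqs", type=int, default=1)  # assumed default
    cmap.add_argument("--block-select", choices=["most_freq", "all"],
                      default="most_freq")  # assumed default
    cmap.add_argument("--min-blocks", type=int, default=1)  # assumed default
    cmap.add_argument("--out", required=True, help="output FASTA path")
    return parser


def main(argv=None):
    """Entry point a console script would call; argv=None reads sys.argv."""
    args = build_parser().parse_args(argv)
    # A real implementation would read the FASTA files and call the library here.
    print(f"would run '{args.command}' with {vars(args)}")
    return 0
```

Accepting `argv` as a parameter keeps `main` testable without touching `sys.argv`, for example `main(["cars", "--ref", "r.fa", "--target", "t.fa", "--out", "c.csv"])`.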
Potential breaking changes and migration notes
- API signature changes:
  - calc_cARS(...) now accepts return_vec and n_jobs. Callers that expect a single scalar must handle a vector result when return_vec is set.
  - calc_cMap(...) signature and behavior changed: it may return a list when generating multiple variants (n_seqs > 1) or when return_vec is True. Update call sites that assume a single sequence.
- Exceptions: functions now raise ValueError on several failure cases (previous code used generic Exception).
- Type change: np.int replaced with int.
- Algorithmic/behavioral changes: block selection and backtracking logic were changed to support multi-sequence generation, so results may differ from v1.1.0 even for single-sequence runs.
- CLI/packaging: you can now install and use the command-line interface; tests and CI packaging have been added/updated.
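One way to absorb the return-type and exception changes at call sites is a thin wrapper that always yields a list of sequences. The sketch below assumes a single result is a plain `str` and does not cover `return_vec=True`, whose return shape these notes do not specify. `as_variant_list`, `optimize_variants`, and the stand-in `fake_calc_cMap` are hypothetical names, not part of pyChimera.

```python
from typing import Callable, List, Sequence, Union

SeqResult = Union[str, Sequence[str]]


def as_variant_list(result: SeqResult) -> List[str]:
    """Normalize 'one sequence or a list of sequences' to a list."""
    if isinstance(result, str):
        return [result]
    return list(result)


def optimize_variants(optimize: Callable[..., SeqResult], *args, **kwargs) -> List[str]:
    """Call an optimizer and return a list, adding context to ValueError."""
    try:
        return as_variant_list(optimize(*args, **kwargs))
    except ValueError as err:
        # v2.0.0 reports invalid input via ValueError rather than Exception.
        raise ValueError(f"sequence optimization rejected the input: {err}") from err


# Stand-in for calc_cMap so the sketch runs without pyChimera installed.
def fake_calc_cMap(target_seq, ref_seqs, n_seqs=1):
    if not target_seq:
        raise ValueError("empty target sequence")
    return target_seq if n_seqs == 1 else [target_seq] * n_seqs


single = optimize_variants(fake_calc_cMap, "ATGAAA", [], n_seqs=1)
many = optimize_variants(fake_calc_cMap, "ATGAAA", [], n_seqs=3)
```

With pyChimera installed, the same wrapper would be called as `optimize_variants(calc_cMap, target_seq, ref_seqs, n_seqs=10)`, so downstream code can iterate over the result regardless of `n_seqs`.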
Installation
- From GitHub source:
pip install git+https://github.com/CompSynthBio/pyChimera.git
- After installation, the console script is available as:
chimera
Quick examples
- Python — generate multiple synonymous sequences (MScMap via calc_cMap):
from chimera.chimera import calc_cMap
# single optimized sequence (legacy-style)
opt_seq = calc_cMap(target_seq, ref_seqs)
# generate 10 sequence variants with multiprocessing
variants = calc_cMap(target_seq, ref_seqs, n_seqs=10, block_select='all', n_jobs=4)
- CLI — compute cARS and generate variants:
# compute cARS scores (example)
chimera cars --ref reference.fasta --target target.fasta --out cars.csv
# generate 10 variants (flag names are illustrative; adjust to the actual arguments)
chimera cmap --input target.fasta --ref reference.fasta --n-seqs 10 --block-select all --min-blocks 2 --out variants.fasta
Contributors
- New contributor: @moritzburghardt made their first contribution in PR #1.