BilkentCompGen/PanAirlift

 
 


PanAirLift - Graph-to-Graph Alignment Liftover

Fast alignment remapping between pangenome graphs.

Installation

Requirements

Python 3.7+
GraphAligner

Install GraphAligner

conda install -c bioconda graphaligner

Usage

python3 panairlift.py \
  source.gfa \
  target.gfa \
  input.gaf \
  output.gaf \
  --reads reads.fastq \
  --threads 8

Options

Option          Description                                        Default
--reads FILE    FASTQ/FASTA reads file for new segment alignment   None
--threads NUM   Number of threads                                  CPU count (max 8)
--kmer SIZE     K-mer size for indexing                            31
--min-id FLOAT  Minimum identity threshold                         0.9
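
The options above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the actual parser in `panairlift.py`: the positional argument names (`source_gfa`, `target_gfa`, `input_gaf`, `output_gaf`) and the `build_parser` helper are assumptions made for illustration, and only the flags and defaults come from the table.

```python
import argparse
import os


def build_parser():
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(prog="panairlift.py")
    # Positional names are illustrative; the README only shows their order.
    parser.add_argument("source_gfa")
    parser.add_argument("target_gfa")
    parser.add_argument("input_gaf")
    parser.add_argument("output_gaf")
    parser.add_argument("--reads", metavar="FILE", default=None)
    # "CPU count (max 8)": cap the default at 8 threads.
    parser.add_argument("--threads", metavar="NUM", type=int,
                        default=min(os.cpu_count() or 1, 8))
    parser.add_argument("--kmer", metavar="SIZE", type=int, default=31)
    parser.add_argument("--min-id", metavar="FLOAT", type=float, default=0.9)
    return parser


args = build_parser().parse_args(
    ["source.gfa", "target.gfa", "input.gaf", "output.gaf", "--kmer", "21"]
)
print(args.kmer, args.min_id, args.reads)
```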

Algorithm

1. Segment Mapping

Maps segments from source to target graph using:

  • Name matching: O(1) lookup by segment name
  • Hash matching: Compares hashes of segment sequences to find identical sequences quickly
  • Substring search: Find segments within larger sequences
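
The three strategies can be sketched as a cascade that tries the cheapest test first. This is an illustrative sketch only, not the code in `segment_mapper.py`: the `map_segments` function, the plain `{name: sequence}` dictionaries, the use of SHA-1, and the rule that a name match also requires an identical sequence are all assumptions.

```python
import hashlib


def map_segments(source, target):
    """Map source segment names to target segment names.

    `source` and `target` are dicts of {segment_name: sequence}.
    Returns {source_name: (target_name, method)}.
    """
    # Index target sequences by hash for constant-time exact-sequence lookup.
    target_by_hash = {}
    for name, seq in target.items():
        digest = hashlib.sha1(seq.encode()).hexdigest()
        target_by_hash.setdefault(digest, name)

    mappings = {}
    for name, seq in source.items():
        # 1. Name matching: same name (and, by assumption, same sequence).
        if target.get(name) == seq:
            mappings[name] = (name, "name")
            continue
        # 2. Hash matching: identical sequence stored under another name.
        digest = hashlib.sha1(seq.encode()).hexdigest()
        if digest in target_by_hash:
            mappings[name] = (target_by_hash[digest], "hash")
            continue
        # 3. Substring search: sequence contained in a longer target segment.
        for t_name, t_seq in target.items():
            if len(t_seq) > len(seq) and seq in t_seq:
                mappings[name] = (t_name, "substring")
                break
    return mappings


source = {"s1": "ACGT", "s2": "GGCC", "s3": "TTA", "s4": "CCCCC"}
target = {"s1": "ACGT", "t9": "GGCC", "t10": "GATTACA"}
print(map_segments(source, target))
```

In this toy input, `s4` has no counterpart in the target and is left unmapped. The substring step is a linear scan here; a k-mer index (as suggested by the `--kmer` option) would be the natural way to speed it up.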

2. Alignment Lifting

For each alignment in the input GAF:

  1. Extract path nodes from source graph
  2. Map each node to target graph
  3. Reconstruct path in target coordinates
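
The three lifting steps can be sketched on a single GAF record as below. This is a simplified illustration rather than PanAirLift's implementation: `lift_path` and `lift_gaf_line` are hypothetical helpers, the mapping is assumed to be a plain `{source_name: target_name}` dictionary, unmapped nodes are simply dropped, and the path length and start/end columns are not recomputed.

```python
import re

# A GAF path is a run of oriented segment IDs, e.g. ">s1>s2<s3".
NODE_RE = re.compile(r"([<>])([^<>]+)")


def lift_path(path, mapping):
    """Rewrite a GAF path using a source->target segment-name mapping.

    Returns (lifted_path, status): "OK" if every node was mapped,
    "PARTIAL" if at least one node had to be dropped.
    """
    nodes = NODE_RE.findall(path)                       # 1. extract nodes
    lifted = [(orient, mapping[name])                   # 2. map each node
              for orient, name in nodes if name in mapping]
    status = "OK" if len(lifted) == len(nodes) else "PARTIAL"
    return "".join(o + n for o, n in lifted), status    # 3. rebuild path


def lift_gaf_line(line, mapping):
    """Lift one tab-separated GAF record and append an LS:Z: status tag."""
    fields = line.rstrip("\n").split("\t")
    fields[5], status = lift_path(fields[5], mapping)   # column 6 is the path
    fields.append("LS:Z:" + status)
    return "\t".join(fields)


record = "read1\t100\t0\t100\t+\t>s1<s2\t200\t10\t110\t95\t100\t60"
print(lift_gaf_line(record, {"s1": "t1", "s2": "t2"}))
```

A record whose nodes all fail to map would end up with an empty path in this sketch; a real tool would need to drop or realign such records.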

3. New Segment Alignment

  1. Identify new segments in target graph
  2. Run GraphAligner on new segments
  3. Add new alignments to output
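
One way to picture these steps: target segments that no source segment maps to are treated as new, and GraphAligner is invoked on the reads. The sketch below is an assumption-laden illustration, not the tool's code: `find_new_segments` and `graphaligner_command` are hypothetical helpers, and whether PanAirLift aligns against the full target graph or a subgraph of new segments is not stated in this README. The GraphAligner flags used (`-g`, `-f`, `-a`, `-x vg`, `-t`) are its standard graph, reads, output, preset, and thread options.

```python
def find_new_segments(target, mappings):
    """Return target segment names that no source segment maps to.

    `target` is {segment_name: sequence}; `mappings` is
    {source_name: target_name}.
    """
    mapped_targets = set(mappings.values())
    return sorted(set(target) - mapped_targets)


def graphaligner_command(graph, reads, out_gaf, threads=8):
    """Build (but do not run) a GraphAligner command line."""
    return [
        "GraphAligner",
        "-g", graph,      # graph to align against
        "-f", reads,      # FASTQ/FASTA reads
        "-a", out_gaf,    # output alignments (GAF)
        "-x", "vg",       # variation-graph parameter preset
        "-t", str(threads),
    ]


target = {"t1": "ACGT", "t2": "GGCC", "t3": "TTAA"}
print(find_new_segments(target, {"s1": "t1"}))
print(graphaligner_command("target.gfa", "reads.fastq", "new.gaf"))
```

The command list can be passed to `subprocess.run(cmd, check=True)`; the resulting records would then be tagged `LS:Z:NEW` and appended to the output.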

Output Tags

Tag            Description
LS:Z:OK        Successfully lifted
LS:Z:PARTIAL   Partially lifted
LS:Z:NEW       New alignment from GraphAligner
LS:Z:MIXED     Mixed path (old + new segments)
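
To summarize an output file by lift status, the `LS:Z:` tag can be read from the optional fields that follow the 12 mandatory GAF columns. The helper below, `count_lift_status`, is a hypothetical sketch for downstream use and is not part of PanAirLift.

```python
from collections import Counter


def count_lift_status(lines):
    """Count LS:Z: status values across GAF records."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Optional tags start after the 12 mandatory GAF columns.
        for tag in fields[12:]:
            if tag.startswith("LS:Z:"):
                counts[tag[len("LS:Z:"):]] += 1
                break
    return counts


base = "read\t100\t0\t100\t+\t>t1\t200\t0\t100\t95\t100\t60"
records = [base + "\tLS:Z:OK", base + "\tLS:Z:NEW", base + "\tLS:Z:OK", base]
print(count_lift_status(records))
```

With a real file, pass the open file handle directly: `count_lift_status(open("output.gaf"))`.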

Project Structure

Airlift/
├── panairlift.py           # Main CLI
├── README.md
└── src/
    ├── core/               # I/O handlers
    │   ├── gfa_parser.py
    │   └── gaf_handler.py
    └── mapper/             # Graph mapping
        └── segment_mapper.py

Example

python3 panairlift.py \
  ecoli-100.gfa \
  ecoli-500.gfa \
  ecoli-100.gaf \
  output.gaf \
  --reads reads.fastq \
  --threads 8

Output:

======================================================================
Results
======================================================================
Lifted:      3,234
New:         2,007
Total:       5,241
======================================================================

API Usage

from src.core.gfa_parser import parse_gfa_file
from src.core.gaf_handler import GAFReader
from src.mapper.segment_mapper import AdvancedGraphMapper

# Parse graphs
source = parse_gfa_file("source.gfa")
target = parse_gfa_file("target.gfa")

# Create mapper
mapper = AdvancedGraphMapper(source, target)
mappings = mapper.create_all_mappings(min_identity=0.9)

References

  1. Kim et al. "AirLift: A Fast and Comprehensive Technique for Remapping Alignments between Reference Genomes"
  2. GFA Specification
  3. GraphAligner
