zoha39/variant-calling-pipeline

Variant Calling & Genome Mutation Detection Pipeline

A Python-based computational genomics pipeline for detecting DNA sequence mutations across multiple genome samples using FASTA sequence analysis.


Project Overview

This project simulates a simplified variant calling workflow used in computational biology and genomics research.

The pipeline compares sample DNA sequences against a reference genome and identifies genomic variants (mutations), including SNP-like changes.
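The comparison step can be sketched as a position-by-position scan. This is a minimal illustration, not the code in `src/variant_detector.py`: the function name `find_variants` is hypothetical, and it assumes the two sequences are already aligned, have equal length, and that positions are reported 1-based.

```python
from typing import Dict, List


def find_variants(reference: str, sample: str) -> List[Dict[str, object]]:
    """Return single-base substitutions where `sample` differs from `reference`.

    Assumptions (not confirmed by the project): sequences are pre-aligned,
    of equal length, and positions are 1-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")

    variants = []
    pairs = zip(reference.upper(), sample.upper())
    for position, (ref_base, sample_base) in enumerate(pairs, start=1):
        if ref_base != sample_base:
            variants.append(
                {
                    "Position": position,
                    "Reference": ref_base,
                    "Sample": sample_base,
                    # Same "REF>ALT" notation as the example output table.
                    "Mutation": f"{ref_base}>{sample_base}",
                }
            )
    return variants


if __name__ == "__main__":
    for variant in find_variants("ACGTAC", "ACTTAC"):
        print(variant)
```

A scan like this only finds substitutions. Insertions and deletions shift every downstream position, so detecting them would require an alignment step first.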


Features

  • FASTA sequence parsing
  • Reference vs sample genome comparison
  • Mutation (variant) detection
  • Multi-sample genomic analysis
  • Variant annotation
  • CSV export of detected mutations
  • Mutation visualization plots
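
To illustrate the FASTA parsing feature, here is a small standard-library sketch. The project lists BioPython among its technologies, so the real pipeline presumably relies on it; the hand-written `parse_fasta` helper below is a hypothetical stand-in that only shows the shape of the format (a `>` header line followed by one or more sequence lines).

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} mapping.

    The record ID is taken as the first whitespace-separated token after '>'.
    This is an illustrative stand-in, not the project's parser.
    """
    chunks: Dict[str, List[str]] = {}
    current_id = None
    for raw_line in lines:
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].split()
            current_id = header[0] if header else ""
            chunks[current_id] = []
        elif current_id is not None:
            # Sequence data may be wrapped across several lines.
            chunks[current_id].append(line)
    return {record_id: "".join(parts) for record_id, parts in chunks.items()}


if __name__ == "__main__":
    example = [">reference example genome", "ACGTAC", "GGTA", ">sample1", "ACTTAC"]
    print(parse_fasta(example))
```

Because the function accepts any iterable of lines, an open file handle such as `open("data/reference.fasta")` can be passed directly.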

Technologies Used

  • Python
  • BioPython
  • Pandas
  • NumPy
  • Matplotlib

Project Structure

variant-calling-pipeline/
│
├── data/
│   ├── reference.fasta
│   ├── sample1.fasta
│   ├── sample2.fasta
│   └── sample3.fasta
│
├── src/
│   └── variant_detector.py
│
├── results/
│   ├── variants.csv
│   └── variant_plot.png
│
├── requirements.txt
├── README.md
└── .gitignore


## Example Variant Output

| Position | Reference | Sample | Mutation |
|----------|-----------|--------|----------|
| 17       | C         | T      | C>T      |
| 22       | G         | A      | G>A      |
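
Rows like these could be written to CSV along the following lines. This sketch uses the standard-library `csv` module as a stand-in (the project lists Pandas, which it may use instead), and the function name `variants_to_csv` is hypothetical. The column names are taken from the example output above.

```python
import csv
import io
from typing import Dict, List

COLUMNS = ["Position", "Reference", "Sample", "Mutation"]


def variants_to_csv(variants: List[Dict[str, object]]) -> str:
    """Serialize variant records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(variants)
    return buffer.getvalue()


if __name__ == "__main__":
    rows = [
        {"Position": 17, "Reference": "C", "Sample": "T", "Mutation": "C>T"},
        {"Position": 22, "Reference": "G", "Sample": "A", "Mutation": "G>A"},
    ]
    print(variants_to_csv(rows), end="")
```

Returning a string keeps the sketch easy to test; writing the result to `results/variants.csv` is then a single file write.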


Visualization
The pipeline generates mutation position plots for genomic variant visualization.

Output example:
results/variant_plot.png

## How to Run
1. Activate the virtual environment (Windows PowerShell):
   .\venv\Scripts\Activate.ps1
2. Run the variant calling pipeline:
   python src/variant_detector.py

## Bioinformatics Concepts Used
  • Variant Calling
  • SNP Detection
  • FASTA Parsing
  • Comparative Genomics
  • Mutation Annotation
  • Genome Analysis Pipelines

## Future Improvements
  • VCF file generation
  • Sequence alignment scoring
  • Real genome dataset integration
  • Streamlit web interface
  • Phylogenetic analysis
  • Advanced mutation statistics

## Author

Built as a bioinformatics portfolio project to demonstrate computational genomics and Python-based genome analysis skills.

