This project implements a complete workflow for analyzing PTCH2 gene mutations in medulloblastoma datasets, with potential applications in pediatric oncology research and clinical-genomic integration.
The workflow includes:
- Data curation – import and structure genomic mutation data from validated sources.
- Preprocessing & cleaning – remove inconsistencies, filter irrelevant entries, and standardize formats.
- Visualization – generate high-quality plots to highlight mutation frequency and distribution.
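The curation and cleaning steps above can be sketched with Pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`Gene`, `Sample_ID`, `Primary_Site`, `Mutation_Type`), the `clean_mutations` helper, the site label `central_nervous_system`, and the toy records are all assumptions made for the example.

```python
import pandas as pd

# Toy records standing in for an exported mutation table.
# All values are made up for illustration; they are not project data.
raw = pd.DataFrame({
    "Gene": ["PTCH2", "ptch2 ", "TP53", "PTCH2", "PTCH2", None],
    "Sample_ID": ["S1", "S1", "S2", "S3", "S3", "S4"],
    "Primary_Site": [
        "central_nervous_system", "central_nervous_system",
        "central_nervous_system", "skin", "skin",
        "central_nervous_system",
    ],
    "Mutation_Type": ["Missense", "silent", "Missense",
                      "Missense", "Missense", "Silent"],
})


def clean_mutations(df, gene="PTCH2", site="central_nervous_system"):
    """Hypothetical helper: standardize, deduplicate, and filter the table."""
    out = df.copy()
    # Standardize column names to lowercase.
    out.columns = [c.strip().lower() for c in out.columns]
    # Drop rows missing the fields the analysis depends on.
    out = out.dropna(subset=["gene", "sample_id"])
    # Normalize text values so "ptch2 " and "PTCH2" compare equal.
    out = out.assign(
        gene=out["gene"].str.strip().str.upper(),
        mutation_type=out["mutation_type"].str.strip().str.lower(),
    )
    # Remove exact duplicate records.
    out = out.drop_duplicates()
    # Keep only the gene and tissue of interest.
    mask = (out["gene"] == gene) & (out["primary_site"] == site)
    return out.loc[mask].reset_index(drop=True)


cleaned = clean_mutations(raw)
print(cleaned)
```

Normalizing text before deduplicating and filtering matters here: otherwise records that differ only in case or whitespace would be silently dropped by the gene filter.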
Full analysis notebook: Open in Google Colab
Tech stack: Python, Pandas, Matplotlib, Seaborn
This repository can serve as a template for reproducible mutation analysis pipelines in other genomic projects.
The analysis focused on PTCH2 mutations in a medulloblastoma dataset, filtered for Central Nervous System (CNS) samples.
Key findings:
- Mutation types: Only missense and silent alterations were detected in PTCH2.
- Most mutated sample: Sample ID 2813454.
- Mutation frequency: PTCH2 alterations account for a notable share of mutations in the CNS subset.
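Summaries like the ones above could be derived from a filtered table along these lines. This is a sketch under assumptions: the column names, the sample labels, and the `total_cns_mutations` figure are hypothetical, and the numbers are invented for the example rather than taken from the analysis.

```python
import pandas as pd

# Made-up PTCH2 rows from a CNS-filtered table (not the project's data).
ptch2_cns = pd.DataFrame({
    "sample_id": ["A", "A", "A", "B", "C", "C"],
    "mutation_type": ["missense", "silent", "missense",
                      "missense", "silent", "missense"],
})

# Hypothetical total number of mutation records (all genes) in the CNS subset.
total_cns_mutations = 40

# Count of each mutation type observed in PTCH2.
type_counts = ptch2_cns["mutation_type"].value_counts()

# Number of PTCH2 mutations per sample, and the sample with the most.
per_sample = ptch2_cns.groupby("sample_id").size()
most_mutated_sample = per_sample.idxmax()

# Fraction of CNS mutation records that fall in PTCH2.
ptch2_share = len(ptch2_cns) / total_cns_mutations

print(type_counts)
print("Most mutated sample:", most_mutated_sample)
print(f"PTCH2 share of CNS mutations: {ptch2_share:.1%}")
```

Reporting the share as an explicit fraction makes the frequency finding easier to compare across datasets.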
You are free to use, modify, and distribute this code, provided that proper credit is given.