M.S. Quantitative Biology & Bioinformatics · Carnegie Mellon University (Dec 2025)
Computational biologist working at the intersection of machine learning and biology -
NGS pipelines, protein language models, and ML-driven genomic analysis.
Pittsburgh, PA · zahinp7@gmail.com · LinkedIn
- NGS pipelines - bulk RNA-seq, ATAC-seq, ChIP-seq end-to-end (FastQC, STAR, DESeq2, GSEA, MACS2, HOMER); immunotherapy response characterization from pre-treatment tumor biopsies
- ML for genomics - cancer subtype and tumor grade classification, tissue classification from bulk RNA-seq, clinical phenotype prediction
- Protein language models - ESM3, ProGen2 fine-tuning; protein interface analysis, structural visualization (PyMOL, AlphaFold2/3)
- Compound screening - molecular docking workflows (AutoDock), cheminformatics
| Project | Description | Stack |
|---|---|---|
| Melanoma Anti-PD1 RNA-seq | Bulk RNA-seq reanalysis of GSE78220 - transcriptional characterization of anti-PD1 responders vs non-responders; IPRES resistance signature recovery, immune deconvolution | Bash · STAR · DESeq2 · GSEA · clusterProfiler · SLURM |
| RNA-seq BRCA Pipeline | End-to-end bulk RNA-seq pipeline - tumor vs normal DE analysis, 1,570 DEGs, 837 GO terms, 68 KEGG pathways | Bash · STAR · DESeq2 · clusterProfiler · R |
| BRCA ER Classification | Multi-omics breast cancer classification (2,000+ samples) - ROC-AUC 0.919 | Python · XGBoost · SHAP · scikit-learn |
| ATAC-seq Pipeline | Cross-species regulatory element analysis of 500K+ peaks across human and mouse liver and pancreas | Bash · BEDTools · HALPER · MEME-ChIP · R |
| ProGen2 Fine-tuning | Fine-tuning a protein language model on GFP sequence-function data | Python · PyTorch · ProGen2 |
| GTEx Tissue Classification | Tissue type classification from bulk RNA-seq gene expression profiles | Python · scikit-learn · PCA |
| UCEC Tumor Grade | Tumor grade prediction from TCGA RNA-seq data | Python · scikit-learn |
Koes Lab & Cell Migration Lab · University of Pittsburgh (Aug-Dec 2025)
Protein interface modeling (AlphaFold, PyMOL), TF binding pattern analysis, Python pipelines for regulatory genomics
Zhao Biophotonics Lab · Carnegie Mellon University (Jan-Aug 2025)
ML-guided JC virus VP1 capsid variant design using ESM3; structural stability evaluation and evolutionary conservation analysis
Indian Council of Medical Research (Jul 2022-Jun 2023)
Molecular docking and cheminformatics screening of 50,000+ small molecules; Python/R-based compound prioritization
Programming & Scripting: Python · R · Bash · Unix/Linux · Git/GitHub
ML & Deep Learning: scikit-learn · PyTorch · TensorFlow · ESM3 · AlphaFold3 · cross-validation · feature engineering
NGS & Omics: RNA-seq · ATAC-seq · ChIP-seq · FastQC/MultiQC · fastp · Trimmomatic · STAR · HISAT2 · BWA · Salmon · featureCounts · DESeq2 · GSEA · clusterProfiler · RSeQC · immune deconvolution
Structural & Protein Bioinformatics: AlphaFold2/3 · ESM3 · PyMOL · protein interface analysis · evolutionary conservation
Data Science & Analytics: Pandas · NumPy · Matplotlib · Seaborn · statistical modeling · EDA · hypothesis testing
HPC: SLURM · Bridges-2 (PSC) · Pitt CRC · shell scripting · tmux