zahinp7/README.md

Zahin Peerzade

M.S. Quantitative Biology & Bioinformatics · Carnegie Mellon University (Dec 2025)
Computational biologist working at the intersection of machine learning and biology - NGS pipelines, protein language models, and ML-driven genomic analysis.

Pittsburgh, PA · zahinp7@gmail.com · LinkedIn


What I work on

  • NGS pipelines - bulk RNA-seq, ATAC-seq, ChIP-seq end-to-end (FastQC, STAR, DESeq2, GSEA, MACS2, HOMER); immunotherapy response characterization from pre-treatment tumor biopsies
  • ML for genomics - cancer subtype and tumor grade classification, tissue classification from bulk RNA-seq, clinical phenotype prediction
  • Protein language models - ESM3, ProGen2 fine-tuning; protein interface analysis, structural visualization (PyMol, AlphaFold2/3)
  • Compound screening - molecular docking workflows (AutoDock), cheminformatics

Featured Projects

| Project | Description | Stack |
| --- | --- | --- |
| Melanoma Anti-PD1 RNA-seq | Bulk RNA-seq reanalysis of GSE78220 - transcriptional characterization of anti-PD1 responders vs non-responders; IPRES resistance signature recovery, immune deconvolution | Bash · STAR · DESeq2 · GSEA · clusterProfiler · SLURM |
| RNA-seq BRCA Pipeline | End-to-end bulk RNA-seq pipeline - tumor vs normal DE analysis, 1,570 DEGs, 837 GO terms, 68 KEGG pathways | Bash · STAR · DESeq2 · clusterProfiler · R |
| BRCA ER Classification | Multi-omics breast cancer classification (2,000+ samples) - ROC-AUC 0.919 | Python · XGBoost · SHAP · scikit-learn |
| ATAC-seq Pipeline | Cross-species regulatory element analysis, 500K+ peaks, human & mouse liver/pancreas | Bash · BEDTools · HALPER · MEME-ChIP · R |
| ProGen2 Fine-tuning | Fine-tuning a protein language model on GFP sequence-function data | Python · PyTorch · ProGen2 |
| GTEx Tissue Classification | Tissue type classification from bulk RNA-seq gene expression profiles | Python · scikit-learn · PCA |
| UCEC Tumor Grade | Tumor grade prediction from TCGA RNA-seq data | Python · scikit-learn |

Research Experience

Koes Lab & Cell Migration Lab · University of Pittsburgh (Aug-Dec 2025)
Protein interface modeling (AlphaFold, PyMol), TF binding pattern analysis, Python pipelines for regulatory genomics

Zhao Biophotonics Lab · Carnegie Mellon University (Jan-Aug 2025)
ML-guided JC virus VP1 capsid variant design using ESM3; structural stability evaluation and evolutionary conservation analysis

Indian Council of Medical Research (Jul 2022-Jun 2023)
Molecular docking and cheminformatics screening of 50,000+ small molecules; Python/R-based compound prioritization


Skills

Programming & Scripting: Python · R · Bash · Unix/Linux · Git/GitHub

ML & Deep Learning: scikit-learn · PyTorch · TensorFlow · ESM3 · AlphaFold3 · cross-validation · feature engineering

NGS & Omics: RNA-seq · ATAC-seq · ChIP-seq · FastQC/MultiQC · fastp · trimmomatic · STAR · HISAT2 · BWA · Salmon · featureCounts · DESeq2 · GSEA · clusterProfiler · RSeQC · immune deconvolution

Structural & Protein Bioinformatics: AlphaFold2/3 · ESM3 · PyMol · protein interface analysis · evolutionary conservation

Data Science & Analytics: Pandas · NumPy · Matplotlib · Seaborn · statistical modeling · EDA · hypothesis testing

HPC: SLURM · Bridges-2 (PSC) · Pitt CRC · shell scripting · tmux

Pinned

  1. BRCA

    Multi-omics breast cancer (ER status) classification using TCGA-BRCA (Logistic Regression, Random Forest, XGBoost, SVM) - ROC-AUC 0.919

    Jupyter Notebook

  2. RNA-seq-BRCA-pipeline

    End-to-end RNA-seq pipeline for BRCA tumor vs normal differential expression analysis

    R

  3. progen2-gfp

    Fine-tuning ProGen2 protein language model on GFP sequence-function data for brightness prediction

    Python

  4. ATAC-seq-regulatory-element-pipeline

    Comparative ATAC-seq pipeline for cross-species regulatory element analysis across liver and pancreas tissues in human and mouse - HALPER, BEDTools, MEME-ChIP, ChIPseeker

    HTML

  5. gtex-tissue-classification

    PCA, K-Means, and Logistic Regression on GTEx bulk RNA-seq to classify tissue types from gene expression profiles

    Jupyter Notebook

  6. Melanoma_RNAseq

    Bulk RNA-seq pipeline: transcriptional characterization of anti-PD1 responders vs non-responders in metastatic melanoma

    Shell