stimilsina24/HackBio-StageThree

SARS-CoV-2 Infection Dynamics in Human Bronchial Epithelial Cells


🎯 Purpose

This project analyzes single-cell RNA sequencing (scRNA-seq) data from Ravindra et al. (https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001143) to investigate the cellular response to SARS-CoV-2 infection.

Key Objectives:

  1. Characterize Cellular Dynamics: Understand changes in cellular composition during SARS-CoV-2 infection in Human Bronchial Epithelial Cells (HBECs).
  2. Gene Expression Mapping: Map the expression of ACE2 and other viral entry factors across distinct cell populations.
  3. Replication: Utilize a modified Scanpy pipeline to replicate key figures from the original study.

Experimental Design:

  • Model: In vitro Human Bronchial Epithelial Cells (HBECs).
  • Conditions:
    • Mock: No treatment (Control).
    • 1 dpi: 1 Day Post-Infection.
    • 2 dpi: 2 Days Post-Infection.
    • 3 dpi: 3 Days Post-Infection.

⚙️ Workflow

The analysis was executed using a standard Scanpy pipeline within a Jupyter Notebook, divided into two distinct phases.

Phase 1: Pre-processing & Annotation

  1. Setup: Import necessary libraries (Scanpy, AnnData, bbknn, decoupler, pandas, etc.).
  2. QC & Filtering:
    • Assessed metrics: Mitochondrial (%MT), Ribosomal (%RB), and Hemoglobin (%HB) content.
    • Removed low-quality cells: retained only cells with %MT < 10.
    • Removed doublets/empty wells: retained only cells with gene counts > 200 and genes detected in > 3 cells.
  3. Feature Selection: Identification of Highly Variable Genes (HVGs).
  4. Dimensionality Reduction: Principal Component Analysis (PCA).
  5. Clustering: UMAP projection and Leiden clustering.
  6. Cell Annotation: Automated annotation using Decoupler and PanglaoDB (top-scoring cell type per cluster).
  7. Trajectory Inference: Tracked differentiation trajectories and ACE2 expression changes using PAGA.
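The QC thresholds in step 2 are easiest to read as retention rules. The sketch below expresses them as NumPy masks on a dense cells-by-genes count matrix. The function name `qc_masks` and its parameters are illustrative and not taken from the notebook, and the Scanpy calls in the trailing comment are an assumed equivalent rather than the exact code that was run.

```python
import numpy as np

def qc_masks(counts, is_mt, min_genes=200, min_cells=3, max_pct_mt=10.0):
    """Return (keep_cells, keep_genes) boolean masks for a cells x genes matrix.

    counts : 2D array of raw counts (cells in rows, genes in columns)
    is_mt  : boolean array marking mitochondrial genes (e.g. names starting with "MT-")
    """
    counts = np.asarray(counts, dtype=float)
    is_mt = np.asarray(is_mt, dtype=bool)

    detected = counts > 0
    genes_per_cell = detected.sum(axis=1)   # number of genes seen in each cell
    cells_per_gene = detected.sum(axis=0)   # number of cells expressing each gene

    total_counts = counts.sum(axis=1)
    mt_counts = counts[:, is_mt].sum(axis=1)
    # Percent mitochondrial counts; empty cells get 0 instead of a division error.
    pct_mt = np.divide(
        100.0 * mt_counts,
        total_counts,
        out=np.zeros_like(total_counts),
        where=total_counts > 0,
    )

    # Strict inequalities follow the thresholds as written in the list above.
    keep_cells = (genes_per_cell > min_genes) & (pct_mt < max_pct_mt)
    keep_genes = cells_per_gene > min_cells
    return keep_cells, keep_genes

# Assumed Scanpy equivalent (its min_genes / min_cells cutoffs are inclusive, >=):
#   adata.var["mt"] = adata.var_names.str.startswith("MT-")
#   sc.pp.calculate_qc_metrics(adata, qc_vars=["mt"], inplace=True)
#   sc.pp.filter_cells(adata, min_genes=200)
#   sc.pp.filter_genes(adata, min_cells=3)
#   adata = adata[adata.obs["pct_counts_mt"] < 10].copy()
```

Both masks are computed on the unfiltered matrix here; applying the cell filter first and recomputing the gene mask afterwards can give slightly different gene sets.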

Phase 2: Figure Replication (Ravindra et al.)

  1. Integration: Concatenated samples to create a merged AnnData object.

  2. Batch Correction: Applied BB-kNN to align batches, mirroring the paper's methodology.

    • Before batch correction (figure: umap-batch-effect)
    • After batch correction (figure: umap-batch-effect)
  3. Visualization & Plotting:

    • Figure 3A: UMAP visualization of annotated cell types (figure: umap_annotated).
    • Figure 3B: Stacked violin plots of cell type markers (figure: Fig-3B).
    • Figure 4A: UMAP overlay of ACE2 and protease expression (figure: Fig-4A).
    • Figure 4B: Heatmap of ACE2 and related genes in ciliated cells (figure: ciliated_hm).
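The idea behind the batch-correction step is that each cell's neighbor list draws from every batch rather than mostly from its own. The following is a toy NumPy illustration of that idea only. It is not the `bbknn` package's implementation, which uses approximate neighbor search and converts distances into graph connectivities. The name `batch_balanced_neighbors` and its parameters are hypothetical.

```python
import numpy as np

def batch_balanced_neighbors(X, batches, k=3):
    """For every cell, pick its k nearest neighbors separately within each batch.

    X       : 2D array of cell embeddings (e.g. PCA coordinates), cells in rows
    batches : 1D array of batch labels, one per cell
    Returns a list with one neighbor-index list per cell.
    """
    X = np.asarray(X, dtype=float)
    batches = np.asarray(batches)

    # Dense pairwise Euclidean distances; acceptable only for toy-sized inputs.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    neighbors = []
    for i in range(X.shape[0]):
        picked = []
        for b in np.unique(batches):
            members = np.flatnonzero(batches == b)
            members = members[members != i]          # a cell is not its own neighbor
            order = np.argsort(dists[i, members], kind="stable")[:k]
            picked.extend(members[order].tolist())
        neighbors.append(picked)
    return neighbors

# Two well-separated toy batches along one axis.
X = [[0, 0], [1, 0], [2, 0], [10, 0], [11, 0], [12, 0]]
batches = ["a", "a", "a", "b", "b", "b"]
print(batch_balanced_neighbors(X, batches, k=1))
```

Even though the two toy batches are far apart, every cell ends up with one neighbor from each, which is what pulls batches together in the downstream UMAP. On an AnnData object the package call would be along the lines of `bbknn.bbknn(adata, batch_key="batch")`, where the column name `"batch"` is an assumption.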

🔬 Findings

1. Cellular Composition Changes

Figure: Barplot of proportional change in cell composition per sample (figure: cell_comp).

We observed distinct shifts in cell population abundance across the infection timeline (see the figure above):

  • Mock: Diverse epithelial landscape including Airway Epithelial, Goblet, Basal, Ciliated, Clara, and Ionocytes.
  • Infection Response (1–3 dpi):
    • 📈 Increased: Ciliated cells (dramatic increase), Goblet cells (peak at Day 2), and Neuroendocrine cells.
    • 📉 Decreased: Basal cells (dramatic reduction) and general Airway Epithelial cells.
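A composition barplot of this kind rests on per-sample proportions of each annotated cell type. The sketch below shows one way to compute them from two parallel label lists. The function name and the toy labels are made up for illustration and are not results from this analysis.

```python
from collections import Counter, defaultdict

def composition_by_sample(samples, cell_types):
    """Return {sample: {cell_type: proportion}} from per-cell labels."""
    counts = defaultdict(Counter)
    for sample, cell_type in zip(samples, cell_types):
        counts[sample][cell_type] += 1

    proportions = {}
    for sample, type_counts in counts.items():
        total = sum(type_counts.values())
        # Normalise within each sample so samples of different size are comparable.
        proportions[sample] = {ct: n / total for ct, n in type_counts.items()}
    return proportions

# Toy labels only (not data from the study).
samples = ["mock", "mock", "mock", "mock", "3dpi", "3dpi"]
cell_types = ["Basal", "Basal", "Ciliated", "Goblet", "Ciliated", "Ciliated"]
for sample, props in composition_by_sample(samples, cell_types).items():
    print(sample, props)
```

With an AnnData object, the same table can be obtained with `pd.crosstab(adata.obs["sample"], adata.obs["cell_type"], normalize="index")`, assuming those two column names exist in `adata.obs`.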

2. Biological Interpretation

The observed population dynamics align with known physiological responses to SARS-CoV-2:

  • Antiviral Defense: The acute increase in Goblet cells (mucus production to trap pathogens) and Ciliated cells (sweeping mucus/debris) suggests an active effort to clear the virus.
  • Sensing & Signaling: The rise in Neuroendocrine cells may reflect early sensing of infection-induced changes, potentially relaying signals to the immune system.
  • Differentiation: The reduction in Basal cells (progenitors) concurrent with the rise in differentiated lineages (Ciliated/Goblet) suggests that stem-like cells are actively differentiating to replenish the epithelium and fight the infection.

Note on Discrepancies: Our analysis detected "Alveolar macrophages," "Pulmonary Alveolar Type I/II," and "Mesothelial cells." As the dataset is derived from in vitro bronchial epithelial cultures, these are likely annotation artifacts arising from the reference database (PanglaoDB). Additionally, unlike the original paper, Tuft cells were not identified, likely due to differences in the marker database or clustering resolution.

3. Marker Analysis: ACE2 & ENO2

A) Figure: ACE2 and ENO2 levels with infection timeline

Panels, one per condition: Mock, 1 dpi, 2 dpi, 3 dpi.

  • ACE2 (Viral Entry): ACE2 serves as a reliable marker for infection susceptibility. While overall expression is low, the number of ACE2+ cells and expression levels increase over time compared to Mock samples.
  • ENO2: In contrast to ACE2, ENO2 is highly expressed in Mock samples but is dramatically downregulated upon infection, the opposite of the ACE2 trend.
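The ACE2 statement above combines two different quantities: how many cells express the gene at all, and how strongly they express it. A minimal sketch for reporting both per condition is given below. The function name `expression_summary` and the toy values are illustrative, and the input is assumed to be a dense 1D vector of one gene's expression across cells.

```python
import numpy as np

def expression_summary(expr, groups):
    """Summarise one gene's expression per group (e.g. per condition).

    expr   : 1D array of expression values for a single gene, one per cell
    groups : 1D array of group labels, one per cell
    """
    expr = np.asarray(expr, dtype=float)
    groups = np.asarray(groups)

    summary = {}
    for g in np.unique(groups):
        values = expr[groups == g]
        summary[str(g)] = {
            "n_cells": int(values.size),
            "frac_positive": float((values > 0).mean()),  # share of cells with any signal
            "mean_expr": float(values.mean()),            # mean over all cells in the group
        }
    return summary

# Toy values only (not data from the study).
expr = [0, 0, 0, 2, 1, 0, 3, 0]
groups = ["mock"] * 4 + ["3dpi"] * 4
print(expression_summary(expr, groups))
```

For an AnnData object, the per-cell vector could be taken with `adata.obs_vector("ACE2")` and the labels from a condition column in `adata.obs`; the name of that column is an assumption. The same function works with cell type labels as `groups`.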

B) Figure: Trajectory and PAGA analysis of dynamic changes in ACE2 at 3 dpi (figure: Traj_Pseudo_PAGA_3dpi)

  • Target Populations: Pseudotime analysis at 3 dpi reveals that Airway Goblet cells have the highest ACE2 abundance. Biologically, this suggests Goblet cells are primary targets for viral entry in this model; loss of their protective function may facilitate deeper tissue infiltration.

💡 Conclusion and Future Directions

Key Takeaways:

  • Dynamic Remodeling: Infection triggers a shift from stem-like Basal cells toward defensive lineages (Ciliated, Goblet).
  • ACE2 Kinetics: Expression increases over the infection timeline, though overall levels remain low.
  • Cell Specificity: ACE2 levels are higher in differentiated cells, suggesting they are the primary viral targets.

Future Directions:

  • Refine Annotation: Improve resolution to resolve ambiguous labels (e.g., "Epithelial cells," "Alveolar" artifacts).
  • Enrichment Analysis: Perform pathway enrichment to understand signaling mechanisms within specific clusters.
  • In Vivo Validation: Compare findings with models that capture the full physiological heterogeneity of the lung.

🧮 Core Analysis Dependencies

| Name | Version | Source | Description |
|---|---|---|---|
| scanpy | 1.11.5 | pypi | Core single-cell analysis |
| anndata | 0.11.4 | conda-forge | Data storage format |
| bbknn | 1.6.0 | pypi | Batch correction (Phase 2) |
| scrublet | 0.2.3 | pypi | Doublet detection |
| decoupler | 2.1.2 | pypi | Cell type annotation |
| scvi-tools | 1.3.3 | conda-forge | Probabilistic modeling |
| matplotlib | 3.10.7 | pypi | Plotting core |
| seaborn | 0.13.2 | pypi | Statistical data visualization |
| fa2-modified | 0.4 | pypi | ForceAtlas2 layout |
| igraph | 0.11.9 | pypi | Graph theory operations |
| umap-learn | 0.5.9.post2 | pypi | Dimensionality reduction |
| leidenalg | 0.10.2 | pypi | Community detection (Clustering) |

Full library composition is available in the scanpy_env.yml file.
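Before re-running the notebook, it can help to compare the installed package versions against the pinned ones listed above. The sketch below does this with the standard library only. The helper name `check_versions` is hypothetical, and the distribution names are assumed to match the names in the table.

```python
from importlib import metadata

# Versions copied from the dependency table above.
PINNED = {
    "scanpy": "1.11.5",
    "anndata": "0.11.4",
    "bbknn": "1.6.0",
    "scrublet": "0.2.3",
    "decoupler": "2.1.2",
    "scvi-tools": "1.3.3",
    "matplotlib": "3.10.7",
    "seaborn": "0.13.2",
    "fa2-modified": "0.4",
    "igraph": "0.11.9",
    "umap-learn": "0.5.9.post2",
    "leidenalg": "0.10.2",
}

def check_versions(pinned):
    """Return {name: (installed_version_or_None, matches_pin)} for each package."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, installed == wanted)
    return report

if __name__ == "__main__":
    for name, (installed, ok) in check_versions(PINNED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:15s} pinned={PINNED[name]:12s} installed={installed} [{status}]")
```

A mismatch does not necessarily break the analysis, but it is a reasonable first thing to check if results differ from the figures shown here.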
