This repository houses a series of Jupyter notebooks for analyzing Xenium Prime 5k experimental data. Each notebook addresses a specific stage of the analysis pipeline, from converting raw data into efficient formats to performing advanced downstream analyses. The goal is to streamline the workflow and provide clarity at every step of the process.
- Purpose:
  This notebook converts all the Xenium datasets into the Zarr format. The Zarr format offers a compact version of the spatial dataset that loads efficiently into `SpatialData` objects during analysis.
- Key Functions and Processes:
  - Input/Output:
    - Input: Xenium dataset directory.
    - Output: A Zarr file storing the converted data.
  - Steps Involved:
    - Dataset Reading: Loads the Xenium dataset using the `spatialdata_io.xenium` function.
    - Directory Management: Checks if the target Zarr directory exists and removes it if necessary.
    - Data Conversion: Writes the loaded `SpatialData` object to the specified Zarr file.
    - Iteration: Loops through directories in the input path to process multiple datasets by dynamically generating output file paths.
- Notebook Location: Xenium to Zarr Conversion Notebook
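
The conversion steps above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names `read_xenium` and `convert_all`, the `<dataset name>.zarr` naming scheme, and the injectable `reader` argument are assumptions made for the example. It relies only on `spatialdata_io.xenium` for reading and on the `write` method of the returned `SpatialData` object.

```python
import shutil
from pathlib import Path


def read_xenium(dataset_dir):
    """Load one Xenium output directory as a SpatialData object."""
    # Imported lazily so the path handling below can be exercised
    # without spatialdata-io installed.
    from spatialdata_io import xenium

    return xenium(str(dataset_dir))


def convert_all(input_root, output_root, reader=read_xenium):
    """Convert every dataset directory under input_root to <name>.zarr.

    `reader` is a hypothetical hook: any callable that takes a directory
    and returns an object with a `write(path)` method.
    """
    input_root, output_root = Path(input_root), Path(output_root)
    output_root.mkdir(parents=True, exist_ok=True)

    written = []
    for dataset_dir in sorted(p for p in input_root.iterdir() if p.is_dir()):
        # Output path is generated from the dataset directory name.
        zarr_path = output_root / f"{dataset_dir.name}.zarr"

        # Remove a stale store so the write starts from a clean target.
        if zarr_path.exists():
            shutil.rmtree(zarr_path)

        sdata = reader(dataset_dir)
        sdata.write(str(zarr_path))
        written.append(zarr_path)
    return written
```

Calling `convert_all("path/to/xenium_runs", "path/to/zarr")` (both paths are placeholders) would then produce one Zarr store per dataset directory.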
- Purpose:
  These notebooks demonstrate how to use the SpatialData and Bento packages to analyze spatial transcriptomics data from the Xenium dataset. They focus on calculating key shape and transcript features, as well as performing localization analysis.
- Key Components and Analysis:
  - Load Data:
    - Loads a Xenium dataset from a pre-converted Zarr file into a `SpatialData` object.
    - Prepares the data for downstream analysis with Bento.
  - Shape and Point Features:
    - Computes metrics such as soma and nuclear area, aspect ratio, and transcript density to assess data quality and cellular morphology.
  - Localization Analysis:
    - Classifies gene transcripts into distinct localization patterns (Cell Edge, Cytoplasmic, Nuclear, Nuclear Edge, and None) using the RNAForest model.
    - Visualizes these patterns with an UpSet plot and a radar plot to compare gene-specific localization strengths.
- Notebook Location: Coronal 1 Cortex: Feature and Localization Analysis Notebook
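
To make the shape and point features above concrete, here is a small pure-Python sketch of what area, aspect ratio, and transcript density measure for a single cell outline. It is not Bento's implementation: the function names are hypothetical, and the aspect ratio here uses the axis-aligned bounding box as a simplification, which may differ from how the notebooks compute it.

```python
def polygon_area(vertices):
    """Area of a simple polygon (list of (x, y) tuples) via the shoelace formula."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def bbox_aspect_ratio(vertices):
    """Long side over short side of the axis-aligned bounding box (simplification)."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    return max(width, height) / min(width, height)


def transcript_density(n_transcripts, vertices):
    """Transcripts per unit area of the outline."""
    return n_transcripts / polygon_area(vertices)


# Example: a 4 x 2 rectangular outline containing 16 transcripts.
outline = [(0, 0), (4, 0), (4, 2), (0, 2)]
area = polygon_area(outline)
ratio = bbox_aspect_ratio(outline)
density = transcript_density(16, outline)
```

Unusually low density or extreme aspect ratios in such metrics are the kind of signal that can flag segmentation problems during quality control.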
- Purpose:
  These notebooks align Xenium imaging data to synaptic immunofluorescence (IF) images. The objective is to align DAPI, presynaptic, and postsynaptic images using transformation matrices, crop the data to the region of interest (ROI), and visualize the quality of the alignment.
- Key Components and Analysis:
  - Create SpatialData Object:
    - Loads a SpatialData object from a pre-converted Zarr file that contains imaging and morphological data.
  - Align Images:
    - Uses an alignment matrix to transform and parse the DAPI, presynaptic, and postsynaptic IF images into `Image2DModel` objects.
  - Visualization:
    - Overlays the DAPI image with aligned IF images to verify correct registration.
  - ROI Cropping:
    - Determines the region of interest using bounding box calculations and crops the SpatialData object accordingly.
  - Saving Results:
    - The processed SpatialData object is saved as a new Zarr file for downstream analysis.
- Notebook Location: Coronal 1 Cortex: Synaptic Immunofluorescence Alignment Notebook
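
The sketch below illustrates one way the alignment matrix and the bounding box calculation could fit together: the four corners of an IF image are pushed through a 3x3 affine matrix in homogeneous coordinates, and the extent of the result gives a candidate ROI. The function names are hypothetical, and the assumption that the matrix maps IF pixel coordinates into the Xenium coordinate system should be checked against how the matrix was exported.

```python
def apply_affine(matrix, point):
    """Apply a 3x3 affine matrix (row-major nested lists) to an (x, y) point."""
    x, y = point
    (a, b, tx), (c, d, ty), _ = matrix  # last row is (0, 0, 1) for an affine map
    return (a * x + b * y + tx, c * x + d * y + ty)


def transformed_bbox(matrix, width, height):
    """Bounding box of a width x height image after the affine transform.

    Returns ((min_x, min_y), (max_x, max_y)) in the target coordinate system.
    """
    corners = [(0, 0), (width, 0), (width, height), (0, height)]
    moved = [apply_affine(matrix, corner) for corner in corners]
    xs, ys = zip(*moved)
    return (min(xs), min(ys)), (max(xs), max(ys))


# Example: scale by 2 and translate by (10, 5).
alignment = [
    [2, 0, 10],
    [0, 2, 5],
    [0, 0, 1],
]
roi_min, roi_max = transformed_bbox(alignment, width=100, height=50)
```

The resulting minimum and maximum coordinates are the kind of values a bounding-box crop of the SpatialData object would take as its limits.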
- Purpose:
  These notebooks perform an in-depth analysis of transcripts in multiple subcellular compartments. They focus on segmenting immunofluorescence images to extract presynaptic and postsynaptic regions, mapping transcript points to these compartments, and calculating quantitative metrics.
- Key Components and Analysis:
  - Data Loading:
    - Loads an aligned SpatialData object from a pre-converted Zarr file containing IF imaging data.
  - Utility Functions:
    - Defines custom plotting functions to render images and transcript point scatterplots.
  - Segmentation:
    - Implements segmentation of the presynaptic and postsynaptic regions using preprocessing steps (gamma adjustment, Gaussian filtering, erosion) followed by multi-Otsu thresholding.
  - Labeling and Mapping:
    - Converts segmentation outputs into label elements.
    - Extracts pixel coordinates and maps transcript points (both unassigned and soma-inclusive) to synaptic compartments.
  - Counts Calculation:
    - Computes transcript counts across different compartments (e.g., exclusive presynaptic, postsynaptic, overlapping areas, nucleus, and soma) and calculates derived metrics by excluding synaptic signals.
    - Saves the calculated counts to a CSV file for downstream analysis.
  - Visualization:
    - Provides extensive visualizations including raw IF images, segmentation overlays, combined segmentation masks, and gene-specific transcript plots (e.g., for Dlg4) with scale bars and integrated visual elements.
- Notebook Location: Coronal 1 Cortex: Synaptic Compartment Analysis Notebook
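
The mapping and counting steps above can be illustrated with a small pure-Python sketch. It assumes the segmentation has already produced two binary masks of equal shape (rows indexed by y, columns by x) and that transcript coordinates are in the same pixel space. The function names and compartment labels are hypothetical, and nucleus and soma assignment is left out for brevity.

```python
import math
from collections import Counter


def classify_point(x, y, pre_mask, post_mask):
    """Assign one transcript to a synaptic compartment by pixel lookup."""
    row, col = math.floor(y), math.floor(x)
    in_bounds = 0 <= row < len(pre_mask) and 0 <= col < len(pre_mask[0])
    if not in_bounds:
        return "outside"
    in_pre = bool(pre_mask[row][col])
    in_post = bool(post_mask[row][col])
    if in_pre and in_post:
        return "overlap"
    if in_pre:
        return "presynaptic_only"
    if in_post:
        return "postsynaptic_only"
    return "outside"


def count_by_compartment(transcripts, pre_mask, post_mask):
    """Count transcripts per (gene, compartment) from (gene, x, y) tuples."""
    counts = Counter()
    for gene, x, y in transcripts:
        counts[(gene, classify_point(x, y, pre_mask, post_mask))] += 1
    return counts


# Toy 2 x 3 masks and five transcripts of one gene.
pre_mask = [[1, 1, 0], [0, 0, 0]]
post_mask = [[0, 1, 1], [0, 0, 0]]
transcripts = [
    ("Dlg4", 0.5, 0.2),
    ("Dlg4", 1.5, 0.5),
    ("Dlg4", 2.1, 0.9),
    ("Dlg4", 0.5, 1.5),
    ("Dlg4", 5.0, 5.0),
]
counts = count_by_compartment(transcripts, pre_mask, post_mask)
```

Keeping the overlap as its own category is what allows exclusive presynaptic and postsynaptic counts to be reported separately; the resulting table can then be written to CSV.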