Drug Clustering Analysis Tool

An interactive data visualization tool that clusters drugs based on their pathway activation patterns, helps identify relationships between drugs with similar mechanisms of action (MoA), and visualizes these relationships in an explorable 3D space.

Overview

This tool performs several key operations:

Loads drug pathway data from CSV files
Analyzes the data using Principal Component Analysis (PCA)
Groups similar drugs using K-means clustering
Creates an interactive 3D visualization for exploring the results
Integrates mechanism of action (MoA) data to provide context

The visualization allows researchers to identify drugs with similar pathway profiles, discover potential new applications for existing drugs, and understand the relationship between pathway activation patterns and clinical effects.
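
The PCA and K-means steps listed above can be sketched with scikit-learn. This is a minimal illustration, not the code in main.py: the random matrix stands in for real NES values, and the standardization step, the choice to cluster in the reduced 3D space, and the fixed k=4 are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical input: 30 drugs x 50 pathways of NES values (random stand-in data).
rng = np.random.default_rng(0)
nes_matrix = rng.normal(size=(30, 50))

# Standardize each pathway column so no single pathway dominates the PCA.
scaled = StandardScaler().fit_transform(nes_matrix)

# Project every drug onto the first three principal components.
pca = PCA(n_components=3)
coords = pca.fit_transform(scaled)

# Group drugs in the reduced space; k=4 is an arbitrary illustrative choice.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(coords)

print(coords.shape)                   # (30, 3)
print(pca.explained_variance_ratio_)  # share of variance captured by each component
```

The three columns of `coords` are what a 3D scatter plot would use as x, y, and z, with `labels` driving the point colors.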

Features

PCA-Based Dimensionality Reduction: Condenses complex pathway data into a 3D visualization
Automatic Optimal Clustering: Selects the number of clusters using the elbow method
Interactive 3D Visualization: Explore the drug landscape with zoom, rotation, and pan controls
MoA Integration: Hover over drugs to see their mechanism of action
Cluster Analysis: See detailed statistics about each cluster
Filtering Capabilities: Highlight drugs by MoA to observe patterns
Connection Visualization: See how drugs with the same MoA relate to each other spatially
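
To illustrate the elbow method mentioned above, here is a dependency-free sketch. It picks the k whose inertia lies farthest below the straight line joining the first and last points of the inertia curve. This is a simplified stand-in heuristic, not the algorithm implemented by kneed and not necessarily what main.py does; the function name `find_elbow` and the sample inertia values are made up for the example.

```python
def find_elbow(ks, inertias):
    """Return the k whose point lies farthest below the chord between the curve's endpoints.

    Assumes at least two distinct k values and a decreasing inertia curve.
    """
    x0, y0 = ks[0], inertias[0]
    x1, y1 = ks[-1], inertias[-1]
    # Normalize both axes to [0, 1] so k and inertia are on comparable scales.
    xs = [(k - x0) / (x1 - x0) for k in ks]
    ys = [(v - y1) / (y0 - y1) for v in inertias]
    # After normalizing, the chord runs from (0, 1) to (1, 0), i.e. x + y = 1,
    # so the gap below the chord is proportional to 1 - x - y.
    gaps = [1 - x - y for x, y in zip(xs, ys)]
    return ks[gaps.index(max(gaps))]


# Made-up inertia values for k = 1..8, for illustration only.
ks = [1, 2, 3, 4, 5, 6, 7, 8]
inertias = [1000, 520, 260, 180, 150, 130, 115, 105]
print(find_elbow(ks, inertias))  # 3
```

In practice the inertia values would come from fitting K-means once per candidate k and reading each model's within-cluster sum of squares.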

Installation

# Clone the repository
git clone https://github.com/TheSaezAtienzarLab/clustering-project.git
cd clustering-project

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Usage

Preparing your data

Place your drug pathway data CSV files in the drugs_data/ directory
- Each file should be named after the drug (e.g., aspirin.csv)
- Files should contain columns for Term (pathway name) and NES (normalized enrichment score)
(Optional) Place your MoA data in drugs_association/all_matched_drugs.csv with columns for drug names and their mechanisms of action
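
As a rough sketch of how files in this layout could be combined into a single drug-by-pathway table, the example below reads each CSV with pandas and uses the file name as the drug name. The function name `load_nes_matrix`, the averaging of duplicate pathways, and the zero fill for pathways missing from a file are assumptions for illustration; main.py may handle these cases differently.

```python
from pathlib import Path

import pandas as pd


def load_nes_matrix(data_dir):
    """Build a drug x pathway table of NES values from per-drug CSV files."""
    columns = {}
    for csv_path in sorted(Path(data_dir).glob("*.csv")):
        table = pd.read_csv(csv_path, usecols=["Term", "NES"])
        # The file stem (e.g. "aspirin" for aspirin.csv) is used as the drug name.
        # If a pathway appears twice in one file, keep its mean NES (an assumption).
        columns[csv_path.stem] = table.groupby("Term")["NES"].mean()
    # Align all drugs on the union of pathways; a pathway absent from a
    # drug's file is filled with 0.0 (an assumption, not a documented rule).
    return pd.DataFrame(columns).T.fillna(0.0)
```

Calling `load_nes_matrix("drugs_data")` would then return one row per drug and one column per pathway, ready for PCA.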

Running the analysis

# Run the main analysis script
python main.py

Interpreting the visualization

Each point represents a drug
Colors represent different clusters
Hover over points to see drug name, MoA, and cluster assignment
Use the dropdown to select and highlight drugs by MoA
Toggle connections to see relationships between drugs with the same MoA
Use cluster statistics to understand the distribution of drugs
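
For readers who want to reproduce the kind of per-cluster summary described above outside the visualization, the sketch below counts drugs and MoA labels per cluster. The function name `summarize_clusters`, the "Unknown" label for drugs without MoA data, and the example values are all hypothetical; the statistics shown by the tool itself may differ.

```python
from collections import Counter, defaultdict


def summarize_clusters(drugs, labels, moa_by_drug):
    """Return, per cluster, the number of drugs and a count of each MoA."""
    moa_counts = defaultdict(Counter)
    for drug, label in zip(drugs, labels):
        # Drugs without an entry in the MoA table are grouped under "Unknown".
        moa_counts[label][moa_by_drug.get(drug, "Unknown")] += 1
    return {
        label: {"size": sum(counts.values()), "moa_counts": dict(counts)}
        for label, counts in sorted(moa_counts.items())
    }


# Made-up example values, for illustration only.
drugs = ["drug_a", "drug_b", "drug_c", "drug_d"]
labels = [0, 0, 1, 1]
moa_by_drug = {"drug_a": "MoA X", "drug_b": "MoA X", "drug_c": "MoA Y"}

for label, stats in summarize_clusters(drugs, labels, moa_by_drug).items():
    print(label, stats)
```

A cluster dominated by a single MoA suggests that the pathway profile tracks that mechanism, while a mixed cluster may point to drugs worth a closer look.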

Files

main.py: Main script that performs analysis and generates visualization
requirements.txt: List of required Python packages
results/: Directory where analysis results and visualization are saved

Requirements

Python 3.7+
pandas
numpy
scikit-learn
plotly
kneed (for finding optimal cluster number)

If you encounter issues with missing data:

Check that your drug pathway files follow the required format
Ensure the MoA data file contains the correct column names
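
A quick way to run the first check is a small standalone script such as the sketch below, which uses only the standard library. It verifies that a pathway file has the Term and NES columns and that every NES value parses as a number. The function name `check_pathway_file` is hypothetical and this script is not part of the repository.

```python
import csv


def check_pathway_file(path, required=("Term", "NES")):
    """Return a list of human-readable problems found in one drug pathway CSV."""
    problems = []
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [name for name in required if name not in header]
        if missing:
            return [f"missing column(s): {', '.join(missing)}"]
        # Line 1 is the header, so data rows start at line 2
        # (assuming no field spans multiple lines).
        for line_no, row in enumerate(reader, start=2):
            try:
                float(row["NES"])
            except (TypeError, ValueError):
                problems.append(f"line {line_no}: NES is not numeric: {row['NES']!r}")
    return problems
```

An empty list means the file passed both checks; for example, `check_pathway_file("drugs_data/aspirin.csv")` could be run on each file before starting the analysis.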

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

This tool uses Plotly for interactive visualizations
Clustering algorithms are powered by scikit-learn
MoA analysis was done using MoAble

Note: This tool is for research purposes only and should not be used for clinical decision making.
