My journey through machine learning — from data fundamentals to deep learning.
46 / 100 days complete
├── phase-1-data-foundations/ # Days 1-15
├── phase-2-classical-ml/ # Days 16-40
└── phase-3-deep-learning/ # Days 41-60 (in progress)

Phase 1: Data Foundations (Days 1-15)

| Day | Topic | Dataset |
|---|---|---|
| 1 | NumPy & Pandas Fundamentals | Synthetic |
| 2 | Missing Data & Imputation | Healthcare |
| 3 | Distribution Analysis | Retail Sales |
| 4 | Outlier Detection | Credit Card Transactions |
| 5 | Feature Scaling | Mixed Numerical |
| 6 | Encoding Categoricals | Employee Data |
| 7 | Feature Engineering | E-commerce |
| 8 | EDA Workflow | Airbnb Listings |
| 9 | Correlation Analysis | Boston 311 Requests |
| 10 | Dimensionality Reduction (PCA) | Wine Quality |
| 11 | Data Cleaning Pipeline | Messy Sales Data |
| 12 | Time Series Basics | Stock Prices |
| 13 | Geospatial Data | NYC Taxi |
| 14 | Data Integration | Multi-source Merge |
| 15 | Capstone: Rain Prediction | Australian Weather |

Phase 2: Classical ML (Days 16-40)

| Day | Topic | Dataset |
|---|---|---|
| 16 | Linear Regression | California Housing |
| 17 | Polynomial & Regularization | California Housing |
| 18 | Logistic Regression | Bank Marketing |
| 19 | Decision Trees | Bank Marketing |
| 20 | Random Forest | Bank Marketing |
| 21 | Gradient Boosting (XGBoost) | Bank Marketing |
| 22 | Support Vector Machines | Bank Marketing |
| 23 | Cross-Validation Deep Dive | Various |
| 24 | K-Nearest Neighbors | Pima Diabetes |
| 25 | Naive Bayes | SMS Spam |
| 26 | K-Means Clustering | Mall Customers |
| 27 | DBSCAN | Mall Customers |
| 28 | Hierarchical Clustering | Mall Customers |
| 29 | Anomaly Detection | Credit Card Fraud |
| 30 | Gaussian Mixture Models | Synthetic |
| 31 | Model Showdown | Heart Disease |
| 32 | Hyperparameter Tuning | Heart Disease |
| 33 | Stacking & Voting Ensembles | Heart Disease |
| 34 | Feature Engineering Advanced | Heart Disease |
| 35 | Imbalanced Learning | Credit Card Fraud |
| 36 | SHAP Interpretability | German Credit |
| 37 | Pipelines Masterclass | Melbourne Housing |
| 38-40 | Capstone: Loan Default Prediction | Lending Club |

Phase 3: Deep Learning (Days 41-60, in progress)

| Day | Topic | Framework | Dataset |
|---|---|---|---|
| 41 | Neural Network Fundamentals | PyTorch | Moons, Dry Bean |
| 42 | Optimizers & Regularization | PyTorch | HTRU2 Pulsar Stars |
| 43 | CNNs | PyTorch | Intel Image |
| 44 | Transfer Learning | TensorFlow | Intel Image |
| 45 | CNN Architectures | PyTorch | Intel Image |
| 46 | Data Augmentation | TensorFlow | Intel Image |
| 47 | Object Detection | PyTorch | — |
| ... | ... | ... | ... |
Day 15 capstone (Rain Prediction): binary classification predicting next-day rain from Australian weather data.

Days 38-40 capstone (Loan Default Prediction): end-to-end ML project covering EDA, feature engineering, model tuning, SHAP interpretability, and a production pipeline.
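
To give a feel for what an end-to-end project of this kind involves, here is a minimal sketch of preprocessing and a classifier chained in one scikit-learn `Pipeline`. It is not the code from the capstone notebooks: the data is synthetic, the column names (`humidity`, `pressure`, `wind_dir`) are hypothetical, and logistic regression is used only as a simple stand-in model.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in data; column names are hypothetical, not from the notebooks.
df = pd.DataFrame({
    "humidity": rng.uniform(20, 100, n),
    "pressure": rng.normal(1015, 7, n),
    "wind_dir": rng.choice(["N", "E", "S", "W"], n),
})
# Binary target loosely driven by humidity, plus noise.
y = (df["humidity"] + rng.normal(0, 10, n) > 70).astype(int)
# Blank out some values afterwards to simulate missing readings.
df.loc[rng.choice(n, 30, replace=False), "humidity"] = np.nan

numeric = ["humidity", "pressure"]
categorical = ["wind_dir"]

# Impute and scale numeric columns; one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, stratify=y, random_state=0
)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.2f}")
```

Keeping the preprocessing inside the pipeline means the imputer and scaler are fitted on the training split only, which avoids leaking test-set statistics into the model.
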
- Languages: Python
- ML: scikit-learn, XGBoost, LightGBM
- Deep Learning: PyTorch, TensorFlow/Keras
- Data: pandas, NumPy
- Visualization: matplotlib, seaborn
- Interpretability: SHAP
All notebooks are designed for Google Colab with GPU support.
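
A quick way to confirm that a Colab runtime actually has a GPU attached is a device check like the one below. This is a small sketch, not taken from the notebooks; `pick_device` is a hypothetical helper name, and it falls back to the CPU when PyTorch is missing or no GPU is visible.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and sees a GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed in this environment
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```

If this prints `cpu` in Colab, the runtime type is probably set to CPU; it can be switched under Runtime > Change runtime type.
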