A collection of machine learning projects, experiments, and labs built on Azure Machine Learning. The repository covers end-to-end ML workflows: data preprocessing, model training, hyperparameter tuning, batch scoring, and deployment.
It contains my own work with Azure ML and demonstrates practical implementations of:
- Automated ML pipelines
- Hyperparameter tuning (HyperDrive)
- Batch and real-time scoring
- MLflow tracking
- Responsible AI dashboards
- Model deployment to managed endpoints
Azure-ML-labs/
├── Deployments/                        # Deployment configurations and scripts
├── Mlflow/                             # MLflow tracking experiments
├── Pipelines/                          # Azure ML pipeline definitions
├── diabetes-data/                      # Diabetes dataset and preprocessing
├── finalgolddata/                      # Gold price prediction data
├── logs and AI dash/                   # Logs and Responsible AI dashboards
├── Bank Customer Churn Prediction.csv
├── batch_score.py                      # Batch scoring script
├── conda_env_v_1_0_0.yml               # Conda environment configuration
├── hyperdrive.txt                      # HyperDrive tuning configuration
├── model.pkl                           # Trained model pickle file
├── online_score.py                     # Real-time scoring script
├── prep.py                             # Data preprocessing
├── preprocess.py                       # Feature engineering
├── score.py                            # Scoring logic
├── scoring_file_v_2_0_0.py
├── sweep_train.py                      # Hyperparameter sweep training
├── train.py                            # Main training script
├── train_mlflow.py                     # Training with MLflow tracking
├── train_pipeline.py                   # Pipeline training script
└── ... (logs and artifacts)
Gold Price Prediction
- Model: RandomForestRegressor (n_estimators=150)
- R² Score: 0.9994
- MAE: $10.82
- Features: Open, High, Low, Close, Volume, Price_Range, Daily_Return, MA_5, MA_20
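
The derived features in the list above can be sketched in a few lines. This is a minimal, dependency-free illustration, not the code in `preprocess.py`: it assumes `Price_Range` is High minus Low, `Daily_Return` is the fractional change in Close from the previous row, and `MA_5` / `MA_20` are trailing means of Close. The function name `add_features` and the sample rows are hypothetical.

```python
from statistics import mean


def add_features(rows):
    """Return copies of `rows` with Price_Range, Daily_Return, MA_5 and MA_20 added.

    `rows` is a list of dicts ordered by date, each holding High, Low and Close.
    """
    closes = [r["Close"] for r in rows]
    out = []
    for i, row in enumerate(rows):
        new = dict(row)
        # Assumed definition: intraday range.
        new["Price_Range"] = row["High"] - row["Low"]
        # Assumed definition: fractional change in Close versus the previous row.
        new["Daily_Return"] = None if i == 0 else closes[i] / closes[i - 1] - 1.0
        # Trailing moving averages; None until a full window is available.
        for window in (5, 20):
            start = i + 1 - window
            new[f"MA_{window}"] = mean(closes[start:i + 1]) if start >= 0 else None
        out.append(new)
    return out


# Synthetic rows for illustration only (not taken from finalgolddata/).
rows = [{"High": 101.0 + d, "Low": 99.0 + d, "Close": 100.0 + d} for d in range(6)]
features = add_features(rows)
```

Rows that lack a full window keep `None` for the moving averages, so they would need to be dropped or imputed before training.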
Bank Customer Churn Prediction
- Model: RandomForestClassifier
- Features: Customer demographics, account data, transaction history
- Pipeline: End-to-end preprocessing + training
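
The preprocessing half of such a pipeline might look roughly like the sketch below. It is dependency-free and only illustrative: the column names (`country`, `age`, `balance`, `churn`) and the helper names are assumptions, not taken from the repository's scripts, which more likely rely on pandas and scikit-learn.

```python
def one_hot(rows, column):
    """Replace a categorical column with 0/1 indicator columns."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new[f"{column}_{c}"] = 1 if r[column] == c else 0
        encoded.append(new)
    return encoded


def split_features_target(rows, target):
    """Separate the label column from the feature columns."""
    X = [{k: v for k, v in r.items() if k != target} for r in rows]
    y = [r[target] for r in rows]
    return X, y


# Hypothetical customer records for illustration only.
customers = [
    {"country": "France", "age": 42, "balance": 0.0, "churn": 1},
    {"country": "Spain", "age": 35, "balance": 1500.0, "churn": 0},
]
X, y = split_features_target(one_hot(customers, "country"), "churn")
```

The resulting `X` and `y` are what a classifier such as RandomForestClassifier would be trained on.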
Diabetes Progression Prediction
- Goal: Predict diabetes progression using medical indicators
- Approach: Linear regression with feature scaling and hyperparameter tuning
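
To make the approach concrete, here is a minimal single-feature sketch of feature scaling followed by ordinary least squares, plus the R² and MAE metrics quoted elsewhere in this README. It is not the repository's training code: the feature name `bmi`, the synthetic values, and all function names are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def standardize(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def fit_line(x, y):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Synthetic, perfectly linear data for illustration only.
bmi = [20.0, 25.0, 30.0, 35.0]
progression = [90.0, 120.0, 150.0, 180.0]

x = standardize(bmi)
slope, intercept = fit_line(x, progression)
pred = [slope * v + intercept for v in x]
r2 = r2_score(progression, pred)
mae = mean_absolute_error(progression, pred)
```

Because the scaled feature has zero mean, the fitted intercept equals the mean of the target, which makes the coefficients easier to compare across features.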
| Category | Tools |
|---|---|
| Cloud Platform | Azure Machine Learning |
| Languages | Python 3.10 |
| ML Libraries | scikit-learn, pandas, numpy |
| Tracking | MLflow |
| Deployment | Azure Container Instances, Managed Endpoints |
| Environment | Conda, Docker |
- Azure subscription
- Azure ML workspace
- Python 3.10+
# Clone repository
git clone https://github.com/Tee808-bigD/Azure-ML-labs.git
cd Azure-ML-labs
# Create conda environment
conda env create -f conda_env_v_1_0_0.yml
conda activate azure_ml_env
# Configure Azure CLI
az login
az account set --subscription "your-subscription-id"
# Set the default resource group and workspace for later `az ml` commands
az configure --defaults group="your-rg" workspace="your-workspace"
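
For Python scripts, the same workspace details are commonly kept in a `config.json` file, which the Azure ML Python SDK can read (for example through `MLClient.from_config` in the `azure-ai-ml` package). The sketch below only writes that file using the standard library; the helper name `write_workspace_config` is hypothetical and the values are the same placeholders used above.

```python
import json
import tempfile
from pathlib import Path


def write_workspace_config(directory, subscription_id, resource_group, workspace_name):
    """Write a config.json holding the three workspace identifiers."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    path = Path(directory) / "config.json"
    path.write_text(json.dumps(config, indent=2))
    return path


# Written to a temporary directory here; in practice the file would sit in the
# project root (or a .azureml/ subfolder) next to the training scripts.
with tempfile.TemporaryDirectory() as tmp:
    path = write_workspace_config(tmp, "your-subscription-id", "your-rg", "your-workspace")
    loaded = json.loads(path.read_text())
```

Keep real subscription IDs out of version control, for instance by listing `config.json` in `.gitignore`.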
Run Training
# Train gold price predictor
python train.py --config configs/gold_config.yaml
# Train churn classifier with MLflow
python train_mlflow.py --data_path ./Bank\ Customer\ Churn\ Prediction.csv
# Run hyperparameter sweep
python sweep_train.py
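
To show what a sweep does conceptually, here is a minimal random-search sketch. It deliberately swaps in random sampling for whatever sampler `sweep_train.py` and `hyperdrive.txt` actually configure, and it does not call Azure ML at all. The search space, the `toy_score` objective (which peaks at an arbitrary point), and all function names are assumptions for illustration.

```python
import random

# Hypothetical search space; the real one lives in the sweep configuration.
SEARCH_SPACE = {
    "n_estimators": [50, 100, 150, 200],
    "max_depth": [4, 8, 16],
}


def sample_config(rng):
    """Draw one value per hyperparameter, uniformly at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def toy_score(config):
    """Stand-in for a validation metric; a real sweep would train and evaluate a model."""
    return -abs(config["n_estimators"] - 150) - abs(config["max_depth"] - 8)


def random_search(score_fn, n_trials, seed=0):
    """Evaluate n_trials random configurations and return the best one."""
    rng = random.Random(seed)
    trials = [sample_config(rng) for _ in range(n_trials)]
    return max(trials, key=score_fn)


best = random_search(toy_score, n_trials=20)
```

In a managed sweep each trial runs as a separate job and reports its metric back to the service, which then picks the best configuration in the same way `max` does here.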
Scoring & Deployment
# Batch scoring
python batch_score.py --input ./data/test_data.csv
# Real-time scoring endpoint
python online_score.py
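
Scoring scripts for Azure ML online endpoints conventionally expose two functions: `init()`, called once when the container starts, and `run(raw_data)`, called for each request. The sketch below follows that shape but is not the content of `online_score.py`: `MeanModel` is a hypothetical stand-in for the estimator stored in `model.pkl`, and the `"data"` key in the request body is an assumed payload format.

```python
import json

model = None


class MeanModel:
    """Hypothetical stand-in for the estimator stored in model.pkl."""

    def predict(self, rows):
        return [sum(row) / len(row) for row in rows]


def init():
    """Called once at startup; load the model into a module-level variable."""
    global model
    # A real script would load the registered model here instead, typically
    # from the directory named by the AZUREML_MODEL_DIR environment variable.
    model = MeanModel()


def run(raw_data):
    """Called per request with the JSON request body as a string."""
    try:
        rows = json.loads(raw_data)["data"]
        return {"predictions": model.predict(rows)}
    except Exception as exc:
        # Return the error to the caller rather than crashing the worker.
        return {"error": str(exc)}


# Local smoke test, no endpoint required.
init()
response = run(json.dumps({"data": [[1.0, 2.0, 3.0], [4.0, 6.0]]}))
```

Calling `init()` and `run()` locally like this is a cheap way to catch payload and serialization errors before deploying.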
📊 Key Learnings & Experiments
| Experiment | What I Learned |
|---|---|
| Automated ML | How to let Azure AutoML find the best model and pipeline |
| HyperDrive | Tuning hyperparameters efficiently using Bayesian sampling |
| MLflow Tracking | Logging metrics, parameters, and models for experiment comparison |
| Responsible AI | Building fairness, explainability, and error analysis dashboards |
| Batch vs Online Scoring | Trade-offs between latency, cost, and throughput |
| Pipeline Reusability | Building ML pipelines from reusable components |
🔮 Future Work
- Add more datasets (fraud detection, time series forecasting)
- Implement CI/CD for model retraining and deployment
- Create interactive dashboards with Azure Managed Grafana
- Add LLMOps experiments with Azure AI Foundry
🤝 Contributing
Feel free to fork this repository and submit pull requests. For major changes, please open an issue first.
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
📧 Contact
Thando Mzobe
- GitHub: @Tee808-bigD
- LinkedIn: Thando Mzobe
- Email: thandomzobe9@gmail.com
🙏 Acknowledgments
- Microsoft Learn for Azure ML documentation and training
- Azure ML community for best practices
Built with ☁️ on Azure Machine Learning