The capstone project, completed as part of the IBM Deep Learning Professional Certificate, focuses on building an advanced land classification system for agricultural applications using satellite imagery.
The project simulates an AI Engineer role at a fertilizer company, where the core objective is to develop and rigorously compare state-of-the-art deep learning models for accurately classifying terrain (e.g., crops, forests, water bodies). The entire deep learning pipeline was implemented, from custom geospatial data handling to comparative model analysis, showcasing expertise across leading deep learning frameworks.

Robust Deep Learning Model Development
- CNN Implementation: Developed and trained independent CNN models using both Keras and PyTorch to solve the land classification problem.
- Vision Transformer Integration: Designed and implemented a hybrid deep learning model by integrating features from pre-trained CNNs and Vision Transformers. The entire combined architecture was then fine-tuned to optimize performance for the agricultural land classification task.
- Comparative Analysis: Conducted a comprehensive comparative study of CNN and hybrid CNN-Vision Transformer performance across the two major frameworks.
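
To make the hybrid idea above concrete, here is a minimal PyTorch sketch in which a small CNN produces a feature map that is flattened into tokens and passed through a Transformer encoder. It is an illustration under stated assumptions, not the architecture used in the notebooks: the class name `HybridCNNViT`, the layer sizes, the 64x64 input resolution, and the number of classes are all hypothetical, and both parts are randomly initialized here, whereas the project fine-tunes pre-trained components.

```python
import torch
from torch import nn


class HybridCNNViT(nn.Module):
    """Illustrative hybrid: CNN feature map -> token sequence -> Transformer encoder."""

    def __init__(self, num_classes=3, image_size=64, embed_dim=64, num_heads=4, depth=2):
        super().__init__()
        # Three stride-2 convolutions shrink the input by 8x (64x64 -> 8x8).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, embed_dim, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        grid = image_size // 8  # assumes image_size is divisible by 8
        # Learnable class token and position embeddings, as in a ViT.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, 1 + grid * grid, embed_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim,
            nhead=num_heads,
            dim_feedforward=4 * embed_dim,
            batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):
        feats = self.cnn(x)                        # (B, C, H, W)
        tokens = feats.flatten(2).transpose(1, 2)  # (B, H*W, C)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos_embed
        encoded = self.encoder(tokens)
        return self.head(encoded[:, 0])            # classify from the class token


if __name__ == "__main__":
    model = HybridCNNViT(num_classes=3)
    logits = model(torch.randn(2, 3, 64, 64))
    print(logits.shape)
```

In a transfer-learning setup, the small convolutional stem would typically be swapped for a pre-trained backbone, with the optimizer then updating all parameters during fine-tuning.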

Full-Cycle Deep Learning Pipeline
- Data Handling: Implemented efficient techniques for geospatial image dataset loading and applied custom data augmentation strategies in both Keras and PyTorch.
- Model Evaluation: Rigorously evaluated all models using a suite of quantitative metrics, including F1-score and AU-ROC, to ensure robust and reliable performance for a real-world application.
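
For readers unfamiliar with the two metrics named above, the following dependency-free sketch shows how a macro-averaged F1-score and a binary AU-ROC can be computed from first principles. The function names and the toy labels are hypothetical, and the notebooks may well rely on library routines instead; the sketch only illustrates the definitions.

```python
def macro_f1(y_true, y_pred):
    """Average the per-class F1 scores, giving every class equal weight."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def binary_auc(y_true, y_score):
    """AU-ROC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AU-ROC needs at least one positive and one negative sample")
    # Count pairwise wins; ties contribute half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy labels, for illustration only.
    truth = ["crop", "crop", "forest", "water"]
    preds = ["crop", "forest", "forest", "water"]
    print(macro_f1(truth, preds))
    print(binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise formulation of AU-ROC is quadratic in the number of samples, so it suits small checks like this one rather than full evaluation runs.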
The research is organized into three distinct phases, each contained within its own subfolder featuring a localized README for specific implementation details:
| # | Phase | Technical Focus | Colab Access |
|---|---|---|---|
| 01 | Data Engineering | Memory-Based vs. Generator-Based Ingestion | Launch |
| 02 | Data Engineering | Scalable Augmentation Strategies (TensorFlow) | Launch |
| 03 | Data Engineering | Torchvision Pipeline & Tensor Transformations | Launch |
| 04 | CNN Development | Keras Convolutional Baseline & Optimization | Launch |
| 05 | CNN Development | PyTorch Implementation & State Dict Management | Launch |
| 06 | Analysis | Cross-Framework Performance Benchmarking | Launch |
| 07 | Hybrid Integration | Vision Transformers (ViT) in Keras | Launch |
| 08 | Hybrid Integration | Vision Transformers (ViT) in PyTorch | Launch |
| 09 | Final Study | Hybrid CNN-ViT Model Integration | Launch |
The project utilizes the EuroSAT-style Geospatial Dataset (Land Use and Land Cover Classification). The raw data is fetched automatically within the notebooks from the public IBM Cloud Object Storage:
- Source URL: images-dataSAT.tar
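
The automatic fetch described above could look roughly like the sketch below, which downloads a tar archive once and unpacks it. The function names and the `data` directory are hypothetical, and the download URL is deliberately left as a parameter because the link target is not reproduced in this README.

```python
import tarfile
import urllib.request
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Unpack a tar archive into dest_dir and return the destination path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path) as tar:
        if hasattr(tarfile, "data_filter"):
            # Safer extraction on Python versions that support extraction filters.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return dest


def fetch_and_extract(url, dest_dir, archive_name="images-dataSAT.tar"):
    """Download the archive if it is not cached locally, then unpack it."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive_path = dest / archive_name
    if not archive_path.exists():
        urllib.request.urlretrieve(url, str(archive_path))
    return extract_archive(archive_path, dest)


# Usage (the URL must be supplied by the caller):
# fetch_and_extract(DATASET_URL, "data")
```

Caching the archive on disk avoids downloading it again each time a notebook is rerun.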
Click the link in the Colab Access column of the Research Pipeline & Notebooks table.
Recommended for leveraging local GPU acceleration.
It is recommended to use an environment with Python 3.12.8:

    conda env create -f environment.yml
    conda activate vit-research
    pip install -r requirements.txt

Navigate to the notebooks/ directory and launch the modules via VS Code or Jupyter Lab.
- Deep Learning Frameworks: Keras (TensorFlow) and PyTorch.
- Model Architectures: Convolutional Neural Networks (CNNs), Vision Transformers (ViT), and a hybrid CNN-ViT model.
- Data Handling: Geospatial Image Data Loading (memory-based vs. generator-based), Data Augmentation, Preprocessing.
- Advanced Techniques: Transfer Learning (fine-tuning pre-trained models).
- Performance Evaluation: Accuracy, Precision, Recall, F1-score, AU-ROC, Confusion Matrix.
- Deliverables: Jupyter Notebooks (technical rigor)
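
The memory-based versus generator-based loading distinction listed above can be sketched without any framework. The example assumes a directory layout with one subfolder per class, which is an assumption rather than something this README states, and it reads raw bytes as a stand-in for image decoding; all function names are hypothetical.

```python
from pathlib import Path


def list_samples(root):
    """Collect (path, label) pairs, assuming one subfolder per class."""
    root = Path(root)
    return sorted((p, p.parent.name) for p in root.glob("*/*") if p.is_file())


def load_all(samples):
    """Memory-based: read every file up front, so memory grows with dataset size."""
    return [(path.read_bytes(), label) for path, label in samples]


def batch_generator(samples, batch_size):
    """Generator-based: read one batch at a time, holding only batch_size files."""
    for start in range(0, len(samples), batch_size):
        chunk = samples[start:start + batch_size]
        yield [(path.read_bytes(), label) for path, label in chunk]


# Usage, with a hypothetical dataset directory:
# samples = list_samples("data/images")
# for batch in batch_generator(samples, batch_size=32):
#     ...  # decode, augment, and feed the batch to the model
```

Loading everything into memory makes repeated epochs fast but limits the dataset to what fits in RAM, while the generator keeps memory use bounded at the cost of rereading files each epoch.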

Attributions & License

This project was developed as a Capstone for the IBM Deep Learning Professional Certificate. The core datasets and initial lab structures are provided by IBM Skills Network under their educational terms. All model implementations, hybrid architecture integration, and comparative analyses were performed by me as part of this study.
