
fruity - A project for the course 02476 DTU-MLOps

Group 14 - Members

  • Anders Henriksen, s183904
  • August Semrau Andersen, s183918
  • Karl Byberg Ulbæk, s183931
  • William Diedrichsen Marstrand, s183921

Project Description

Project Goal

The primary objective of this project is to develop an accurate and reliable machine learning model for classifying images of a diverse array of dishes.

Frontend DEMO

The functionality of this project can be demonstrated using the Streamlit frontend. To run the frontend, use the following command from the root directory of the project:

streamlit run app/frontend/streamlit_frontend.py

Third-party Framework - TIMM

Our project uses the timm (PyTorch Image Models) framework, a rich repository of pre-trained deep neural network models and training scripts for PyTorch. timm's comprehensive selection of models, tailored for image data, presents an ideal foundation for our classification objectives. We plan to use timm in several key areas:

  • Access to pre-trained models: timm offers a wide range of pre-trained models, which serve as a robust starting point. These models, already trained on large and diverse datasets, can significantly improve learning efficiency and accuracy.
  • Model experimentation and selection: Given timm's extensive collection of image model architectures, we can easily experiment with different models to find the most suitable one for our project.
  • Customization and fine-tuning: timm not only offers pre-trained models but also allows customization. We can adjust layers, training parameters, and other aspects of the model to better suit our dataset and classification goals.

Dataset

Our dataset is the Food101 dataset.

Dataset properties

  • Total number of images: 101,000 (one dish per image).
  • Number of classes: 101, with 1,000 images per class.
  • Training set size: 75,000 images.
  • Validation set size: 13,000 images.
  • Test set size: 13,000 images.
  • Image size: 100x100 pixels.
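A quick sanity check that the split sizes above add up to the full dataset (all numbers are taken from the list above):

```python
# Split sizes as stated in the dataset properties.
TRAIN, VAL, TEST = 75_000, 13_000, 13_000
NUM_CLASSES = 101

total = TRAIN + VAL + TEST
per_class = total // NUM_CLASSES

print(total)      # 101000
print(per_class)  # 1000
```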

Models

We expect to investigate various architectures including:

  • ResNet Variants: Utilizing their deep architecture for training deeper networks.
  • EfficientNet Models: Chosen for their balance of accuracy and efficiency.
  • Vision Transformer (ViT): Leveraging transformer architecture as a CNN alternative for image classification.

Project Structure

The directory structure of the project looks like this:

├── Makefile                    <- Makefile with convenience commands like `make data` or `make train`
├── README.md                   <- The top-level README for developers using this project.
├── conf                        <- Hydra configuration folder
├── data
│   ├── processed               <- The final, canonical data sets for modeling.
│   └── raw                     <- The original, immutable data dump.
│
├── docs                        <- Documentation folder
│   │
│   ├── index.md                <- Homepage for your documentation
│   │
│   ├── mkdocs.yml              <- Configuration file for mkdocs
│   │
│   └── source/                 <- Source directory for documentation files
│
├── models                      <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks                   <- Jupyter notebooks.
│
├── pyproject.toml              <- Project configuration file
│
├── reports                     <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures                 <- Generated graphics and figures to be used in reporting
│
├── requirements.txt            <- The requirements file for reproducing the analysis environment
│
├── requirements_dev.txt        <- The requirements file for reproducing the development environment
│
├── tests                       <- Test files
│
├── src                         <- Source code for use in this project.
│   └── fruity
│       │
│       ├── __init__.py         <- Makes folder a Python module
│       │
│       ├── data                <- Scripts to download or generate data
│       │   ├── __init__.py
│       │   └── make_dataset.py
│       │
│       ├── models              <- Model implementations
│       │   ├── __init__.py
│       │   └── model.py
│       │
│       ├── visualization       <- Scripts to create exploratory and results oriented visualizations
│       │   ├── __init__.py
│       │   └── visualize.py
│       ├── train.py            <- script for training the model
│       └── predict_model.py    <- script for predicting from a model
│
└── LICENSE                     <- MIT Open-source license

Hydra Configuration Structure

Description of the Hydra config folder structure:

├── config.yaml <- The default settings for your application.
├── dataset     <- Configuration files related to data handling, like paths to datasets, data preprocessing parameters.
├── env         <- Environment-specific settings, such as paths and system configurations that might differ from one machine to another.
├── experiment  <- Used for specific experimental setups.
├── logging     <- Configurations for logging, including log levels, file paths for saving logs, formats, etc.
├── model       <- Model-specific configurations, such as model architecture details, hyperparameters specific to the model, and checkpointing information.
├── optimizer   <- Settings for the optimizer used in training the model. This can include the type of optimizer (e.g., Adam, SGD), learning rate, weight decay, etc.
└── scheduler   <- Scheduler configurations, such as step size, gamma value for learning rate decay, and other scheduler-specific parameters.
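An illustrative `config.yaml` composing these groups might look like the sketch below (the option names such as `food101`, `resnet`, and `adam` are hypothetical; the actual defaults live in the `conf` folder):

```yaml
defaults:
  - dataset: food101   # paths and preprocessing parameters
  - env: local         # machine-specific paths and settings
  - logging: default   # log level, output paths, formats
  - model: resnet      # architecture and model hyperparameters
  - optimizer: adam    # optimizer type, learning rate, weight decay
  - scheduler: step    # step size, gamma, other decay parameters
```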

Tests

  • Run the tests from the root directory of the project:

    coverage run --source=fruity -m pytest tests/

  • Generate a coverage report:

    coverage report -m

Serving API Locally

To avoid misuse of the Google Cloud resources, a credentials file is required to run the API Docker image (even locally). Once a credentials .json file has been obtained, rename it to fuity-api-credentials.json and place it in ~/.config/gcloud. Then run:

make run_local_api_image
