The name Mojmelo is derived from "Mojo Machine Learning". The project implements Machine Learning algorithms from scratch in pure Mojo.
Here is the list of the algorithms:
- Linear Regression
- Polynomial Regression
- Logistic Regression
- KNN
- KMeans
- HDBSCAN
- DBSCAN
- SVM
- Naive Bayes
  - GaussianNB
  - MultinomialNB
- Decision Tree (Regression/Classification)
- Random Forest (Regression/Classification)
- GBDT (Regression/Classification)
- PCA
Preprocessing:
- normalize
- MinMaxScaler
- StandardScaler
- KFold
- GridSearchCV
- LabelEncoder
Documentation: https://yetalit.github.io/Mojmelo/docs/_index.html
If you are not familiar with Mojo projects, you can get started here: https://mojolang.org/docs/manual/get-started/
Prerequisites:
- mojo-compiler 1.0.0b1
Optionally, the Python packages below can be installed for better usability and to run the tests:
- NumPy
- pandas
- scikit-learn
- Matplotlib
There are three ways to install mojmelo: with the Pixi CLI, with pip, or from the source code.
Whichever method you choose, completing the setup process described for it is recommended.
Make sure the Modular community channel (https://repo.prefix.dev/modular-community) is listed in the channels section of your pixi.toml file, then add mojmelo:
pixi add mojmelo
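Checking that the channel is listed before adding the package can save a failed dependency resolve. The following is a minimal sketch, assuming pixi.toml sits in the current directory. The helper name `has_modular_channel` is hypothetical, and the check is a plain text search rather than a TOML parse:

```shell
# Hypothetical helper: succeeds if the given manifest mentions the
# Modular community channel. Plain text search, not a TOML parse.
has_modular_channel() {
    manifest="${1:-pixi.toml}"
    grep -qF "repo.prefix.dev/modular-community" "$manifest"
}

# Usage: report whether the channel still needs to be added.
if [ -f pixi.toml ]; then
    if has_modular_channel pixi.toml; then
        echo "Modular community channel found"
    else
        echo "Add https://repo.prefix.dev/modular-community to the channels list"
    fi
fi
```

Because this only searches for the URL text, a commented-out channel entry would also count as a match.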
To start the setup process, run the following command from the main folder of your project:
bash ./.pixi/envs/default/etc/conda/test-files/mojmelo/0/tests/setup.sh
Note: If the OS exposes CPU cache details, the benchmarking parts of the setup are skipped. Otherwise, avoid running other tasks on your PC during the process for better results.
The command below installs the PyPI package, which contains the source code, from the GitHub repository:
pip install "git+https://github.com/yetalit/Mojmelo.git#subdirectory=pypi"
Then start the setup process this way:
mojmelo-setup
Note: If the OS exposes CPU cache details, the benchmarking parts of the setup are skipped. Otherwise, avoid running other tasks on your PC during the process for better results.
Mojmelo can also be installed from the source code. This way, you will have the source code in your project.
First, download the mojmelo folder and the setup.mojo file. To start the setup process, run these commands from the directory where both are stored:
mojo build setup.mojo -o setup &&
./setup &&
./setup 1 &&
./setup 2 &&
./setup 3 &&
./setup 4 &&
./setup 5 &&
./setup 6 &&
./setup 7 &&
./setup 8 &&
./setup 9 &&
rm -f ./setup
Note: If the OS exposes CPU cache details, the benchmarking parts of the setup are skipped. Otherwise, avoid running other tasks on your PC during the process for better results.
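The setup invocations for the source install follow a regular pattern (one run without arguments, then one per step from 1 to 9), so they can also be driven by a loop. The following is a sketch, assuming a POSIX shell. The function name `run_setup_steps` is hypothetical and not part of mojmelo. It stops at the first failing step, as the `&&` chain does:

```shell
# Hypothetical helper: run a built setup binary once without arguments,
# then once for each step 1..9, stopping at the first failure
# (the same behavior as chaining the commands with &&).
run_setup_steps() {
    setup_bin="$1"
    "$setup_bin" || return 1
    for step in 1 2 3 4 5 6 7 8 9; do
        "$setup_bin" "$step" || return 1
    done
}

# Usage, from the directory holding the mojmelo folder and setup.mojo:
#   mojo build setup.mojo -o setup && run_setup_steps ./setup && rm -f ./setup
```

The loop only changes how the commands are typed; the steps and their order are the same as in the explicit chain.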
Importing models is straightforward:
from mojmelo.LinearRegression import LinearRegression
You may also want to use the utility code written for this project:
from mojmelo.utils.Matrix import Matrix
from mojmelo.utils.utils import *
Benchmarks:
KMeans
| Model | Fit Time (s) | ARI vs sklearn | ARI vs truth |
|---|---|---|---|
| sklearn KMeans | 0.2716 ± 0.0012 | - | 0.9389 |
| mojmelo KMeans | 0.1870 ± 0.0052 | 0.8821 | 0.9389 |
HDBSCAN (algorithm='boruvka_kdtree')
| Model | Fit Time (s) | ARI vs sklearn | ARI vs truth |
|---|---|---|---|
| skl-contrib HDBS | 1.1495 ± 0.0083 | - | 0.9997 |
| mojmelo HDBS | 0.3198 ± 0.0079 | 0.9930 | 0.9932 |
DBSCAN (algorithm='kd_tree')
| Model | Fit Time (s) | ARI vs sklearn | ARI vs truth |
|---|---|---|---|
| sklearn DBS | 1.1434 ± 0.0055 | - | 0.8566 |
| mojmelo DBS | 0.4028 ± 0.0038 | 0.9996 | 0.8566 |
KNN (algorithm='kd_tree')
| Model | Fit Time (s) | Predict Time (s) | Accuracy |
|---|---|---|---|
| sklearn KNN | 0.0353 ± 0.0005 | 1.7600 ± 0.0063 | 0.8543 |
| mojmelo KNN | 0.0149 ± 0.0006 | 0.2126 ± 0.0040 | 0.8347 |
SVM
| Model | Fit Time (s) | Predict Time (s) | Accuracy |
|---|---|---|---|
| sklearn SVM | 1.0595 ± 0.0010 | 0.3066 ± 0.0002 | 0.9798 |
| mojmelo SVM | 0.8733 ± 0.0129 | 0.0603 ± 0.0032 | 0.9797 |
Decision Tree (Classification)
| Model | Fit Time (s) | Predict Time (s) | Accuracy |
|---|---|---|---|
| sklearn DTC | 0.9051 ± 0.0008 | 0.0004 ± 0.0000 | 0.9300 |
| mojmelo DTC | 0.0749 ± 0.0028 | 0.0002 ± 0.0000 | 0.9328 |
Decision Tree (Regression)
| Model | Fit Time (s) | Predict Time (s) | MSE |
|---|---|---|---|
| sklearn DTR | 0.6466 ± 0.0006 | 0.0005 ± 0.0000 | 8247.9358 |
| mojmelo DTR | 0.0795 ± 0.0049 | 0.0003 ± 0.0000 | 8192.1982 |
Random Forest (Classification)
| Model | Fit Time (s) | Predict Time (s) | Accuracy |
|---|---|---|---|
| sklearn RFC | 0.4707 ± 0.0064 | 0.0140 ± 0.0003 | 0.9182 |
| mojmelo RFC | 0.4534 ± 0.0094 | 0.0040 ± 0.0000 | 0.9174 |
Random Forest (Regression)
| Model | Fit Time (s) | Predict Time (s) | MSE |
|---|---|---|---|
| sklearn RFR | 2.0257 ± 0.0050 | 0.0134 ± 0.0004 | 8454.5517 |
| mojmelo RFR | 1.2247 ± 0.0094 | 0.0067 ± 0.0002 | 9155.6895 |
PCA (svd_solver='full')
| Model | Fit Time (s) | Transform Time (s) | Explained Var |
|---|---|---|---|
| sklearn PCA | 0.2070 ± 0.0025 | 0.0061 ± 0.0000 | 0.5363 |
| mojmelo PCA | 0.0737 ± 0.0003 | 0.0270 ± 0.0015 | 0.5363 |
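The tables report raw times, so relative speed has to be worked out by dividing the sklearn mean by the mojmelo mean. The following is a small sketch using awk. The helper name `speedup` is hypothetical, and the calculation uses the mean times only, ignoring the ± spread:

```shell
# Hypothetical helper: ratio of a baseline mean time to a new mean time,
# printed with two decimals. Values above 1 mean the second time is faster.
speedup() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", base / new }'
}

speedup 0.2716 0.1870   # KMeans fit time, sklearn vs mojmelo
speedup 1.7600 0.2126   # KNN predict time, sklearn vs mojmelo
```

With the figures from the tables, this prints 1.45 for the KMeans fit and 8.28 for the KNN predict step. A ratio below 1, as for the PCA transform time, means the sklearn time is the smaller one.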
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
You can contribute to the project in three ways:
- Applying improvements to the code and opening a Pull Request
- Reporting a bug
- Suggesting new features
- Mojo usage and distribution are licensed under the Modular Community License.
- Libsvm, A Library for Support Vector Machines by Chih-Chung Chang and Chih-Jen Lin, licensed under the BSD-3-Clause license.
- The HDBSCAN implementation is partially based on hdbscan by Leland McInnes, John Healy and Steve Astels, licensed under the BSD-3-Clause license, and on Fast Multicore HDBSCAN by Tutte Institute for Mathematics and Computing, licensed under the BSD-2-Clause license.
- The matmul implementation is based on matmul.mojo by Ethan Wu (YichengDWu), licensed under the Apache-2.0 license.
- The argmin, argmax and argsort implementations are based on code from Modular, licensed under the Apache License v2.0 with LLVM Exceptions.
- KDTREE2, a kd-tree implementation in Fortran 95 and C++ by Matthew B. Kennel.
- Initially drew inspiration from Patrick Loeber's MLfromscratch.