Nourah86/scikit-longitudinal

 
 


Scikit-longitudinal

A specialised Python library for longitudinal data analysis built on Scikit-learn


💡 About The Project

Scikit-longitudinal (Sklong) is a machine learning library designed to analyse longitudinal data (currently focussed on classification tasks). It offers tools and models for processing, analysing, and predicting longitudinal data, with a user-friendly interface that integrates with the Scikit-learn ecosystem.

For more details, visit the official documentation.


🛠️ Installation

Note

Want to use Jupyter Notebook, Marimo, Google Colab, or JupyterLab? Head to the Getting Started section of the documentation, where we explain it all! 🎉

To install Scikit-longitudinal:

  1. ✅ Install the latest version:

    pip install Scikit-longitudinal

    To install a specific version:

    pip install Scikit-longitudinal==0.1.0

Caution

Scikit-longitudinal is currently compatible with Python 3.9 only. Ensure you have this version installed before proceeding with the installation.

We understand that this is a limitation, but we are tied to it for the time being because of Deep Forest. Deep Forest is a dependency of Scikit-longitudinal that is not compatible with Python versions later than 3.9. We rely on it for the Deep Forest algorithm, which we have modified to support Lexicographical Deep Forest.

To follow up on this discussion, please refer to this GitHub issue.

If you encounter any errors, see the installation section in the Getting Started part of the documentation. If it still doesn't work, please open an issue on GitHub.
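
The Python 3.9 requirement above can be checked before installing. The following is a minimal sketch, not part of Scikit-longitudinal itself; the helper name `is_supported_python` and the constant `REQUIRED_VERSION` are hypothetical and introduced only for illustration.

```python
import sys

# Assumption taken from the caution above: only Python 3.9 is supported.
REQUIRED_VERSION = (3, 9)


def is_supported_python(version_info=sys.version_info):
    """Return True when the major and minor version match the required one."""
    return tuple(version_info[:2]) == REQUIRED_VERSION


# Report whether the running interpreter matches the requirement.
current = ".".join(str(part) for part in sys.version_info[:3])
if is_supported_python():
    print(f"Python {current} matches the required 3.9 series.")
else:
    print(f"Python {current} detected; a Python 3.9 environment is needed.")
```

Running this inside the environment where you plan to install the package tells you whether that interpreter satisfies the constraint.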


🚀 Getting Started

Here's how to analyse longitudinal data with Scikit-longitudinal:

from sklearn.metrics import classification_report

from scikit_longitudinal.data_preparation import LongitudinalDataset
from scikit_longitudinal.estimators.ensemble.lexicographical.lexico_gradient_boosting import LexicoGradientBoostingClassifier

# Load the data and split it into train and test sets
dataset = LongitudinalDataset('./stroke.csv')  # Note: this is a fictional dataset. Use yours!
dataset.load_data_target_train_test_split(
  target_column="class_stroke_wave_4",
)

# Pre-set or manually set your temporal dependencies
dataset.setup_features_group(input_data="elsa")

model = LexicoGradientBoostingClassifier(
  features_group=dataset.feature_groups(),
  threshold_gain=0.00015  # Refer to the API for more hyper-parameters and their meaning
)

# Train on the training split and predict on the held-out test split
model.fit(dataset.X_train, dataset.y_train)
y_pred = model.predict(dataset.X_test)

# Classification report
print(classification_report(dataset.y_test, y_pred))
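
The "temporal dependencies" set up in the example can be pictured as groups of column indices, one group per longitudinal attribute, ordered by wave. The sketch below illustrates that idea only: it assumes columns follow a `<name>_wave_<n>` naming pattern (as the target column `class_stroke_wave_4` suggests), and the helper `group_columns_by_wave` is hypothetical, not part of the Scikit-longitudinal API. The library's own `setup_features_group` may represent groups differently.

```python
import re
from collections import defaultdict

# Assumed naming pattern: "<attribute>_wave_<number>", e.g. "smoke_wave_2".
WAVE_PATTERN = re.compile(r"^(?P<name>.+)_wave_(?P<wave>\d+)$")


def group_columns_by_wave(columns):
    """Hypothetical helper: group column indices by attribute, ordered by wave."""
    groups = defaultdict(list)
    for index, column in enumerate(columns):
        match = WAVE_PATTERN.match(column)
        if match is None:
            continue  # Column does not follow the wave pattern; treat as non-longitudinal.
        groups[match.group("name")].append((int(match.group("wave")), index))
    # Sort each attribute's (wave, index) pairs by wave and keep only the indices.
    return [[index for _, index in sorted(pairs)] for pairs in groups.values()]


columns = ["age", "smoke_wave_1", "smoke_wave_2", "bmi_wave_2", "bmi_wave_1"]
print(group_columns_by_wave(columns))  # → [[1, 2], [4, 3]]
```

Here `age` is skipped because it has no wave suffix, and the `bmi` columns are reordered so that wave 1 comes before wave 2.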

📝 How to Cite

We are currently preparing a JOSS submission, so please bear with us! Meanwhile, click on Cite This Repository in the top right corner of this page to get a BibTeX entry.


🔐 License

Scikit-longitudinal is licensed under the MIT License.

About

☂️ Scikit-longitudinal (Sklong) is an open-source, Scikit-learn API-compliant Python library tailored to longitudinal machine learning classification tasks. It is ideal for researchers, data scientists, and analysts, as it provides specialist tools for dealing with the challenges of repeated-measures data.
