TTinsight

TTinsight is a project that leverages advanced machine learning models to predict, in real time, the probability of a player winning or losing a table tennis set by analyzing the score progression up to that point.

Unlike other studies that aim to predict the probability of winning the entire match based on historical statistics of the athletes, our goal is to monitor the progression of the ongoing set and calculate the real-time probability of victory using only the data related to the points scored up to that moment.


Key Features

  • Outcome Prediction: Real-time predictions for set results.
  • Sequence Analysis: Utilizes LSTMs to identify recurring score patterns.
  • Data Flexibility: Compatible with custom score progression datasets.
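To illustrate what a custom score progression might look like, here is a minimal sketch. It assumes a point-by-point encoding (1 when player A wins the point, 0 when player B does) and the standard table tennis rule that a set goes to 11 points with a two-point margin. The function names and the encoding are hypothetical and are not taken from the project's code.

```python
def running_scores(points: list[int]) -> list[tuple[int, int]]:
    """Turn a point-by-point sequence into the running (A, B) score.

    Assumed encoding: 1 = player A won the point, 0 = player B won it.
    """
    a = b = 0
    progression = []
    for point in points:
        if point == 1:
            a += 1
        else:
            b += 1
        progression.append((a, b))
    return progression


def is_valid_set_score(a: int, b: int, target: int = 11, margin: int = 2) -> bool:
    """Return True if (a, b) is a plausible final score of a completed set."""
    if a < 0 or b < 0:
        return False
    winner, loser = max(a, b), min(a, b)
    if winner < target:
        return False  # nobody reached the target score
    if winner == target:
        return winner - loser >= margin  # 11-9 is valid, 11-10 is not
    # Past the target, the set ends as soon as the lead equals the margin.
    return winner - loser == margin


if __name__ == "__main__":
    print(running_scores([1, 1, 0, 1]))
    print(is_valid_set_score(12, 10))
```

A check like this can be run on the last entry of each progression before a custom dataset is passed to the models.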

Repository Structure

The general structure of the TTinsight project is as follows:

.
├── data
│   ├── additional_data
│   │   ├── unavailable_score_tournaments
│   │   │   ├── ...
│   │   ├── players_metadata.tsv
│   │   └── tournaments_metadata.tsv
│   ├── datasets
│   │   ├── cleaned_dataset.csv
│   │   ├── raw_dataset.csv
│   │   └── single_matches_dataset.csv
│   ├── matches
│   │   ├── ...
│   ├── models
│   │   ├── LogReg.pkl
│   │   └── LSTM.keras
│   └── tournaments
│       ├── ...
├── data_preprocessing
│   ├── dataset-generator.py
│   └── downloader-scores.py
├── gui
│   ├── gui-test.py
│   └── TTinsight_GUI_explanation.png
├── README.md
├── requirements.txt
├── TTinsight_DataPreprocessing.ipynb
└── TTinsight_ModelDevelopment.ipynb

Summary of some files

  • data/
    • additional_data
      • players_metadata.tsv: Contains player IDs and nationalities. This data is not yet used in the project but may be added later.
      • tournaments_metadata.tsv: Holds metadata not relevant for the current analysis.
    • datasets
      • cleaned_dataset.csv: Used for model training and testing, derived from modifications made to the raw dataset. Contains relevant match data.
      • raw_dataset.csv: Initial dataset with inaccuracies and irrelevant features that need to be cleaned and adjusted.
    • models/
      • LogReg.pkl:

        Predicts match set outcomes (win/loss) based on scores and match dynamics.

        Training: Logistic Regression on standardized data.

        Evaluation: Cross-validation with Log Loss and Brier Score for performance check.

        Hyperparameter Tuning: Optimized using GridSearchCV with regularization and solver adjustments.

        Saving: Best model saved as a .pkl file for future predictions.

      • LSTM.keras:

        Captures temporal match dynamics, predicting outcomes from point sequences and global features.

        Data Preparation: Sequences padded to a fixed length and global features integrated.

        • LSTM Layers: Handle point sequences.
        • Dense Layer: Combines global features.
        • Dropout Layer: Prevents overfitting, followed by final Dense layer with sigmoid for binary classification.

        Training and Evaluation: 20 epochs, binary cross-entropy loss, evaluated with Log Loss and Brier Score.

        Testing: Detects comeback scenarios effectively, unlike Logistic Regression.

  • data_preprocessing/
    • dataset-generator.py

      Processes match logs, validates scores, and creates a cleaned dataset for analysis.

      • Validates scores ensuring minimum standards and valid score differences.
      • Transforms player scores into frequency sequences.
      • Reads match logs, validates scores, and prepares data for analysis.
      • Handles tournament data, associating match scores with details.
      • Processes all tournament files, generating a final CSV dataset with valid match data.
    • downloader-scores.py

      Scrapes match data asynchronously, gathering game details from a website and logging console output.

      • Interacts with the webpage, collects game data, and logs relevant details.
      • Processes event IDs from TSV files, extracting match data from the website.
      • Launches the scraper, processes files, and saves the logs for future use.
  • TTinsight/
    • TTinsight_DataPreprocessing.ipynb: Colab notebook showing steps for data cleaning and transformation.
    • TTinsight_ModelDevelopment.ipynb: Colab notebook for model training, evaluation, and fine-tuning.
  • README.md: Documentation with project overview, installation, and usage instructions.
  • requirements.txt: Lists dependencies required to run the project.
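
The model summaries above name Log Loss and Brier Score as evaluation metrics. As a reference for what those numbers mean, here is a minimal pure-Python sketch of both formulas. It is not the project's evaluation code, and the function names are illustrative.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


if __name__ == "__main__":
    outcomes = [1, 0, 1]             # 1 = the player won the set
    predictions = [0.9, 0.2, 0.6]    # predicted win probabilities
    print(log_loss(outcomes, predictions))
    print(brier_score(outcomes, predictions))
```

Lower is better for both metrics. A model that always predicts 0.5 scores ln 2 (about 0.693) on Log Loss and 0.25 on Brier Score, which gives a useful baseline when reading results.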

Visual Representation

Figure: A preview of the TTinsight GUI in action.


Prerequisites

To run this project, you need to have the following installed:

  • Python 3.11
  • pip for dependency management
  • A virtual environment (to manage dependencies in an isolated environment)
  • An IDE like PyCharm, Visual Studio Code, or any other preferred environment.

Steps to Run the Project

1. Clone or Copy the Project

Ensure that you have a copy of the TTinsight project on your machine. If you don’t have it yet, you can clone it from a Git repository or simply copy the project folder.

2. Install Python 3.11

Check that Python 3.11 is installed on your machine. Run the following command:

python --version
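
The same check can be done from inside Python, which is handy when several interpreters are installed. This is a small sketch; the helper name is illustrative.

```python
import sys

REQUIRED = (3, 11)  # major.minor version named in this README


def is_required_python(version_info=sys.version_info, required=REQUIRED) -> bool:
    """Return True when the major.minor part of version_info equals required."""
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if is_required_python() else "expected 3.11"
    print(f"Python {found}: {status}")
```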

If you don’t have Python 3.11, install it by following the official installation instructions for your operating system.

3. Create a Virtual Environment

After installing Python 3.11, create a virtual environment to isolate the project’s dependencies.

python3.11 -m venv venv

Activate the virtual environment based on your OS:

  • Windows:
    .\venv\Scripts\activate
  • macOS/Linux:
    source venv/bin/activate
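
To confirm that the activation worked, one option is to ask the interpreter itself. The sketch below assumes an environment created with the standard venv module, where sys.prefix differs from sys.base_prefix while the environment is active. The helper name is illustrative.

```python
import sys


def in_virtualenv(prefix=sys.prefix, base_prefix=sys.base_prefix) -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    return prefix != base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment active; activate it before installing.")
```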

4. Install the Dependencies

Ensure that the requirements.txt file is present in the project folder, then install all the required dependencies with the following command:

pip install -r requirements.txt

5. Set Up Your IDE

Depending on which IDE you’re using, follow these steps to configure it to use the Python 3.11 virtual environment.

Visual Studio Code (VS Code)

  1. Install VS Code from the official site: https://code.visualstudio.com/
  2. Open the project folder in VS Code: Go to File > Open Folder... and select your project folder.
  3. Create or Activate the Virtual Environment: If you haven’t done so already, activate the virtual environment:
    • On Windows:
      .\venv\Scripts\activate
    • On macOS/Linux:
      source venv/bin/activate
  4. Install the Dependencies (if you haven't done so):
     pip install -r requirements.txt
  5. Select the Python Interpreter: Open the Command Palette (Ctrl + Shift + P or Cmd + Shift + P on macOS), search for Python: Select Interpreter, and choose the interpreter for your virtual environment.
  6. Run the Project: You can now run the Python scripts directly in the VS Code terminal, or start the GUI.

PyCharm

  1. Install PyCharm from the official site: https://www.jetbrains.com/pycharm/
  2. Open the Project: Go to File > Open and select your project folder.
  3. Set Up the Python Interpreter:
    • Navigate to File > Settings (or PyCharm > Preferences on macOS).
    • Under Project: TTinsight, click on Python Interpreter.
    • Click the gear icon, then choose Add. Select Existing environment and point to your virtual environment’s interpreter:
      • Windows: .\venv\Scripts\python.exe
      • macOS/Linux: venv/bin/python
  4. Install the Dependencies: If you haven't already installed the dependencies, you can open the terminal within PyCharm and run:
     pip install -r requirements.txt
  5. Run the Project: Now, you can run your Python scripts or the GUI directly from PyCharm.

Common Issues

1. Incompatible Versions Error

If you encounter errors related to incompatible library versions, make sure you have Python 3.11 and the versions of libraries specified in the requirements.txt file. You can also try creating a new virtual environment and reinstalling the dependencies.
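
One way to spot mismatches is to compare installed versions against the pins in requirements.txt. The sketch below assumes the file uses plain `name==version` lines, which may not hold for this project; any other line format is skipped. The function names are illustrative.

```python
import re
from importlib import metadata
from pathlib import Path


def parse_pins(lines):
    """Extract {name: version} from lines of the form 'name==version'."""
    pins = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        match = re.fullmatch(r"([A-Za-z0-9._-]+)\s*==\s*([^\s;]+)", line)
        if match:
            pins[match.group(1)] = match.group(2)
    return pins


def find_mismatches(pins):
    """Return (name, wanted, installed) tuples; installed is None if missing."""
    problems = []
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append((name, wanted, None))
            continue
        if installed != wanted:
            problems.append((name, wanted, installed))
    return problems


if __name__ == "__main__":
    requirements = Path("requirements.txt")
    if requirements.exists():
        pins = parse_pins(requirements.read_text().splitlines())
        for name, wanted, installed in find_mismatches(pins):
            print(f"{name}: wanted {wanted}, found {installed or 'not installed'}")
```

Run it with the virtual environment activated, so that it inspects the packages the project will actually use.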

2. Permission Errors

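If you get permission-related errors (e.g., PermissionError) while installing libraries, first check that the virtual environment is activated, since installing into it normally does not require elevated privileges. If the error persists, try using sudo on macOS/Linux or running as an administrator on Windows.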

3. Demo Not Loading

If the demo does not show up, try restarting your IDE.


Contributing

If you would like to contribute to the TTinsight project, feel free to submit a pull request or open an issue on GitHub. All contributions are welcome!
