This repository contains the Python code used for data preparation in the STAY project. The README file explains all the steps required to reproduce the work, even in other contexts. For more details, please visit the link below.

ITSAIDI/STAY_DEV


This repository contains the source code for the Pre-Visualisation stages of the STAY project.

Prerequisites

  • uv installed

Install uv with pip:

pip install uv

For other installation options, see the uv installation documentation.

Setup

  1. Clone the repository:
git clone https://github.com/ITSAIDI/STAY_DEV
cd STAY_DEV
  2. Create a virtual environment with uv:
uv venv --python 3.10.16
  • This pins the Python version used by the project.
  • uv commands such as uv sync and uv run use the project's .venv automatically, so no manual activation is needed for them.
  3. Install the dependencies:
uv sync

You can now work with any file in the repository.

Collecting

  1. First, generate a YouTube API key.
  2. Create a .env file at the root of the cloned repository and add your key to it as YOUTUBE_API_KEY.
  3. Open main.ipynb in the collecting folder and run the cells. A queries.json file is already provided there.
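
As an illustration of step 2, here is a minimal sketch of how a notebook cell could read YOUTUBE_API_KEY from the .env file using only the standard library. The helper names load_env_file and get_youtube_api_key are hypothetical, and the notebook itself may load the key differently (for example with a dedicated dotenv package); this is an assumption, not the repository's actual code.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env file (hypothetical helper)."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_youtube_api_key(env_path: str = ".env") -> str:
    """Return YOUTUBE_API_KEY from the environment, falling back to the .env file."""
    key = os.environ.get("YOUTUBE_API_KEY") or load_env_file(env_path).get("YOUTUBE_API_KEY")
    if not key:
        raise RuntimeError("YOUTUBE_API_KEY is not set; add it to .env at the repository root")
    return key
```

Failing early with a clear error when the key is missing avoids confusing API errors later in the notebook.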

Filtering

Videos

  1. Filtering uses the free version of gemini-flash, so you first need to generate a GEMINI_API_KEY.
  2. Add the generated key as an environment variable.
  3. Open main.ipynb in the filtering/videos folder. There are three levels of filtering, and each one generates a JSON file with the result of the applied filters. The Refinements step is required to prepare the data for filtering.

Output: a videosF3.json file containing all relevant videos.
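
The idea of three successive filtering levels, each saving its own JSON file, can be sketched as follows. This is only an illustration under stated assumptions: the function run_filter_levels is hypothetical, the real filters call gemini-flash while this sketch takes plain Python predicates instead, and the intermediate file names videosF1.json and videosF2.json are guesses (only videosF3.json is named in this README).

```python
import json
from pathlib import Path
from typing import Callable, Iterable

Video = dict
Predicate = Callable[[Video], bool]


def run_filter_levels(
    videos: Iterable[Video],
    levels: list[Predicate],
    out_dir: str,
) -> list[Video]:
    """Apply each level to the survivors of the previous one and save a JSON file per level."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    current = list(videos)
    for level, keep in enumerate(levels, start=1):
        # Keep only the videos accepted by this level's predicate.
        current = [video for video in current if keep(video)]
        # Assumed naming scheme: videosF1.json, videosF2.json, videosF3.json.
        target = out / f"videosF{level}.json"
        target.write_text(json.dumps(current, ensure_ascii=False, indent=2), encoding="utf-8")
    return current
```

Saving the result of every level makes it possible to inspect what each filter removed, or to resume from an intermediate file without repeating earlier model calls.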

Channels

The filtering process for channels is in main.ipynb in the filtering/channels folder.

Output: a channelsF3.json file for relevant channels and a channelsF3Non.json file for irrelevant ones.
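
A minimal sketch of how the two output files could be produced from one pass over the channels. The function split_channels and the is_relevant callback are hypothetical stand-ins for whatever decision logic the notebook actually applies.

```python
import json
from pathlib import Path
from typing import Callable


def split_channels(
    channels: list[dict],
    is_relevant: Callable[[dict], bool],
    out_dir: str = ".",
) -> tuple[list[dict], list[dict]]:
    """Partition channels into relevant and irrelevant ones and save both lists."""
    relevant: list[dict] = []
    irrelevant: list[dict] = []
    for channel in channels:
        # Evaluate the (possibly expensive) relevance check once per channel.
        (relevant if is_relevant(channel) else irrelevant).append(channel)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, items in (("channelsF3.json", relevant), ("channelsF3Non.json", irrelevant)):
        (out / name).write_text(json.dumps(items, ensure_ascii=False, indent=2), encoding="utf-8")
    return relevant, irrelevant
```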

Local DataBase Updating

To update the Postgres database with new relevant videos and channels:

  1. Set your POSTGRE_PASSWORD as an environment variable. You also need the outputs of the filtering step.
  2. Open main.ipynb in the DataBase folder and follow the steps.

To update only the metrics of existing videos and channels:

  1. Open the project folder in your code editor (VSCode, for example).
  2. Ensure that your virtual environment is activated.
  3. Open a new PowerShell terminal, change to the Database folder, and run the Python script updateMetrics.py like this:
   uv run updateMetrics.py YOUR_POSTGRE_PASSWORD YOUR_YOUTUBE_API_KEY

The script connects to your database, then calls two Python functions: one for channel metrics and the other for video metrics.
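
The flow described above can be sketched roughly as follows. This is not the content of updateMetrics.py: the function names update_channels_metrics and update_videos_metrics are hypothetical placeholders, and the connect parameter stands in for whatever database driver call the real script uses.

```python
import argparse
from typing import Callable, Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Read the two positional arguments shown in the command above."""
    parser = argparse.ArgumentParser(description="Update metrics of stored channels and videos.")
    parser.add_argument("postgres_password", help="password of the Postgres database")
    parser.add_argument("youtube_api_key", help="YouTube API key")
    return parser.parse_args(argv)


def update_channels_metrics(connection: object, api_key: str) -> None:
    """Hypothetical placeholder: refresh the metrics of stored channels."""


def update_videos_metrics(connection: object, api_key: str) -> None:
    """Hypothetical placeholder: refresh the metrics of stored videos."""


def main(
    argv: Optional[Sequence[str]] = None,
    connect: Callable[[str], object] = lambda password: None,
) -> None:
    """Connect once, then update channel metrics followed by video metrics."""
    args = parse_args(argv)
    # `connect` is an assumed stand-in for the real database connection call.
    connection = connect(args.postgres_password)
    update_channels_metrics(connection, args.youtube_api_key)
    update_videos_metrics(connection, args.youtube_api_key)
```

Calling main(["YOUR_POSTGRE_PASSWORD", "YOUR_YOUTUBE_API_KEY"]) mirrors the command-line invocation, with both values taken as positional arguments in that order.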

Server DataBase Updating

The server already contains updateMetrics.py. After connecting to the server:

  1. Activate the virtual environment:
    source venv/bin/activate
  2. Run the following command:
   python3 updateMetrics.py YOUR_POSTGRE_PASSWORD YOUR_YOUTUBE_API_KEY
