- Download the data sample from: http://repo.pi.ingv.it/instance/Instance_sample_dataset_v2.tar.bz2
- Create an environment variable file called `.env` at the root of the project. This sets the data and output paths. Change the variable values to your actual data and output locations. The folders and files must exist before you run the code.
- EVENT_HDF5_FILE="data/instance_samples/Instance_events_counts_10k.hdf5"
- EVENT_METADATA_FILE="data/instance_samples/metadata_Instance_events_10k.csv"
- NOISE_HDF5_FILE="data/instance_samples/Instance_noise_1k.hdf5"
- NOISE_METADATA_FILE="data/instance_samples/metadata_Instance_noise_1k.csv"
- FINAL_OUTPUT_DIR="output"
- TEMP_DIR="temp"
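The project presumably loads these variables with a helper such as python-dotenv; as a minimal stdlib-only sketch of how a `.env` file like the one above can be parsed (the `load_env` helper is hypothetical, not part of the repo):

```python
import os

def load_env(path=".env"):
    """Parse simple KEY="value" lines from a .env file into os.environ.
    Blank lines, comments, and lines without '=' are skipped."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Strip optional surrounding double quotes from the value.
            os.environ[key.strip()] = value.strip().strip('"')
```

If the repo already calls `load_dotenv()` from python-dotenv, prefer that over rolling your own parser.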
- Create a `venv` environment using Python 3.11 and install the packages from the `requirements.txt` found at the root of the project. Use this environment to run the code.
- Run `python train_my_eq.py` from the root of the project
- Check progress in the terminal. At the end, the results will be written to the `output` folder mentioned in the `.env` file.
- Run `python train_my_cnn1.py` or `python train_my_cnn2.py`
- Check progress in the terminal. At the end, the results will be written to the `output` folder mentioned in the `.env` file.
- Copy one of the files (e.g. `train_my_eq.py`) and edit it to change the model and hyperparameters
- Train your model by running `python train_my_own_model.py`
Instance data have been downloaded and are available here: `~/projects/def-sponsor00/earthquake/data/instance`
STEAD data are not yet downloaded but can be added here: `~/projects/def-sponsor00/earthquake/data/stead`
To download the STEAD files in parallel:
- put all the URLs in a file (`files.txt`)
- then run:
  `cat files.txt | xargs -n 1 -P 0 wget -q`
- `-P 0` lets xargs choose the number of parallel workers. You can set a fixed number instead if you want
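The same parallel download can be sketched in Python with only the standard library (a hypothetical helper, not part of the repo); `urlretrieve` also accepts `file://` URLs, which makes dry-running easy:

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from urllib.request import urlretrieve

def download_all(urls, dest_dir, workers=8):
    """Fetch every URL into dest_dir, `workers` downloads at a time.
    Returns the list of downloaded file paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    def fetch(url):
        # Name the local file after the last path component of the URL.
        target = dest / url.rsplit("/", 1)[-1]
        urlretrieve(url, target)
        return target

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, urls))
```

Like `-P 0` above, you can tune `workers` to match your bandwidth and the server's tolerance.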
- Use `tmux` to run your session. Once connected to the server:
  - Type `tmux` to start a new session, or `tmux attach` to recover an old session. I suggest using it, as tmux keeps your terminal session running even if you lose the connection; otherwise you might need to start from scratch
  - Type `ctrl-b + %` to split your screen and `ctrl-b + <arrow>` to navigate through the panes. I use this to run multiple terminals simultaneously, as one might be blocked by a long-running task
  - Use `ctrl-b + z` to toggle a pane between full screen and normal
  - Type `exit` to close a tmux pane
- Using the terminal, log in to your cluster, preferably through ssh (e.g. `ssh username@ift6759.calculquebec.cloud`)
- Create a folder where you will clone your repo: `mkdir documents`
- Get into the folder and clone the repo, or pull if already cloned before (preferably using ssh): `cd documents` then `git clone git@github.com:damoursm/earthquake.git` OR `git pull`
- Get into the scripts folder and run the setup script. This creates the sbatch and scratch folders and moves the code and scripts to their proper locations: `cd scripts` then `./setup.sh`
- Go back to home and add your `.env` file in the code folder: `cd ~` then `vim scratch/code-snapshots/earthquake/.env`
  - Set the variables as in the section above. You can leave all variables except `FINAL_OUTPUT_DIR` with empty values (`""`), as the code detects the cluster and selects the paths itself.
  - Set `FINAL_OUTPUT_DIR="scratch/<your username>/output/default-train"`. `default-train` is used as the default, but you can change it if you want to save the output of different experiments. Just make sure the folder exists
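The fallback behaviour described above (empty values deferring to cluster detection) can be sketched like this; `resolve_path` and the defaults are hypothetical, the actual detection lives in the repo's code:

```python
import os

def resolve_path(var_name, cluster_default):
    """Return the .env value if non-empty, otherwise the cluster default.
    Mirrors the described behaviour: an empty ("") value means
    'let the code's cluster detection pick the path'."""
    value = os.environ.get(var_name, "")
    return value or cluster_default
```

For example, `resolve_path("EVENT_HDF5_FILE", "<cluster data path>")` would fall back to the cluster location when the variable is left empty.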
- Stay in the home folder and start training. Replace `train_transformer_elisee.py` with the file containing your code (for EqModel, use `train_my_eq.py`; for Cnn, use `train_my_cnn.py`): `cd ~` then `./sbatch/run.sh -p train_transformer_elisee.py`. Here you can optionally specify a few arguments:
  - `-m 16Gb` for memory (by default 8Gb)
  - `-t hh:mm:ss` for how long to run (by default 1h)
  - `-p /train_xxx.py` for the file to execute (by default it will run train.py)
  - `-c 1` for the number of CPUs to use
  - `-g 1` for the number of GPUs to use
- Once training is done, the files will be in the `FINAL_OUTPUT_DIR` specified in the `.env`. To download them one by one to your local computer, use this command line: `scp <username>@ift6759.calculquebec.cloud:/scratch/<username>/output/default-train/<filename> <local path, e.g. /Users/ekabore/Downloads>`
- Useful slurm commands:
  - `squeue -u username`: shows your currently submitted jobs
  - `scontrol show job jobid`: shows details about a job
  - `scancel jobid`: cancels a job
- Start the MLflow server by running `mlflow server` in the terminal
- Fill in your hyperparameters and configuration in the config file `config.py`
- Activate the `earthquake` environment
- Run the script: `python main.py`
- You can access the MLflow experiment in the UI by going to `http://localhost:5000` in your browser