Highlights

SAJ is a novel deep learning model for imputing missing values in water quality data.
SAJ integrates RMSNorm, lower triangular masked multi-head attention, and dimension splitting, which had not previously been explored in imputation tasks.
The model is tested on real-world water quality datasets of different sizes with high missing rates.
SAJ achieves state-of-the-art results on water quality data imputation.
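
The testing described above, imputation under high missing rates, is commonly set up by hiding a fraction of the known values and scoring the imputed values against them. The following is a minimal NumPy sketch of that idea, not the evaluation code from this repository. The function names `hold_out_observed` and `masked_mae`, the synthetic data, and the column-mean baseline are all illustrative assumptions.

```python
import numpy as np


def hold_out_observed(data, rate, rng):
    """Hide a random fraction of the observed (non-NaN) entries.

    Returns the corrupted copy and a boolean mask of the held-out entries.
    (Hypothetical helper, not taken from the repository.)
    """
    observed = ~np.isnan(data)
    held_out = observed & (rng.random(data.shape) < rate)
    corrupted = data.copy()
    corrupted[held_out] = np.nan
    return corrupted, held_out


def masked_mae(imputed, truth, held_out):
    """Mean absolute error computed over the held-out entries only."""
    return np.abs(imputed[held_out] - truth[held_out]).mean()


rng = np.random.default_rng(0)
# Synthetic stand-in for sensor readings: 200 time steps, 4 features.
truth = rng.normal(loc=7.0, scale=0.5, size=(200, 4))
corrupted, held_out = hold_out_observed(truth, rate=0.9, rng=rng)

# Naive baseline (an assumption): fill each column with the mean of the
# values that remain observed after masking.
column_means = np.nanmean(corrupted, axis=0)
imputed = np.where(np.isnan(corrupted), column_means, corrupted)
mae = masked_mae(imputed, truth, held_out)
```

Scoring only the held-out entries keeps the metric from being diluted by values the model was allowed to see.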

README

This is a PyTorch implementation of the paper "SAJ: A Deep Learning Approach to Missing Data Imputation in Water Quality Datasets" by Ishan Prasad Banjara, Deepesh Upreti, Kalam Pariyar, Suman Poudel, and Shukra Paudel.

In the present age of environmental degradation, extracting meaningful insights from large amounts of water quality data is crucial for minimizing the effect of anthropogenic activities on dwindling water resources. A key problem is the lack of accurate and reliable data, often caused by high missingness, which impairs the ability of decision makers to act in time to mitigate environmental damage. This study introduces a deep learning (DL) model named SAJ (Self-Attention Joint with convolution) for efficient imputation of missing water quality data under very high missingness (~90%). SAJ combines a convolutional neural network with a self-attention mechanism and integrates lower triangular masking and RMS normalization, which together give the model its architectural novelty. On two different water quality datasets, the model outperforms other state-of-the-art (SOTA) models, demonstrating lower error metrics. SAJ also shows lower average inference time and fewer parameters, further supporting its advantage. Finally, the model leaves room for further improvement, suggesting it could lead water quality data imputation by an even greater margin.
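
To make two of the named components concrete, here is a small NumPy sketch of RMS normalization and of self-attention with a lower triangular mask. It is a simplified illustration under stated assumptions (a single attention head, toy tensor sizes, no learned projections) and is not the code in `layers.py` or `saj.py`; the function names `rms_norm` and `lower_triangular_attention` are hypothetical.

```python
import numpy as np


def rms_norm(x, gain, eps=1e-8):
    """Divide each feature vector by its root mean square (no mean-centering)."""
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / rms * gain


def lower_triangular_attention(q, k, v):
    """Single-head scaled dot-product attention with a lower triangular mask.

    q, k, v have shape (time_steps, d_k). Step i attends only to steps j <= i.
    """
    n_steps, d_k = q.shape
    scores = q @ k.T / np.sqrt(d_k)
    # True on and below the diagonal, False above it.
    allowed = np.tril(np.ones((n_steps, n_steps), dtype=bool))
    scores = np.where(allowed, scores, -np.inf)
    # Subtract the row maximum before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))  # 5 time steps, 4 features (toy sizes)
normed = rms_norm(x, gain=np.ones(4))
out, weights = lower_triangular_attention(normed, normed, normed)
```

In this sketch the masked positions receive a score of negative infinity, so their softmax weight is exactly zero and each row of `weights` still sums to one.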

Folders and files

Hyperparameter tuning results
Plot
USGSMuddyFK
USGSOhioRiver
USGSSacramento
__pycache__
ablation studies
data_desc_and_stats
output
preprocessed_data
raw_data
.gitignore
LICENSE.md
README.md
SVRHyperparameterTuningwithGoogleColab.ipynb
USGSMuddyFK_train_loss_vs_val_loss.csv
USGSOhioRiver_train_loss_vs_val_loss.csv
brits.py
config.py
data_desc.py
data_preprocessing_utils.py
dataloader.py
hyperparametertuning.py
layers.py
naive_models.py
plot.py
requirements.txt
saits.py
saj.py
train.py
trainvaltest_models.py
utils.py
