Hush is a research-grade text classifier that flags toxic language in long-form messages by combining character-level TF-IDF extraction with a robust linear classifier. The project is tuned for clarity of metrics, reproducible training, and simple deployment, making it easy for moderators, educators, or open-source contributors to iterate on custom rules or datasets.
- Character-aware embedding: `TfidfVectorizer` runs on `char_wb` n-grams (3–5 characters) so the model catches insults that span creative spellings or leetspeak.
- Balanced linear model: `SGDClassifier` with `modified_huber` loss and class weights keeps training fast, stable, and sensitive to the minority toxic class.
- Versioned artifacts: Each training run writes timestamped models, vectorizers, and metadata, plus `latest` copies for quick inference.
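To see why `char_wb` n-grams are robust to creative spellings, consider how many character fragments a word shares with a leetspeak variant of itself. This sketch (the insult pair is illustrative) uses the same analyzer settings as the project:

```python
# Demonstrates the char_wb n-gram analyzer used by Hush's vectorizer:
# an ordinary spelling and a leetspeak variant still share fragments.
from sklearn.feature_extraction.text import TfidfVectorizer

vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5), max_features=50000)
analyze = vec.build_analyzer()

# Overlapping character n-grams between "idiot" and the obfuscated "idi0t".
shared = set(analyze("idiot")) & set(analyze("idi0t"))
print(sorted(shared))  # fragments like "idi" survive the obfuscation
```

Because both spellings map onto overlapping n-gram features, a linear model trained on one spelling still assigns weight to the other.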
- `classification_data.csv` is the primary labeled corpus (toxic=1, non-toxic=0) that `trainer.py` consumes. The dataset mixes real and synthetic sentences; the training workflow already applies an 80/20 split.
- `classification_data-shona.csv` mirrors that labeling format but covers Shona-language statements to help evaluate multilingual generalization.
- `generated_5000_dataset.csv` is produced by `data_generator.py`, which stitches together templates for supportive and toxic phrasing. Re-run the generator to refresh the synthetic pool when you need more training examples.
- `classification_data-old.csv` and the metadata JSON files (e.g., `metadata_v20260311_011709.json`) document prior runs or auxiliary exports.
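The labeled CSVs follow a simple two-column shape. The header names below (`text`, `label`) are an assumption for illustration; check `trainer.py` for the actual column names it reads:

```csv
text,label
Thanks for the thoughtful review!,0
You are a worthless idiot.,1
```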
- Install the required packages: `pip install pandas scikit-learn joblib`
- Adjust `classification_data.csv` (or swap in `generated_5000_dataset.csv`) as needed.
- Run the training script to produce fresh artifacts.
`python trainer.py`

- Loads the chosen CSV, drops NaNs, and stratifies into an 80/20 train/test split using `train_test_split(random_state=42)`.
- Configures `TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5), max_features=50000)` and fits/transforms the text.
- Fits `SGDClassifier(loss="modified_huber", penalty="l2", alpha=0.0001, class_weight="balanced", random_state=42)` on the vectorized training set.
- Computes accuracy and a classification report on the test split, then saves:
  - `toxic_model_v<timestamp>.hush`
  - `vectorizer_v<timestamp>.hush`
  - `metadata_v<timestamp>.json` (contains accuracy, precision/recall for the toxic label, and training params)
  - `toxic_model_latest.hush` / `vectorizer_latest.hush` (overwritten with the newest run)
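The training flow above can be sketched end to end as follows. This is a minimal stand-in, not `trainer.py` itself: the inline toy DataFrame replaces `pd.read_csv("classification_data.csv")`, and the column names (`text`, `label`) are assumptions to match against the real CSV headers.

```python
import json
from datetime import datetime

import joblib
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Tiny inline stand-in for classification_data.csv; in the real pipeline
# use pd.read_csv(...). Column names "text"/"label" are assumed here.
df = pd.DataFrame({
    "text": ["you are wonderful", "great job today", "thanks so much",
             "what a lovely idea", "you inspire me"] * 2
          + ["you are an idiot", "shut up loser", "nobody likes you",
             "you are pathetic", "get lost moron"] * 2,
    "label": [0] * 10 + [1] * 10,
}).dropna()

# Stratified 80/20 split, matching trainer.py's settings.
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, stratify=df["label"], random_state=42
)

vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5),
                             max_features=50000)
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

model = SGDClassifier(loss="modified_huber", penalty="l2", alpha=0.0001,
                      class_weight="balanced", random_state=42)
model.fit(X_train_vec, y_train)

preds = model.predict(X_test_vec)
acc = accuracy_score(y_test, preds)
report = classification_report(y_test, preds, output_dict=True, zero_division=0)

# Versioned artifacts plus "latest" copies, mirroring trainer.py's outputs.
stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
joblib.dump(model, f"toxic_model_v{stamp}.hush")
joblib.dump(vectorizer, f"vectorizer_v{stamp}.hush")
joblib.dump(model, "toxic_model_latest.hush")
joblib.dump(vectorizer, "vectorizer_latest.hush")
with open(f"metadata_v{stamp}.json", "w") as f:
    json.dump({"accuracy": acc, "toxic_report": report.get("1", {})}, f, indent=2)
```

Saving the vectorizer alongside the model matters: TF-IDF weights are fit to the training corpus, so inference must reuse the exact same vocabulary and IDF values.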
`python test_model.py`

- Loads the versioned artifacts referenced at the top of the script (update the filenames if you retrain with new timestamps).
- Runs through curated test cases that cover non-toxic, obviously toxic, subtle toxicity, and edge cases.
- Prints a simple table with pass/fail status plus overall percentage correct.
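A harness in the spirit of `test_model.py` can be sketched as below. The curated cases and their expected labels are illustrative, not the script's own suite; the artifact filenames follow the `latest` convention described above.

```python
# Sketch of a curated-case harness: run labeled probe sentences through
# the saved model and report pass/fail per case plus an overall score.
import joblib

CASES = [
    ("Have a wonderful day!", 0),       # non-toxic
    ("You are a worthless idiot.", 1),  # obviously toxic
    ("Nobody would miss you.", 1),      # subtle toxicity
    ("", 0),                            # edge case: empty input
]

def run_suite(model_path="toxic_model_latest.hush",
              vectorizer_path="vectorizer_latest.hush"):
    model = joblib.load(model_path)
    vectorizer = joblib.load(vectorizer_path)
    passed = 0
    for text, expected in CASES:
        got = int(model.predict(vectorizer.transform([text]))[0])
        ok = got == expected
        passed += ok
        print(f"{'PASS' if ok else 'FAIL'}  expected={expected} got={got}  {text!r}")
    print(f"{passed}/{len(CASES)} correct ({100 * passed / len(CASES):.0f}%)")
    return passed / len(CASES)
```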
`python model.py "Your message here"` (or run without arguments to use the interactive prompt)

- Loads the artifacts hard-coded near the top; swap those filenames after retraining.
- Transforms the user text and prints whether Hush considers it toxic.
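The inference path can be condensed into a small helper. This is a sketch mirroring `model.py`'s flow, not the script itself; it assumes the `.hush` artifacts are `joblib` pickles, as produced by the training steps above.

```python
# Minimal inference helper: load the latest artifacts, vectorize one
# message, and return a human-readable verdict.
import joblib

def classify(text,
             model_path="toxic_model_latest.hush",
             vectorizer_path="vectorizer_latest.hush"):
    """Transform one message with the saved TF-IDF and predict toxicity."""
    model = joblib.load(model_path)
    vectorizer = joblib.load(vectorizer_path)
    label = model.predict(vectorizer.transform([text]))[0]
    return "toxic" if label == 1 else "non-toxic"
```

Note that `transform` (never `fit_transform`) is used at inference time, so the message is projected into the same feature space the model was trained on.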
- `data_generator.py` regenerates a balanced (2,500/2,500) dataset of synthetic sentences with both polite and aggressive language. Run it to refresh `generated_5000_dataset.csv` or to seed new labels.
- Keep `README.md`, `FIX.md`, and `metadata_*.json` up to date whenever you change the training pipeline so contributors can track regressions.
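Template stitching of the kind `data_generator.py` performs can be sketched as follows. The templates, fill-in vocabularies, and output column names here are illustrative assumptions; only the balanced 2,500/2,500 split and the output filename come from the project.

```python
# Sketch of a template-stitching generator: render supportive and toxic
# templates with random fill-ins, then write a balanced labeled CSV.
import csv
import random

random.seed(42)  # reproducible synthetic pool

# Illustrative templates and vocab -- not the script's actual phrasing.
SUPPORTIVE = ["You did a {adj} job on the {thing}.",
              "Thanks for the {adj} {thing}!"]
TOXIC = ["Your {thing} is {insult}.",
         "Only a {insult} person would write that {thing}."]
FILLS = {"adj": ["great", "thoughtful"],
         "thing": ["report", "review"],
         "insult": ["worthless", "pathetic"]}

def render(template):
    # str.format ignores unused keyword arguments, so one fill dict works
    # for every template.
    return template.format(**{k: random.choice(v) for k, v in FILLS.items()})

rows = [(render(random.choice(SUPPORTIVE)), 0) for _ in range(2500)] \
     + [(render(random.choice(TOXIC)), 1) for _ in range(2500)]
random.shuffle(rows)

with open("generated_5000_dataset.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["text", "label"])
    writer.writerows(rows)
```

Seeding the RNG keeps regeneration reproducible; drop the seed (or vary it) when you want a genuinely fresh synthetic pool.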
| File | Purpose |
|---|---|
| `toxic_model_v<TIMESTAMP>.hush` | Versioned classifier for reproducibility. |
| `vectorizer_v<TIMESTAMP>.hush` | Matching vectorizer used during training. |
| `metadata_v<TIMESTAMP>.json` | Stores metrics, parameters, and dataset provenance. |
| `toxic_model_latest.hush`, `vectorizer_latest.hush` | Handy shortcuts for inference. |
| `generated_5000_dataset.csv` | Output of `data_generator.py`, useful as supplemental training data. |
Hush is MIT-licensed. See LICENSE for the full text.
