Repository of the paper "Supporting Online Toxicity Detection with Knowledge Graphs" (ICWSM 22).
Updated Jul 2, 2024 · Python
Zero-shot LLM toxicity classification on Civil Comments. Compares Integrated Gradients vs. attention attribution consistency, evaluates demographic fairness (SPD, EOpp), and provides an interactive Streamlit explorer with local and Gemini API backends.
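The two fairness metrics named above, statistical parity difference (SPD) and equal opportunity difference (EOpp), can be sketched as follows. This is a minimal illustration using their standard textbook definitions, not code from the listed repository; the function names and the binary `group` encoding (1 = comment mentions the demographic group, 0 = it does not) are assumptions made for the example.

```python
def rate(values):
    """Fraction of 1s in a list of 0/1 values (NaN if the list is empty)."""
    return sum(values) / len(values) if values else float("nan")


def statistical_parity_difference(y_pred, group):
    """P(pred = 1 | group = 1) - P(pred = 1 | group = 0)."""
    in_group = [p for p, g in zip(y_pred, group) if g == 1]
    out_group = [p for p, g in zip(y_pred, group) if g == 0]
    return rate(in_group) - rate(out_group)


def equal_opportunity_difference(y_true, y_pred, group):
    """TPR(group = 1) - TPR(group = 0), using only truly positive examples."""
    in_group = [p for t, p, g in zip(y_true, y_pred, group) if t == 1 and g == 1]
    out_group = [p for t, p, g in zip(y_true, y_pred, group) if t == 1 and g == 0]
    return rate(in_group) - rate(out_group)


# Toy data (hypothetical): 1 = toxic, 0 = non-toxic.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
group = [1, 1, 1, 1, 0, 0, 0, 0]

spd = statistical_parity_difference(y_pred, group)
eopp = equal_opportunity_difference(y_true, y_pred, group)
```

A value of zero means both groups are treated alike under that metric. On this toy data both groups are flagged at the same rate, yet the classifier recovers fewer truly toxic comments in the identity group, so the two metrics disagree.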
Thesis project analyzing 1.7M Civil Comments to detect emerging topics and trends using BERTopic (MiniBatchKMeans + HDBSCAN), five-method consensus burst detection, and a novel annotation-free, article-based evaluation framework. Includes information loss benchmarking and LLM-generated topic presentations.
Fairness audit of Detoxify's toxicity classifier on Civil Comments, quantifying demographic bias via FPR disparity, subgroup AUC, counterfactual analysis, and intersectionality, with threshold-optimization mitigation.
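As a rough illustration of the FPR disparity mentioned above, the sketch below compares the false positive rate on non-toxic comments that mention an identity group against the rate on the remaining non-toxic comments. The subgroup-minus-background definition, the function names, and the `mentions_identity` flag are assumptions for this example; the listed repository may define the metric differently.

```python
def false_positive_rate(y_true, scores, threshold=0.5):
    """Share of truly non-toxic comments scored at or above the threshold."""
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not negatives:
        return float("nan")
    return sum(s >= threshold for s in negatives) / len(negatives)


def fpr_disparity(y_true, scores, mentions_identity, threshold=0.5):
    """Subgroup FPR minus background FPR (assumed definition)."""
    rows = list(zip(y_true, scores, mentions_identity))
    subgroup = [(t, s) for t, s, m in rows if m]
    background = [(t, s) for t, s, m in rows if not m]
    fpr_sub = false_positive_rate(*zip(*subgroup), threshold) if subgroup else float("nan")
    fpr_bg = false_positive_rate(*zip(*background), threshold) if background else float("nan")
    return fpr_sub - fpr_bg


# Toy data (hypothetical): labels, classifier scores, identity-mention flags.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.9, 0.2, 0.1, 0.3, 0.8, 0.7]
mentions_identity = [True, True, False, False, True, False]

gap_default = fpr_disparity(y_true, scores, mentions_identity)
gap_strict = fpr_disparity(y_true, scores, mentions_identity, threshold=0.95)
```

Raising the threshold closes the gap on this toy data, which is the intuition behind threshold-based mitigation. In practice that gain trades off against recall on truly toxic comments.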