Papers
MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models
NeurIPS 2024
Toxicity Detection for Free
NeurIPS 2024
Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech
ACL 2024
Toxicity Classification in Ukrainian
NAACL 2024
FrenchToxicityPrompts: a Large Benchmark for Evaluating and Mitigating Toxicity in French Texts
COLING 2024