Papers
Blind Men and the Elephant: Diverse Perspectives on Gender Stereotypes in Benchmark Datasets
EMNLP 2025
BehaviorBox: Automated Discovery of Fine-Grained Performance Differences Between Language Models
ACL 2025
DUTJBD at SemEval-2025 Task 3: A Range of Approaches for Predicting Hallucination Generation in Models
SemEval 2025
neDIOM: Dataset and Analysis of Nepali Idioms
COLING 2025
Hallucinations in Code Change to Natural Language Generation: Prevalence and Evaluation of Detection Metrics
IJCNLP 2025