Papers

2,781 papers found
Scaling Trends for Data Poisoning in LLMs
Dillon Bowen, Brendan Murphy, Will Cai et al.
2025 AAAI
ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates
Fengqing Jiang, Zhangchen Xu, Luyao Niu et al.
2025 AAAI
RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios?
Adrian de Wynter, Ishaan Watts, Tua Wongsangaroonsri et al.
2025 AAAI
An Automated Explainable Educational Assessment System Built on LLMs
Jiazheng Li, Artem Bobrov, David West et al.
2025 AAAI
Hidden in Plain Text: Emergence & Mitigation of Steganographic Collusion in LLMs
Yohan Mathew, Ollie Matthews, Robert McCarthy et al.
2025 AACL