conftrace_

Laura Weidinger

4 papers · 2022–2025 · 4 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (4) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🐝 Cross-Pollinator (9)

Conferences

EMNLP (1) ICLR (1) NAACL (1) NIPS (1)

Top co-authors

Iason Gabriel (3) William Isaac (3) Lisa Anne Hendricks (2) Canfer Akbulut (2) Verena Rieser (2) Mark Díaz (2) Maribeth Rauh (2) Nahema Marchal (2) Amelia Glaese (1) Olivia Wiles (1)

Keywords

toxicity detection (1) language model evaluation (1) responsible ai (1) ai safety (1) harmful content (1) language model (1) safety evaluation (1) red teaming (1) human annotation (1) social norm (1) nlp system (1) offensive speech (1) large language model (1) language model safety (1) relational context (1) queer community (1) parameterised instruction (1) risk surface (1) parameterized instruction (1) harm annotation (1)

Papers

Century: A Framework and Dataset for Evaluating Historical Contextualisation of Sensitive Images · ICLR 2025

STAR: SocioTechnical Approach to Red Teaming Language Models · EMNLP 2024

Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models · NIPS 2022

Accounting for Offensive Speech as a Practice of Resistance · NAACL 2022