Amelia Glaese
5 papers · 2021–2025 · 3 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (3)
Interdisciplinary Bridge
Taxonomy Completionist (12)
Keyword Pioneer
Hot Topic Early Bird
Cross-Pollinator (15)
Conferences
EMNLP (2)
NIPS (2)
ICML (1)
Top co-authors
Keywords
large language model (3)
harmful content (2)
toxicity detection (2)
language model (2)
responsible ai (1)
bias mitigation (1)
reward model (1)
safety evaluation (1)
red teaming (1)
harmful content detection (1)
human preference (1)
adversarial testing (1)
automatic evaluation (1)
model fairness (1)
offensive content detection (1)
model bias (1)
toxicity mitigation (1)
reinforcement learning (1)
offensive content (1)
prompt engineering (1)
Papers
PaperBench: Evaluating AI's Ability to Replicate AI Research
ICML 2025
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models
NIPS 2022
Fine-tuning language models to find agreement among humans with diverse preferences
NIPS 2022
Red Teaming Language Models with Language Models
EMNLP 2022
Challenges in Detoxifying Language Models
EMNLP 2021