Adam Gleave
12 papers · 2016–2026 · 6 top CS/AI conferences
Achievements
Taxonomy Completionist (14)
Keyword Pioneer
Cross-Pollinator (8)
Interdisciplinary Bridge
Conference Polyglot (6)
Academic Marathon (9)
Century Club (11)
The Questioner
Conferences
AAAI (3)
ICLR (3)
ICML (3)
EMNLP (1)
JMLR (1)
OSDI (1)
Research topics
Keywords
adversarial attack (3)
large language model (2)
data poisoning (1)
game artificial intelligence (1)
adversarial training (1)
harmful content (1)
game playing (1)
partial identifiability (1)
worst-case performance (1)
reward function (1)
reward learning (1)
backdoor attack (1)
safety alignment (1)
expert demonstration (1)
game theory (1)
zero-shot transfer (1)
model scaling (1)
red teaming (1)
fine-tuning attack (1)
adversarial robustness (1)
Papers
STACK: Adversarial Attacks on LLM Safeguard Pipelines
AAAI 2026
Can Go AIs Be Adversarially Robust?
AAAI 2025
Jailbreak-Tuning: Models Efficiently Learn Jailbreak Susceptibility
EMNLP 2025
Scaling Trends for Data Poisoning in LLMs
AAAI 2025
Scaling Trends in Language Model Robustness
ICML 2025
STARC: A General Framework For Quantifying Differences Between Reward Functions
ICLR 2024
Invariance in Policy Optimisation and Partial Identifiability in Reward Learning
ICML 2023
Adversarial Policies Beat Superhuman Go AIs
ICML 2023
Quantifying Differences in Reward Functions
ICLR 2021
Stable-Baselines3: Reliable Reinforcement Learning Implementations
JMLR 2021
Adversarial Policies: Attacking Deep Reinforcement Learning
ICLR 2020
Firmament: Fast, Centralized Cluster Scheduling at Scale
OSDI 2016