Ian Kivlichan
4 papers · 2021–2024 · 4 conferences · across top CS/AI conferences
Achievements
🌍 Conference Polyglot (4)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🐣 Hot Topic Early Bird
🐝 Cross-Pollinator (15)
Conferences
ACL (1)
EACL (1)
IJCNLP (1)
NIPS (1)
Keywords
content moderation (2)
uncertainty estimation (2)
human-ai collaboration (2)
model uncertainty (1)
preference modeling (1)
reward model (1)
bert fine-tuning (1)
collaborative system (1)
toxic speech detection (1)
large language model (1)
moderator-model system (1)
collaborative review (1)
reinforcement learning (1)
covert toxicity (1)
bayesian inference (1)
decision making (1)
Papers
Rule Based Rewards for Language Model Safety
NIPS 2024
Measuring and Improving Model-Moderator Collaboration using Uncertainty Estimation
ACL 2021
Capturing Covertly Toxic Speech via Crowdsourcing
EACL 2021
Measuring and Improving Model-Moderator Collaboration using Uncertainty Estimation
IJCNLP 2021