
Ian Kivlichan

4 papers · 2021–2024 · 4 top CS/AI conferences

Achievements


🌍 Conference Polyglot (4) · 🌉 Interdisciplinary Bridge · 🧭 Keyword Pioneer · 🐣 Hot Topic Early Bird · 🐝 Cross-Pollinator (15)

Conferences

ACL (1) · EACL (1) · IJCNLP (1) · NIPS (1)

Top co-authors

Zi Lin (2) · Lucy Vasserman (2) · Jeremiah Liu (2) · Tong Mu (1) · Jorge Nario (1) · Alex Beutel (1) · Tesh Goyal (1) · Alec Helyar (1) · John Schulman (1) · Andrea Vallone (1)

Keywords

content moderation (2) · uncertainty estimation (2) · human-ai collaboration (2) · model uncertainty (1) · preference modeling (1) · reward model (1) · bert fine-tuning (1) · collaborative system (1) · toxic speech detection (1) · large language model (1) · moderator-model system (1) · collaborative review (1) · reinforcement learning (1) · covert toxicity (1) · bayesian inference (1) · decision making (1)

Papers

Rule Based Rewards for Language Model Safety · NIPS 2024
Measuring and Improving Model-Moderator Collaboration using Uncertainty Estimation · ACL 2021
Capturing Covertly Toxic Speech via Crowdsourcing · EACL 2021
Measuring and Improving Model-Moderator Collaboration using Uncertainty Estimation · IJCNLP 2021