Liwei Jiang
32 papers · 2021–2025 · 8 conferences · across top CS/AI conferences
Achievements
🌍 Conference Polyglot (8)
🐝 Cross-Pollinator (12)
🧭 Keyword Pioneer
🌉 Interdisciplinary Bridge
🌈 Renaissance Researcher (8)
🏆 Grand Slam
👑 Triple Crown
🤝 Dynamic Duo (28)
🔥 Unstoppable (5)
💎 Century Club (32)
⚡ Prolific Year (8)
🗃️ Keyword Collector (127)
❓ The Questioner (2)
Conferences
EMNLP (7)
NAACL (6)
NIPS (5)
ACL (4)
ICLR (4)
ICML (3)
AAAI (2)
COLT (1)
Keywords
large language model (8)
language model (6)
commonsense reasoning (6)
reinforcement learning (4)
knowledge graph (3)
knowledge distillation (3)
value alignment (3)
social commonsense (2)
moral reasoning (2)
multi-task learning (2)
symbolic knowledge (2)
adversarial attack (2)
text generation (2)
content moderation (2)
dialogue system (2)
model compression (2)
ai safety (2)
kl divergence (1)
data augmentation (1)
model security (1)
Papers
Position: Political Neutrality in AI Is Impossible — But Here Is How to Approximate It
ICML 2025
DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
ICLR 2025
AI as Humanity’s Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
ICLR 2025
SafetyAnalyst: Interpretable, Transparent, and Steerable Safety Moderation for AI Behavior
ICML 2025
Can Language Models Reason about Individualistic Human Values and Preferences?
ACL 2025
CulturalBench: A Robust, Diverse and Challenging Benchmark for Measuring LMs’ Cultural Knowledge Through Human-AI Red-Teaming
ACL 2025
Guardrails and Security for LLMs: Safe, Secure and Controllable Steering of LLM Applications
ACL 2025
Online Covariance Estimation in Nonsmooth Stochastic Approximation
COLT 2025
To Err Is AI: A Case Study Informing LLM Flaw Reporting Practices
AAAI 2025
Impossible Distillation for Paraphrasing and Summarization: How to Make High-quality Lemonade out of Small, Low-quality Model
NAACL 2024
Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement
ICLR 2024
The Generative AI Paradox: “What It Can Create, It May Not Understand”
ICLR 2024
WildGuard: Open One-stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
NIPS 2024
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
NIPS 2024
Position: A Roadmap to Pluralistic Alignment
ICML 2024
Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties
AAAI 2024
JAMDEC: Unsupervised Authorship Obfuscation using Constrained Decoding over Small Language Models
NAACL 2024
SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization
EMNLP 2023
Faith and Fate: Limits of Transformers on Compositionality
NIPS 2023
ClarifyDelphi: Reinforced Clarification Questions with Defeasibility Rewards for Social and Moral Situations
ACL 2023
Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Commonsense Norms
EMNLP 2023
BiasX: “Thinking Slow” in Toxic Content Moderation with Explanations of Implied Social Biases
EMNLP 2023
Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning
EMNLP 2023
NovaCOMET: Open Commonsense Foundation Models with Symbolic Knowledge Distillation
EMNLP 2023
What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations
EMNLP 2023
Symbolic Knowledge Distillation: from General Language Models to Commonsense Models
NAACL 2022
ProsocialDialog: A Prosocial Backbone for Conversational Agents
EMNLP 2022
QUARK: Controllable Text Generation with Reinforced Unlearning
NIPS 2022
NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics
NAACL 2022
Aligning to Social Norms and Values in Interactive Narratives
NAACL 2022
“I’m Not Mad”: Commonsense Implications of Negation and Contradiction
NAACL 2021
Rank Overspecified Robust Matrix Recovery: Subgradient Method and Exact Recovery
NIPS 2021