Bochuan Cao
14 papers · 2022–2025 · 6 conferences · across top CS/AI conferences
Achievements
Cross-Pollinator (6)
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (6)
Renaissance Researcher (6)
Dynamic Duo (10)
Prolific Year (5)
Keyword Collector (52)
Century Club (14)
Conferences
ACL (4)
NIPS (4)
ICML (2)
NAACL (2)
AACL (1)
IJCNLP (1)
Research topics
Keywords
large language model (7)
model alignment (3)
ai safety (3)
moral reasoning (2)
hallucination mitigation (2)
adversarial attack (2)
jailbreak attack (2)
intrinsic self-correction (2)
ensemble learning (1)
convergence analysis (1)
domain generalization (1)
language model alignment (1)
continual learning (1)
harmful content (1)
preference optimization (1)
text generation (1)
model security (1)
distribution shift (1)
backdoor attack (1)
factual accuracy (1)
Papers
WordGame: Efficient & Effective LLM Jailbreak via Simultaneous Obfuscation in Query and Response
NAACL 2025
On the Convergence of Moral Self-Correction in Large Language Models
AACL 2025
JoPA: Explaining Large Language Model's Generation via Joint Prompt Attribution
ACL 2025
Monitoring Decoding: Mitigating Hallucination via Evaluating the Factuality of Partial Response during Generation
ACL 2025
TruthFlow: Truthful LLM Generation via Representation Flow Correction
ICML 2025
AdvI2I: Adversarial Image Attack on Image-to-Image Diffusion Models
ICML 2025
On the Convergence of Moral Self-Correction in Large Language Models
IJCNLP 2025
Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization
NIPS 2024
Jailbreak Open-Sourced Large Language Models via Enforced Decoding
ACL 2024
Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM
ACL 2024
Data Free Backdoor Attacks
NIPS 2024
Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections
NAACL 2024
IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI
NIPS 2023
Wild-Time: A Benchmark of in-the-Wild Distribution Shift over Time
NIPS 2022