Jiongxiao Wang
13 papers · 2022–2026 · 5 conferences · across top CS/AI conferences
Achievements
Hot Topic Early Bird
Conference Polyglot (5)
Cross-Pollinator (11)
Taxonomy Completionist (10)
Keyword Pioneer
Interdisciplinary Bridge
Triple Crown
Keyword Champion (2)
Dynamic Duo (11)
Century Club (12)
Conferences
ICLR (5)
NIPS (3)
ACL (2)
ICML (2)
NAACL (1)
Top co-authors
Keywords
large language model (3)
diffusion purification (2)
adversarial robustness (2)
adversarial attack (2)
reinforcement learning (2)
ai safety (1)
computational efficiency (1)
instruction tuning (1)
safety alignment (1)
model alignment (1)
projected gradient descent (1)
human feedback (1)
backdoor attack (1)
image manifold (1)
stochastic differential equation (1)
consistency model (1)
adversarial example (1)
fine-tuning attack (1)
3d point cloud (1)
jailbreak defense (1)
Papers
Reinforcement Learning for Self-Improving Agent with Skill Library
ACL 2026
Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations
NAACL 2025
Robust Representation Consistency Model via Contrastive Denoising
ICLR 2025
Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset
ICLR 2025
Conversational Drug Editing Using Retrieval and Domain Feedback
ICLR 2024
BackdoorAlign: Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
NIPS 2024
Consistency Purification: Effective and Efficient Diffusion Purification towards Certified Robustness
NIPS 2024
RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models
ACL 2024
On the Exploitability of Instruction Tuning
NIPS 2023
A Critical Revisit of Adversarial Robustness in 3D Point Cloud Recognition with Diffusion-Driven Purification
ICML 2023
Defending against Adversarial Audio via Diffusion Model
ICLR 2023
DensePure: Understanding Diffusion Models for Adversarial Robustness
ICLR 2023
Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack
ICML 2022