Chaowei Xiao
74 papers · 2018–2026 · 13 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (12)
Academic Marathon (7)
Interdisciplinary Bridge
Hot Topic Early Bird
Keyword Pioneer
Dynamic Duo (23)
Triple Crown
Mega-Team (71)
Grand Slam
Deep Specialist (17)
Topic Evolution
Conference Pioneer
Keyword Collector (194)
Prolific Year (8)
Century Club (71)
Unstoppable (8)
The Questioner (3)
Trend Setter
Conferences
ICLR (18)
NIPS (11)
NAACL (9)
ECCV (8)
ACL (7)
ICML (7)
CVPR (5)
EMNLP (3)
ICCV (2)
AAAI (1)
CORL (1)
IJCAI (1)
WACV (1)
Research topics
Keywords
large language model (10)
adversarial attack (6)
language model (6)
backdoor attack (5)
adversarial example (5)
adversarial robustness (5)
adversarial learning (4)
text classification (4)
data poisoning (3)
adversarial training (3)
image classification (2)
instruction tuning (2)
safety alignment (2)
distribution shift (2)
semantic segmentation (2)
data augmentation (2)
intellectual property (2)
object detection (2)
feature learning (2)
domain adaptation (2)
Papers
Can Editing LLMs Inject Harm?
AAAI 2026
Copyright Detective: A Forensic System to Evidence LLMs Flickering Copyright Leakage Risks
ACL 2026
Defenses Against Prompt Attacks Learn Surface Heuristics
ACL 2026
T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching
ICLR 2025
AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs
ICLR 2025
Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset
ICLR 2025
Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations
NAACL 2025
Can Watermarks be Used to Detect LLM IP Infringement For Free?
ICLR 2025
MetaAgent: Automatically Constructing Multi-Agent Systems Based on Finite State Machines
ICML 2025
Sample-specific Noise Injection for Diffusion-based Adversarial Purification
ICML 2025
RePD: Defending Jailbreak Attack through a Retrieval-based Prompt Decomposition Process
NAACL 2025
AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection
ACL 2025
SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment
ACL 2025
PIGuard: Prompt Injection Guardrail via Mitigating Overdefense for Free
ACL 2025
Robust Representation Consistency Model via Contrastive Denoising
ICLR 2025
EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage
ICLR 2025
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding
ICLR 2025
DataGen: Unified Synthetic Dataset Generation via Large Language Models
ICLR 2025
CVE-Bench: Benchmarking LLM-based Software Engineering Agent's Ability to Repair Real-World CVE Vulnerabilities
NAACL 2025
LeanAgent: Lifelong Learning for Formal Theorem Proving
ICLR 2025
Instructional Fingerprinting of Large Language Models
NAACL 2024
BackdoorAlign: Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
NIPS 2024
HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection
NIPS 2024
Consistency Purification: Effective and Efficient Diffusion Purification towards Certified Robustness
NIPS 2024
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases
NIPS 2024
RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models
ACL 2024
PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees
CVPR 2024
AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting
ECCV 2024
Leveraging Hierarchical Feature Sharing for Efficient Dataset Condensation
ECCV 2024
Dolphins: Multimodal Language Model for Driving
ECCV 2024
RealGen: Retrieval Augmented Generation for Controllable Traffic Scenarios
ECCV 2024
AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models
ICLR 2024
Conversational Drug Editing Using Retrieval and Domain Feedback
ICLR 2024
CALICO: Self-Supervised Camera-LiDAR Contrastive Pre-training for BEV Perception
ICLR 2024
Position: TrustLLM: Trustworthiness in Large Language Models
ICML 2024
From Shortcuts to Triggers: Backdoor Defense with Denoised PoE
NAACL 2024
ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger
NAACL 2024
Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models
NAACL 2024
Combating Security and Privacy Issues in the Era of Large Language Models
NAACL 2024
Cognitive Overload: Jailbreaking Large Language Models with Overloaded Logical Thinking
NAACL 2024
Differentially Private Video Activity Recognition
WACV 2024
Retrieval-based Controllable Molecule Generation
ICLR 2023
Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study
EMNLP 2023
Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency
CVPR 2023
VoxFormer: Sparse Voxel Transformer for Camera-Based 3D Semantic Scene Completion
CVPR 2023
A Critical Revisit of Adversarial Robustness in 3D Point Cloud Recognition with Diffusion-Driven Purification
ICML 2023
CodeIPPrompt: Intellectual Property Infringement Assessment of Code Language Models
ICML 2023
On the Exploitability of Instruction Tuning
NIPS 2023
Defending against Insertion-based Textual Backdoor Attacks via Attribution
ACL 2023
HiCL: Hierarchical Contrastive Learning of Unsupervised Sentence Embeddings
EMNLP 2023
Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning
EMNLP 2023
DensePure: Understanding Diffusion Models for Adversarial Robustness
ICLR 2023
Defending against Adversarial Audio via Diffusion Model
ICLR 2023
Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models
NIPS 2022
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models
NIPS 2022
RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning
ICLR 2022
Understanding The Robustness in Vision Transformers
ICML 2022
AdvDO: Realistic Adversarial Attacks for Trajectory Prediction
ECCV 2022
SecretGen: Privacy Recovery on Pre-trained Models via Distribution Discrimination
ECCV 2022
Robust Trajectory Prediction against Adversarial Attacks
CORL 2022
Diffusion Models for Adversarial Purification
ICML 2022
Long-Short Transformer: Efficient Transformers for Language and Vision
NIPS 2021
Can Shape Structure Features Improve Model Robustness Under Diverse Adversarial Settings?
ICCV 2021
Adversarially Robust 3D Point Cloud Recognition Using Self-Supervisions
NIPS 2021
AugMax: Adversarial Composition of Random Augmentations for Robust Training
NIPS 2021
Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations
NIPS 2020
SemanticAdv: Generating Adversarial Examples via Attribute-conditioned Image Editing
ECCV 2020
Towards Stable and Efficient Training of Verifiably Robust Neural Networks
ICLR 2020
AdvIT: Adversarial Frames Identifier Based on Temporal Consistency in Videos
ICCV 2019
MeshAdv: Adversarial Meshes for Visual Recognition
CVPR 2019
Characterizing Adversarial Examples Based on Spatial Consistency Information for Semantic Segmentation
ECCV 2018
Spatially Transformed Adversarial Examples
ICLR 2018
Robust Physical-World Attacks on Deep Learning Visual Classification
CVPR 2018
Generating Adversarial Examples with Adversarial Networks
IJCAI 2018