conftrace_

Tinghao Xie

8 papers · 2022–2025 · 3 top CS/AI conferences

Achievements

🌍 Conference Polyglot (3) · 🌉 Interdisciplinary Bridge · 🐝 Cross-Pollinator (15)

Conferences

ICLR (6) · CVPR (1) · ICML (1)

Top co-authors

Xiangyu Qi (7) · Prateek Mittal (6) · Peter Henderson (5) · Yangsibo Huang (4) · Boyi Wei (3) · Luxi He (3) · Ruoxi Jia (2) · Yi Zeng (2) · Kaixuan Huang (2) · Yiming Li (2)

Keywords

data poisoning (1) · backdoor attack (1) · adversarial attack (1) · deep neural network (1) · model deployment (1) · weight attack (1)

Papers

SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal · ICLR 2025

On Evaluating the Durability of Safeguards for Open-Weight LLMs · ICLR 2025

Fantastic Copyrighted Beasts and How (Not) to Generate Them · ICLR 2025

Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications · ICML 2024

Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! · ICLR 2024

BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection · ICLR 2024

Revisiting the Assumption of Latent Separability for Backdoor Defenses · ICLR 2023

Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks · CVPR 2022