Michael Backes
26 papers · 2019–2026 · 10 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (10)
Academic Marathon (6)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (14)
Renaissance Researcher (8)
Taxonomy Completionist (43)
Mega-Team (71)
Triple Crown
Dynamic Duo (16)
Century Club (23)
Keyword Collector (84)
Prolific Year (9)
The Questioner (3)
Conferences
ACL (6)
ICML (5)
EMNLP (4)
ICLR (4)
NIPS (2)
CVPR (1)
ICCV (1)
IJCAI (1)
NAACL (1)
WACV (1)
Research topics
Keywords
large language model (8)
adversarial attack (5)
security vulnerability (3)
self-supervised learning (2)
defense mechanism (2)
generative model (2)
diffusion model (2)
prompt injection (2)
contrastive learning (1)
graph classification (1)
privacy attack (1)
transformer architecture (1)
social media analysis (1)
prompt engineering (1)
text classification (1)
data poisoning (1)
lottery ticket hypothesis (1)
model adaptation (1)
deep learning (1)
deepfake detection (1)
Papers
Pruning Unsafe Tickets: A Resource-Efficient Framework for Safer and More Robust LLMs
ACL 2026
Open Schrödinger’s Closed Box: Identifying Retrieval Augmented Generation in API-Accessible Large Language Model Services
ACL 2026
DE-CLIP: Few-Shot Anomaly Detection via Difference-Guided Embedding Editing
ACL 2026
Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media
ACL 2025
Captured by Captions: On Memorization and its Mitigation in CLIP Models
ICLR 2025
When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs
ACL 2025
JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs
ACL 2025
Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification
EMNLP 2025
Hate in Plain Sight: On the Risks of Moderating AI-Generated Hateful Illusions
ICCV 2025
SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation
ICLR 2025
Efficient and Privacy-Preserving Soft Prompt Transfer for LLMs
ICML 2025
Provably Cost-Sensitive Adversarial Defense via Randomized Smoothing
ICML 2025
Memorization in Self-Supervised Learning Improves Downstream Generalization
ICLR 2024
Open LLMs are Necessary for Current Private Adaptations and Outperform their Closed Alternatives
NIPS 2024
Reconstruct Your Previous Conversations! Comprehensively Investigating Privacy Leakage Risks in Conversations with GPT Models
EMNLP 2024
Localizing Memorization in SSL Vision Encoders
NIPS 2024
ModSCAN: Measuring Stereotypical Bias in Large Vision-Language Models from Vision and Language Modalities
EMNLP 2024
The Death and Life of Great Prompts: Analyzing the Evolution of LLM Prompts from the Structural Perspective
EMNLP 2024
Composite Backdoor Attacks Against Large Language Models
NAACL 2024
Generated Distributions Are All You Need for Membership Inference Attacks Against Generative Models
WACV 2024
Position: TrustLLM: Trustworthiness in Large Language Models
ICML 2024
Generated Graph Detection
ICML 2023
Data Poisoning Attacks Against Multimodal Encoders
ICML 2023
Can't Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders
CVPR 2023
Is Adversarial Training Really a Silver Bullet for Mitigating Data Poisoning?
ICLR 2023
Fairwalk: Towards Fair Graph Embedding
IJCAI 2019