Papers
BeDKD: Backdoor Defense Based on Directional Mapping Module and Adversarial Knowledge Distillation
AAAI 2026
EigenShield: Inference-Time, Model-Agnostic Jailbreaking Defense via Causal Subspace Filtering
AAAI 2026
Towards Effective, Stealthy, and Persistent Backdoor Attacks Targeting Graph Foundation Models
AAAI 2026
From Trade-off to Synergy: A Versatile Symbiotic Watermarking Framework for Large Language Models
ACL 2025
SABER: Uncovering Vulnerabilities in Safety Alignment via Cross-Layer Residual Connection
EMNLP 2025
Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models
CVPR 2025
Safety in Large Reasoning Models: A Survey
EMNLP 2025