Fenghua Weng
3 papers · 2025 · 3 top CS/AI conferences
Achievements
Conference Polyglot (3)
Interdisciplinary Bridge
Taxonomy Completionist (11)
Cross-Pollinator (15)
Conferences
AAAI (1)
ACL (1)
EMNLP (1)
Keywords
jailbreak attack (3)
vision-language model (2)
safety alignment (2)
adversarial training (1)
model editing (1)
ai safety (1)
adversarial perturbation (1)
vision language model (1)
parameter update (1)
defense mechanism (1)
large language model (1)
benchmark evaluation (1)
kl-divergence regularization (1)
direct preference optimization (1)
multimodal learning (1)
Papers
MMJ-Bench: A Comprehensive Study on Jailbreak Attacks and Defenses for Vision Language Models
AAAI 2025
DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing
ACL 2025
Adversary-Aware DPO: Enhancing Safety Alignment in Vision Language Models via Adversarial Training
EMNLP 2025