Papers
A StrongREJECT for Empty Jailbreaks
NeurIPS 2024
Rapid Plug-in Defenders
NeurIPS 2024
How Susceptible Are LLMs to Logical Fallacies?
LREC-COLING 2024
Data Free Backdoor Attacks
NeurIPS 2024