conftrace_

Sergey Berezin

4 papers · 2022–2025 · 3 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (3) 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (13) 🧭 Keyword Pioneer 🐝 Cross-Pollinator (15)

Conferences

ACL (2) COLING (1) EMNLP (1)

Top co-authors

Reza Farahbakhsh (3) Noel Crespi (3) Tatiana Batura (1)

Research topics

Keywords

adversarial attack (3) large language model (2) toxicity detection (2) adversarial defense (1) adversarial example (1) jailbreak attack (1) prompt injection (1) multilingual model (1) abstractive summarization (1) moderation system (1) obfuscation attack (1) neural network (1) sentence-level attack (1) pretraining objective (1) task-in-prompt attack (1) jailbreak adversarial attack (1) named entity recognition (1) ascii art (1) text summarization (1) safety alignment (1)

Papers

The TIP of the Iceberg: Revealing a Hidden Class of Task-in-Prompt Adversarial Attacks on LLMs · ACL 2025

Evading Toxicity Detection with ASCII-art: A Benchmark of Spatial Attacks on Moderation Systems · ACL 2025

No offence, Bert - I insult only humans! Multilingual sentence-level attack on toxicity detection networks · EMNLP 2023

Named Entity Inclusion in Abstractive Text Summarization · COLING 2022