conftrace_

Yaniv Nemcovsky

4 papers · 2025–2026 · 3 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (2) · 🌉 Interdisciplinary Bridge · 🧭 Keyword Pioneer · 🐝 Cross-Pollinator (15)

Conferences

EMNLP (2) · AAAI (1) · ACL (1)

Top co-authors

Avi Mendelson (4) · Rom Himelstein (2) · Amit Levi (2) · Idan Kashani (1) · Chaim Baskin (1) · Liran Cohen (1) · Brit Youngmann (1)

Keywords

safety alignment (2) · large language model (2) · bias detection (1) · machine unlearning (1) · adversarial attack (1) · language model (1) · foundation model (1) · latent space (1) · jailbreak attack (1) · fairness evaluation (1) · linear operator (1) · activation steering (1) · activation space (1) · refusal suppression (1) · input loss landscape (1) · prompt semantic space (1) · residual memorization (1) · memorization state (1) · representation learning (1) · neighborhood dynamics (1)

Papers

Silenced Biases: The Dark Side LLMs Learned to Refuse · AAAI 2026

REMIND: Memorization and Unlearning in LLMs Through the Lens of Input Loss Landscapes · ACL 2026

Jailbreak Attack Initializations as Extractors of Compliance Directions · EMNLP 2025

Representing LLMs in Prompt Semantic Task Space · EMNLP 2025