Yaniv Nemcovsky
4 papers · 2025–2026 · 3 top CS/AI conferences
Achievements
🌍 Conference Polyglot (2)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🐝 Cross-Pollinator (15)
Conferences
EMNLP (2)
AAAI (1)
ACL (1)
Keywords
safety alignment (2)
large language model (2)
bias detection (1)
machine unlearning (1)
adversarial attack (1)
language model (1)
foundation model (1)
latent space (1)
jailbreak attack (1)
fairness evaluation (1)
linear operator (1)
activation steering (1)
activation space (1)
refusal suppression (1)
input loss landscape (1)
prompt semantic space (1)
residual memorization (1)
memorization state (1)
representation learning (1)
neighborhood dynamics (1)
Papers
Silenced Biases: The Dark Side LLMs Learned to Refuse
AAAI 2026
REMIND: Memorization and Unlearning in LLMs Through the Lens of Input Loss Landscapes
ACL 2026
Jailbreak Attack Initializations as Extractors of Compliance Directions
EMNLP 2025
Representing LLMs in Prompt Semantic Task Space
EMNLP 2025