conftrace_

Hadas Orgad

11 papers · 2022–2025 · 6 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (4) 🌈 Renaissance Researcher (5) 🌍 Conference Polyglot (6) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge

🗺️ Taxonomy Completionist (18) 👥 Mega-Team (23) 🤝 Dynamic Duo (11) 💎 Century Club (11) 📈 Trend Setter

Conferences

NAACL (4) ACL (3) ICCV (1) ICLR (1) ICML (1) WACV (1)

Top co-authors

Yonatan Belinkov (11) David Bau (3) Dana Arad (3) Michael Toker (3) Aaron Mueller (2) Tal Haklay (2) Jaden Fried Fiotto-Kaufman (1) Adam Belfki (1) Zorik Gekhman (1) Aruna Sankaranarayanan (1)

Keywords

diffusion model (4) model editing (3) text encoder (2) text-to-image diffusion (2) text-to-image generation (1) mechanistic interpretability (1) image generation (1) latent representation (1) content moderation (1) bias mitigation (1) language model (1) text-to-image model (1) language model interpretability (1) parameter efficiency (1) closed-form solution (1) sentiment classification (1) algorithmic fairness (1) cross-attention layer (1) circuit discovery (1) representation learning (1)

Papers

LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations · ICLR 2025
MIB: A Mechanistic Interpretability Benchmark · ICML 2025
Position-aware Automatic Circuit Discovery · ACL 2025
Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models · NAACL 2025
Unified Concept Editing in Diffusion Models · WACV 2024
Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines · ACL 2024
ReFACT: Updating Text-to-Image Models by Editing the Text Encoder · NAACL 2024
BLIND: Bias Removal With No Demographics · ACL 2023
Editing Implicit Assumptions in Text-to-Image Diffusion Models · ICCV 2023
Choose Your Lenses: Flaws in Gender Bias Evaluation · NAACL 2022
How Gender Debiasing Affects Internal Model Representations, and Why It Matters · NAACL 2022