David Bau

34 papers · 2017–2026 · 10 top CS/AI conferences

Achievements

🏃 Academic Marathon (9) 🌍 Conference Polyglot (10) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (8) 👥 Mega-Team (23) 🤝 Dynamic Duo (13) 🔬 Deep Specialist (10) 🧬 Topic Evolution 🏆 Keyword Champion (2) ❓ The Questioner 📈 Trend Setter 🗃️ Keyword Collector (95) 🔥 Unstoppable (10) ⚡ Prolific Year (7) 💎 Century Club (34) 🚀 Conference Pioneer

Conferences

ICLR (9) ICCV (5) CVPR (4) ECCV (4) NIPS (4) EMNLP (3) WACV (2) ACL (1) CONLL (1) ICML (1)

Top co-authors

Antonio Torralba (13) Yonatan Belinkov (8) Rohit Gandikota (5) Aaron Mueller (5) Jun-Yan Zhu (5) Joanna Materzynska (5) Arnab Sen Sharma (5) Bolei Zhou (4) Jacob Andreas (4) Tal Haklay (4)

Keywords

language model (5) generative adversarial network (4) diffusion model (4) causal intervention (3) image generation (3) large language model (2) latent space (2) token prediction (2) language model interpretability (2) model editing (2) mechanistic interpretability (2) representation learning (2) mode collapse (2) image synthesis (1) feature embedding (1) visual processing (1) content moderation (1) named entity recognition (1) feature extraction (1) semantic segmentation (1)

Papers

Distilling Diversity and Control in Diffusion Models · WACV 2026
MIB: A Mechanistic Interpretability Benchmark · ICML 2025
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models · ICLR 2025
Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare · EMNLP 2025
Position-aware Automatic Circuit Discovery · ACL 2025
NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals · ICLR 2025
SliderSpace: Decomposing the Visual Capabilities of Diffusion Models · ICCV 2025
Linearity of Relation Decoding in Transformer Language Models · ICLR 2024
Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models · NIPS 2024
Unified Concept Editing in Diffusion Models · WACV 2024
Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking · ICLR 2024
Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models · ECCV 2024
Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs · EMNLP 2024
Function Vectors in Large Language Models · ICLR 2024
Future Lens: Anticipating Subsequent Tokens from a Single Hidden State · EMNLP 2023
Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task · ICLR 2023
Mass-Editing Memory in a Transformer · ICLR 2023
Future Lens: Anticipating Subsequent Tokens from a Single Hidden State · CONLL 2023
FIND: A Function Description Benchmark for Evaluating Interpretability Methods · NIPS 2023
Erasing Concepts from Diffusion Models · ICCV 2023
Natural Language Descriptions of Deep Visual Features · ICLR 2022
Locating and Editing Factual Associations in GPT · NIPS 2022
Disentangling Visual and Written Concepts in CLIP · CVPR 2022
Editing a classifier by rewriting its prediction rules · NIPS 2021
Sketch Your Own GAN · ICCV 2021
Toward a Visual Concept Vocabulary for GAN Latent Space · ICCV 2021
What makes fake images detectable? Understanding properties that generalize · ECCV 2020
Rewriting a Deep Generative Model · ECCV 2020
Diverse Image Generation via Self-Conditioned GANs · CVPR 2020
Learning Words by Drawing Images · CVPR 2019
Seeing What a GAN Cannot Generate · ICCV 2019
GAN Dissection: Visualizing and Understanding Generative Adversarial Networks · ICLR 2019
Interpretable Basis Decomposition for Visual Explanation · ECCV 2018
Network Dissection: Quantifying Interpretability of Deep Visual Representations · CVPR 2017