Thomas Hofmann

72 papers · 2003–2025 · 14 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (15) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (14)

🌟 Keyword Trendsetter Combo (5) 🏆 Grand Slam 👑 Triple Crown 🤝 Dynamic Duo (20) 🔬 Deep Specialist (13) 🏆 Keyword Champion (4) 🚀 Conference Pioneer ⚡ Prolific Year (6) 🗃️ Keyword Collector (260) 📈 Trend Setter ❓ The Questioner 💎 Century Club (72) 🔥 Unstoppable (12)

Conferences

NIPS (19) ICML (15) ICLR (10) AISTATS (6) EMNLP (6) ACL (3) ICCV (3) AAAI (2) CVPR (2) JMLR (2) CONLL (1) ECCV (1) IJCNLP (1) WACV (1)

Top co-authors

Aurelien Lucchi (20) Gregor Bachmann (12) Lorenzo Noci (11) Sotiris Anagnostidis (8) Hadi Daneshmand (7) Sidak Pal Singh (6) Antonio Orvieto (6) Bobby He (5) Jonas Köhler (5) Dario Pavllo (5)

Keywords

stochastic gradient descent (5) neural network optimization (4) language model (4) hessian matrix (4) loss landscape (3) text generation (3) generative adversarial network (3) diffusion model (3) neural network (3) bayesian neural network (3) variance reduction (2) latent state space (2) gradient descent (2) adaptive sampling (2) communication efficiency (2) text-to-image generation (2) neural tangent kernel (2) model architecture (2) distributed optimization (2) batch normalization (2)

Papers

Causal Estimation of Tokenisation Bias · ACL 2025
LIME: Localized Image Editing via Attention Regularization in Diffusion Models · WACV 2025
The Importance of Being Lazy: Scaling Limits of Continual Learning · ICML 2025
Scalable Non-Equivariant 3D Molecule Generation via Rotational Alignment · ICML 2025
Emergence of Globally Attracting Fixed Points in Deep Neural Networks With Nonlinear Activations · AISTATS 2025
UIP2P: Unsupervised Instruction-based Image Editing via Edit Reversibility Constraint · ICCV 2025
Generalized Interpolating Discrete Diffusion · ICML 2025
The Directionality of Optimization Trajectories in Neural Networks · ICLR 2025
On the Expressiveness and Length Generalization of Selective State Space Models on Regular Languages · AAAI 2025
LoRACLR: Contrastive Adaptation for Customization of Diffusion Models · CVPR 2025
Transformer Fusion with Optimal Transport · ICLR 2024
On the Effect of (Near) Duplicate Subwords in Language Modelling · ACL 2024
Causal Estimation of Memorisation Profiles · ACL 2024
Recurrent Distance Filtering for Graph Representation Learning · ICML 2024
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training · ICML 2024
How Good is a Single Basin? · AISTATS 2024
Local and Global Decoding in Text Generation · EMNLP 2024
Simplifying Transformer Blocks · ICLR 2024
Towards Meta-Pruning via Optimal Transport · ICLR 2024
A Language Model’s Guide Through Latent Space · ICML 2024
Understanding and Minimising Outlier Features in Transformer Training · NIPS 2024
Super Consistency of Neural Network Landscapes and Learning Rate Transfer · NIPS 2024
Scaling MLPs: A Tale of Inductive Bias · NIPS 2023
The Hessian perspective into the Nature of Convolutional Neural Networks · ICML 2023
Random Teachers are Good Teachers · ICML 2023
The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit · NIPS 2023
Achieving a Better Stability-Plasticity Trade-Off via Auxiliary Networks in Continual Learning · CVPR 2023
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers · NIPS 2023
The Curious Case of Benign Memorization · ICLR 2023
FIGARO: Controllable Music Generation using Learned and Expert Features · ICLR 2023
Mastering Spatial Graph Prediction of Road Networks · ICCV 2023
Vanishing Curvature in Randomly Initialized Deep ReLU Networks · AISTATS 2022
OpenFilter: A Framework to Democratize Research Access to Social Media AR Filters · NIPS 2022
Decoding a Neural Retriever’s Latent Space for Query Suggestion · EMNLP 2022
Phenomenology of Double Descent in Finite-Width Neural Networks · ICLR 2022
Generalization Through the Lens of Leave-One-Out Error · ICLR 2022
How Tempering Fixes Data Augmentation in Bayesian Neural Networks · ICML 2022
Disentangling the Roles of Curation, Data-Augmentation and the Prior in the Cold Posterior Effect · NIPS 2021
Learning Generative Models of Textured 3D Meshes From Real-World Images · ICCV 2021
Precise characterization of the prior predictive distribution of deep ReLU networks · NIPS 2021
Analytic Insights into Structure and Rank of Neural Network Hessian Maps · NIPS 2021
Uniform Convergence, Adversarial Spheres and a Simple Remedy · ICML 2021
Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization · AISTATS 2021
Convolutional Generation of Textured 3D Meshes · NIPS 2020
Adversarial Training is a Form of Data-dependent Operator Norm Regularization · NIPS 2020
Batch normalization provably avoids ranks collapse for randomly initialised deep networks · NIPS 2020
Controlling Style and Semantics in Weakly-Supervised Image Generation · ECCV 2020
LeDeepChef Deep Reinforcement Learning Agent for Families of Text-Based Games · AAAI 2020
Local Saddle Point Optimization: A Curvature Exploitation Approach · AISTATS 2019
Autoregressive Text Generation Beyond Feedback Loops · EMNLP 2019
A Domain Agnostic Measure for Monitoring and Evaluating GANs · NIPS 2019
The Odds are Odd: A Statistical Test for Detecting Adversarial Examples · ICML 2019
Autoregressive Text Generation Beyond Feedback Loops · IJCNLP 2019
Exponential convergence rates for Batch Normalization: The power of length-direction decoupling in non-convex optimization · AISTATS 2019
Learning and Evaluating Sparse Interpretable Sentence Embeddings · EMNLP 2018
End-to-End Neural Entity Linking · CONLL 2018
An Online Learning Approach to Generative Adversarial Networks · ICLR 2018
Semantic Interpolation in Implicit Models · ICLR 2018
Escaping Saddles with Stochastic Gradients · ICML 2018
A Distributed Second-Order Algorithm You Can Trust · ICML 2018
Hyperbolic Entailment Cones for Learning Hierarchical Embeddings · ICML 2018
Hyperbolic Neural Networks · NIPS 2018
Deep State Space Models for Unconditional Word Generation · NIPS 2018
Stabilizing Training of Generative Adversarial Networks through Regularization · NIPS 2017
Deep Joint Entity Disambiguation with Local Neural Attention · EMNLP 2017
Starting Small - Learning with Adaptive Sample Sizes · ICML 2016
Adaptive Newton Method for Empirical Risk Minimization to Statistical Accuracy · NIPS 2016
Variance Reduced Stochastic Gradient Descent with Neighbors · NIPS 2015
Communication-Efficient Distributed Dual Coordinate Ascent · NIPS 2014
Large Margin Methods for Structured and Interdependent Output Variables · JMLR 2005
Introduction to the Special Issue on Machine Learning Methods for Text and Images · JMLR 2003
Investigating Loss Functions and Optimization Methods for Discriminative Learning of Label Sequences · EMNLP 2003