Nicolas Flammarion

53 papers · 2015–2025 · 9 conferences · across top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (10) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🐣 Hot Topic Early Bird

🌍 Conference Polyglot (9) 🔬 Deep Specialist (19) 👑 Triple Crown 🏆 Keyword Champion (5) 🤝 Dynamic Duo (15) 🏆 Grand Slam 🗃️ Keyword Collector (196) ❓ The Questioner (3) ⚡ Prolific Year (10) 💎 Century Club (53) 🔥 Unstoppable (9) 📈 Trend Setter

Conferences

NIPS (19) COLT (10) ICML (9) ICLR (6) JMLR (4) AISTATS (2) AAAI (1) ECCV (1) UAI (1)

Top co-authors

Maksym Andriushchenko (15) Francesco Croce (8) Scott Pesme (7) Etienne Boursier (6) Loucas Pillaud-Vivien (5) Francis Bach (5) Aditya Varre (4) Oğuz Kaan Yüksel (3) Matthias Hein (3) Mathieu Even (3)

Keywords

stochastic gradient descent (14) implicit bias (6) neural network (6) implicit regularization (5) gradient flow (5) diagonal linear network (4) gradient descent (4) convergence rate (3) convex optimization (3) least squares regression (3) relu network (2) adversarial robustness (2) saddle point (2) linear network (2) adversarial attack (2) riemannian manifold (2) sharpness-aware minimization (2) neural network optimization (2) sparse representation (2) stochastic optimization (2)

Papers

On the Sample Complexity of Next-Token Prediction AISTATS 2025
Learning In-context $n$-grams with Transformers: Sub-$n$-grams Are Near-Stationary Points ICML 2025
Learning Parametric Distributions from Samples and Preferences ICML 2025
Learning Algorithms in the Limit COLT 2025
Selective Induction Heads: How Transformers Select Causal Structures in Context ICLR 2025
Simplicity Bias and Optimization Threshold in Two-Layer ReLU Networks ICML 2025
Is In-Context Learning Sufficient for Instruction Following in LLMs? ICLR 2025
Long-Context Linear System Identification ICLR 2025
Early Alignment in Two-Layer Networks Training is a Two-Edged Sword JMLR 2025
Does Refusal Training in LLMs Generalize to the Past Tense? ICLR 2025
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks ICLR 2025
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning ICML 2024
First-order ANIL provably learns representations despite overparametrisation ICLR 2024
Why Do We Need Weight Decay in Modern Deep Learning? NIPS 2024
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models NIPS 2024
SGD vs GD: Rank Deficiency in Linear Networks NIPS 2024
Implicit Bias of Mirror Flow on Separable Data NIPS 2024
Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks AISTATS 2024
Linearization Algorithms for Fully Composite Optimization COLT 2023
Sharpness-Aware Minimization Leads to Low-Rank Features NIPS 2023
Quantum Channel Certification with Incoherent Measurements COLT 2023
(S)GD over Diagonal Linear Networks: Implicit bias, Large Stepsizes and Edge of Stability NIPS 2023
Transferable Adversarial Robustness for Categorical Data via Universal Robust Embeddings NIPS 2023
Penalising the biases in norm regularisation enforces sparsity NIPS 2023
On the spectral bias of two-layer linear networks NIPS 2023
Saddle-to-Saddle Dynamics in Diagonal Linear Networks NIPS 2023
A Modern Look at the Relationship between Sharpness and Generalization ICML 2023
SGD with Large Step Sizes Learns Sparse Features ICML 2023
On the effectiveness of adversarial training against common corruptions UAI 2022
Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs NIPS 2022
Sparse-RS: A Versatile Framework for Query-Efficient Sparse Black-Box Adversarial Attacks AAAI 2022
Trace norm regularization for multi-task learning with scarce data COLT 2022
Accelerated SGD for Non-Strongly-Convex Least Squares COLT 2022
Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation COLT 2022
Towards Understanding Sharpness-Aware Minimization ICML 2022
An Efficient Sampling Algorithm for Non-smooth Composite Potentials JMLR 2022
Last iterate convergence of SGD for Least-Squares in the Interpolation regime. NIPS 2021
Sequential Algorithms for Testing Closeness of Distributions NIPS 2021
Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity NIPS 2021
Continuized Accelerations of Deterministic and Stochastic Gradient Descents, and of Gossip Algorithms NIPS 2021
On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent ICML 2020
Square Attack: a query-efficient black-box adversarial attack via random search ECCV 2020
Online Robust Regression via SGD on the l1 loss NIPS 2020
Understanding and Improving Fast Adversarial Training NIPS 2020
Fast Mean Estimation with Sub-Gaussian Rates COLT 2019
Escaping from saddle points on Riemannian manifolds NIPS 2019
On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo ICML 2018
Gen-Oja: Simple & Efficient Algorithm for Streaming Generalized Eigenvector Computation NIPS 2018
Averaging Stochastic Gradient Descent on Riemannian Manifolds COLT 2018
Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression JMLR 2017
Robust Discriminative Clustering with Sparse Regularizers JMLR 2017
Stochastic Composite Least-Squares Regression with Convergence Rate $O(1/n)$ COLT 2017
From Averaging to Acceleration, There is Only a Step-size COLT 2015