conftrace_

Kaifeng Lyu

22 papers · 2019–2025 · 3 conferences · across top CS/AI conferences

Achievements

🐝 Cross-Pollinator (12) 🏃 Academic Marathon (6) 🧭 Keyword Pioneer 🌍 Conference Polyglot (3) 🌉 Interdisciplinary Bridge
🗺️ Taxonomy Completionist (18) 🧭 Keyword Pioneer 🤝 Dynamic Duo (10) ❓ The Questioner ⚡ Prolific Year (7) 💎 Century Club (22) 🔥 Unstoppable (7)

Conferences

ICLR (14) NIPS (6) ICML (2)

Papers

RNNs are not Transformers (Yet): The Key Bottleneck on In-Context Retrieval · ICLR 2025
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules · ICLR 2025
Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias · ICLR 2025
Feature Averaging: An Implicit Bias of Gradient Descent Leading to Non-Robustness in Neural Networks · ICLR 2025
Safety Alignment Should be Made More Than Just a Few Tokens Deep · ICLR 2025
Weak-to-Strong Generalization Even in Random Feature Networks, Provably · ICML 2025
Efficient stagewise pretraining via progressive subnetworks · ICLR 2025
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking · ICLR 2024
Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates · NIPS 2024
A Quadratic Synchronization Rule for Distributed Deep Learning · ICLR 2024
The Marginal Value of Momentum for Small Learning Rate SGD · ICLR 2024
DistillSpec: Improving Speculative Decoding via Knowledge Distillation · ICLR 2024
Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing · ICML 2023
Why (and When) does Local SGD Generalize Better than SGD? · ICLR 2023
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction · NIPS 2022
New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound · NIPS 2022
On the SDEs and Scaling Rules for Adaptive Gradient Algorithms · NIPS 2022
Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning · ICLR 2021
Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias · NIPS 2021
Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate · NIPS 2020
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks · ICLR 2020
Theoretical Analysis of Auto Rate-Tuning by Batch Normalization · ICLR 2019