Zhao Song

107 papers · 2016–2026 · 12 top CS/AI conferences

Achievements


🗺️ Taxonomy Completionist (25) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🐣 Hot Topic Early Bird 🏠 Conference Loyalist (29) 🤝 Dynamic Duo (15) 👑 Triple Crown 🏆 Grand Slam 🔬 Deep Specialist (19) 🏆 Keyword Champion (6) ⚡ Prolific Year (13) 🔥 Unstoppable (11) ❓ The Questioner (5) 💎 Century Club (107) 📈 Trend Setter 🗃️ Keyword Collector (81)

Conferences

ICML (30) NIPS (29) AISTATS (15) ICLR (12) AAAI (6) UAI (5) EMNLP (4) WACV (2) COLT (1) ICCV (1) IJCAI (1) JMLR (1)

Top co-authors

Yingyu Liang (15) Zhenmei Shi (14) Lichen Zhang (13) Junze Yin (9) David Woodruff (9) Xiaoyu Li (7) Christopher Re (6) Beidi Chen (5) Zhizhou Sha (5) Bo Chen (5)

Research topics

Optimization & Theory (2)

Keywords

low-rank approximation (8) neural network (7) attention mechanism (6) neural tangent kernel (6) federated learning (6) large language model (5) low rank approximation (5) matrix factorization (5) gradient descent (4) relu network (4) convergence analysis (4) computational complexity (4) convergence guarantee (4) differential privacy (3) kernel regression (3) sample complexity (3) matrix approximation (3) model compression (3) empirical risk minimization (3) convex optimization (3)

Papers

T2VWorldBench: A Benchmark for Evaluating World Knowledge in Text-to-Video Generation · WACV 2026
Discrepancy Minimization in Input-Sparsity Time · ICML 2025
LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers · AAAI 2025
Numerical Pruning for Efficient Autoregressive Models · AAAI 2025
Fourier Circuits in Neural Networks and Transformers: A Case Study of Modular Arithmetic with Multiple Inputs · AISTATS 2025
An Iterative Algorithm for Rescaled Hyperbolic Functions Regression · AISTATS 2025
Looped ReLU MLPs May Be All You Need as Practical Programmable Computers · AISTATS 2025
When Can We Solve the Weighted Low Rank Approximation Problem in Truly Subquadratic Time? · AISTATS 2025
Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent · AISTATS 2025
Circuit Complexity Bounds for RoPE-based Transformer Architecture · EMNLP 2025
Towards Infinite-Long Prefix in Transformer · EMNLP 2025
Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers · EMNLP 2025
Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective · ICCV 2025
Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix · ICLR 2025
Faster Algorithms for Structured Linear and Kernel Support Vector Machines · ICLR 2025
Efficient Alternating Minimization with Applications to Weighted Low Rank Approximation · ICLR 2025
Computational Limits of Low-Rank Adaptation (LoRA) Fine-Tuning for Transformer Models · ICLR 2025
Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency · ICLR 2025
Dissecting Submission Limit in Desk-Rejections: A Mathematical Analysis of Fairness in AI Conference Policies · ICML 2025
Fundamental Limits of Visual Autoregressive Transformers: Universal Approximation Abilities · ICML 2025
On Differential Privacy for Adaptively Solving Search Problems via Sketching · ICML 2025
Binary Hypothesis Testing for Softmax Models and Leverage Score Models · ICML 2025
Deterministic Sparse Fourier Transform for Continuous Signals with Frequency Gap · ICML 2025
In-Context Deep Learning via Transformer Models · ICML 2025
NRFlow: Towards Noise-Robust Generative Modeling via High-Order Mechanism · UAI 2025
A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time · UAI 2025
Dynamic Maintenance of Kernel Density Estimation Data Structure: From Practice to Theory · UAI 2025
Differential Privacy Mechanisms in Neural Tangent Kernel Regression · WACV 2025
A General Algorithm for Solving Rank-one Matrix Sensing · AISTATS 2024
A Sublinear Adversarial Training Algorithm · ICLR 2024
On Socially Fair Low-Rank Approximation and Column Subset Selection · NIPS 2024
The Closeness of In-Context Learning and Weight Shifting for Softmax Regression · NIPS 2024
The Fine-Grained Complexity of Gradient Computation for Training Large Language Models · NIPS 2024
Metric Transforms and Low Rank Representations of Kernels for Fast Attention · NIPS 2024
Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models · ICML 2024
On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis · ICML 2024
On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs) · NIPS 2024
On Convergence of Federated Averaging Langevin Dynamics · UAI 2024
How to Capture Higher-order Correlations? Generalizing Matrix Softmax Attention to Kronecker Computation · ICLR 2024
Low Rank Matrix Completion via Robust Alternating Minimization in Nearly Linear Time · ICLR 2024
Log-concave Sampling from a Convex Body with a Barrier: a Robust and Unified Dikin Walk · NIPS 2024
How to Protect Copyright Data in Optimization of Large Language Models? · AAAI 2024
Solving Attention Kernel Regression Problem via Pre-conditioner · AISTATS 2024
Fast Dynamic Sampling for Determinantal Point Processes · AISTATS 2024
Federated Adversarial Learning: A Framework with Convergence Analysis · ICML 2023
Emergence of Punishment in Social Dilemma with Environmental Feedback · AAAI 2023
Smoothed Online Combinatorial Optimization Using Imperfect Predictions · AAAI 2023
An Online and Unified Algorithm for Projection Matrix Vector Multiplication with Application to Empirical Risk Minimization · AISTATS 2023
A Tale of Two Efficient Value Iteration Algorithms for Solving Linear MDPs with Large Action Space · AISTATS 2023
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time · ICML 2023
Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability · ICML 2023
Sketching Meets Differential Privacy: Fast Algorithm for Dynamic Kronecker Projection Maintenance · ICML 2023
A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee · ICML 2023
Exact Representation of Sparse Networks with Symmetric Nonnegative Embeddings · NIPS 2023
H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models · NIPS 2023
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing · NIPS 2023
InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural Language Understanding · NIPS 2023
Fast Attention Requires Bounded Entries · NIPS 2023
FITNESS: (Fine Tune on New and Similar Samples) to detect anomalies in streams with drift and outliers · ICML 2022
One-Pass Algorithms for MAP Inference of Nonsymmetric Determinantal Point Processes · ICML 2022
Fast Distance Oracles for Any Symmetric Norm · NIPS 2022
Bounding the Width of Neural Networks via Coupled Initialization A Worst Case Analysis · ICML 2022
Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models · ICLR 2022
Dynamic Tensor Product Regression · NIPS 2022
Perfectly Balanced: Improving Transfer and Robustness of Supervised Contrastive Learning · ICML 2022
Fast Graph Neural Tangent Kernel via Kronecker Sketching · AAAI 2022
MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training · ICLR 2021
When is particle filtering efficient for planning in partially observed linear dynamical systems? · UAI 2021
Scatterbrain: Unifying Sparse and Low-rank Attention · NIPS 2021
FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Analysis · ICML 2021
Fast Sketching of Polynomial Kernels of Polynomial Degree · ICML 2021
Oblivious Sketching-based Central Path Method for Linear Programming · ICML 2021
On InstaHide, Phase Retrieval, and Sparse Matrix Factorization · ICLR 2021
Evaluating Gradient Inversion Attacks and Defenses in Federated Learning · NIPS 2021
Breaking the Linear Iteration Cost Barrier for Some Well-known Conditional Gradient Methods Using MaxIP Data-structures · NIPS 2021
Does Preprocessing Help Training Over-parameterized Neural Networks? · NIPS 2021
TextHide: Tackling Data Privacy in Language Understanding Tasks · EMNLP 2020
Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality · NIPS 2020
Generalized Leverage Score Sampling for Neural Networks · NIPS 2020
Sketching Transformed Matrices with Applications to Natural Language Processing · AISTATS 2020
InstaHide: Instance-hiding Schemes for Private Distributed Learning · ICML 2020
Meta-learning for Mixed Linear Regression · ICML 2020
Non-Autoregressive Neural Text-to-Speech · ICML 2020
WaveFlow: A Compact Flow-based Model for Raw Audio · ICML 2020
Towards a Zero-One Law for Column Subset Selection · NIPS 2019
Towards a Theoretical Understanding of Hashing-Based Neural Nets · AISTATS 2019
Non-Convex Matrix Completion and Related Problems via Strong Duality · JMLR 2019
A Convergence Theory for Deep Learning via Over-Parameterization · ICML 2019
Revisiting the Softmax Bellman Operator: New Benefits and New Perspective · ICML 2019
Optimal Sketching for Kronecker Product Regression and Low Rank Approximation · NIPS 2019
On the Convergence Rate of Training Recurrent Neural Networks · NIPS 2019
Total Least Squares Regression in Input Sparsity Time · NIPS 2019
Efficient Symmetric Norm Regression via Linear Sketching · NIPS 2019
Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss · NIPS 2019
Provable Non-linear Inductive Matrix Completion · NIPS 2019
Solving Empirical Risk Minimization in the Current Matrix Multiplication Time · COLT 2019
The Limitations of Adversarial Training and the Blind-Spot Attack · ICLR 2019
Stochastic Multi-armed Bandits in Constant Space · AISTATS 2018
Towards Fast Computation of Certified Robustness for ReLU Networks · ICML 2018
Sketching for Kronecker Product Regression and P-splines · AISTATS 2018
Learning Long Term Dependencies via Fourier Recurrent Units · ICML 2018
Scalable Model Selection for Belief Networks · NIPS 2017
Recovery Guarantees for One-hidden-layer Neural Networks · ICML 2017
Sublinear Time Orthogonal Tensor Decomposition · NIPS 2016
Maximum Sustainable Yield Problem for Robot Foraging and Construction System · IJCAI 2016
Learning Sigmoid Belief Networks via Monte Carlo Expectation Maximization · AISTATS 2016
Linear Feature Encoding for Reinforcement Learning · NIPS 2016