Jingzhao Zhang
33 papers · 2018–2025 · 8 conferences across top CS/AI venues
Achievements
Keyword Pioneer · Conference Polyglot (8) · Taxonomy Completionist (10) · Interdisciplinary Bridge · Academic Marathon (7)
Taxonomy Completionist (10)
Keyword Pioneer
Cross-Pollinator (7)
Dynamic Duo (10)
Triple Crown
Unstoppable (6)
Century Club (33)
Prolific Year (6)
Keyword Collector (98)
The Questioner (3)
Conferences
ICML (10)
NIPS (10)
ICLR (6)
COLT (3)
AISTATS (1)
EMNLP (1)
JMLR (1)
UAI (1)
Keywords
bilevel optimization (3)
stochastic gradient descent (3)
stationary point (3)
markov chain monte carlo (2)
stochastic gradient (2)
loss function (2)
sampling algorithm (2)
first-order method (2)
convex optimization (2)
reinforcement learning (2)
nonconvex optimization (2)
convergence analysis (2)
nesterov acceleration (2)
neural network optimization (2)
first-order oracle (2)
gradient descent (2)
task generalization (1)
text generation (1)
meta-learning (1)
function approximation (1)
Papers
From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
ICLR 2025
Understanding Nonlinear Implicit Bias via Region Counts in Input Space
ICML 2025
Task Generalization with Autoregressive Compositional Structure: Can Learning from $D$ Tasks Generalize to $D^T$ Tasks?
ICML 2025
Towards Black-Box Membership Inference Attack for Diffusion Models
ICML 2025
Generalization Lower Bounds for GD and SGD in Smooth Stochastic Convex Optimization
AISTATS 2025
Solving Convex-Concave Problems with $\mathcal{O}(\epsilon^{-4/7})$ Second-Order Oracle Complexity
COLT 2025
Fast and Multiphase Rates for Nearest Neighbor Classifiers
COLT 2025
Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles
JMLR 2025
Scalable Model Merging with Progressive Layer-wise Distillation
ICML 2025
Second-Order Min-Max Optimization with Lazy Hessians
ICLR 2025
Online Policy Optimization for Robust Markov Decision Process
UAI 2024
On Finding Small Hyper-Gradients in Bilevel Optimization: Hardness Results and Improved Analysis
COLT 2024
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
ICML 2024
A Quadratic Synchronization Rule for Distributed Deep Learning
ICLR 2024
Online Control with Adversarial Disturbance for Continuous-time Linear Systems
NIPS 2024
Functionally Constrained Algorithm Solves Convex Simple Bilevel Problem
NIPS 2024
Fast Conditional Mixing of MCMC Algorithms for Non-log-concave Distributions
NIPS 2023
Benign Overfitting in Classification: Provably Counter Label Noise with Larger Models
ICLR 2023
Iteratively Learn Diverse Strategies with State Distance Information
NIPS 2023
On the Overlooked Pitfalls of Weight Decay and How to Mitigate Them: A Gradient-Norm Perspective
NIPS 2023
Understanding the unstable convergence of gradient descent
ICML 2022
Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity
ICML 2022
Efficient Sampling on Riemannian Manifolds via Langevin MCMC
NIPS 2022
Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective
ICML 2022
Coping with Label Shift via Distributionally Robust Optimisation
ICLR 2021
Complexity Lower Bounds for Nonconvex-Strongly-Concave Min-Max Optimization
NIPS 2021
Fast Federated Learning in the Presence of Arbitrary Device Unavailability
NIPS 2021
Exposure Bias versus Self-Recovery: Are Distortions Really Incremental for Autoregressive Text Generation?
EMNLP 2021
Provably Efficient Algorithms for Multi-Objective Competitive RL
ICML 2021
Complexity of Finding Stationary Points of Nonconvex Nonsmooth Functions
ICML 2020
Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity
ICLR 2020
Why are Adaptive Methods Good for Attention Models?
NIPS 2020
Direct Runge-Kutta Discretization Achieves Acceleration
NIPS 2018