Ruoyu Sun
40 papers · 2015–2026 · 11 top CS/AI conferences
Achievements
🐝 Cross-Pollinator (6)
🌍 Conference Polyglot (11)
🧭 Keyword Pioneer
🏃 Academic Marathon (10)
🌈 Renaissance Researcher (7)
🐣 Hot Topic Early Bird
👥 Mega-Team (22)
👑 Triple Crown
🏆 Grand Slam
🔬 Deep Specialist (12)
🏆 Keyword Champion (3)
🔥 Unstoppable (8)
💎 Century Club (38)
⚡ Prolific Year (12)
❓ The Questioner
🗃️ Keyword Collector (124)
Conferences
NIPS (15)
ICLR (9)
ICML (4)
ACL (2)
CVPR (2)
EACL (2)
EMNLP (2)
AAAI (1)
COLT (1)
NAACL (1)
NSDI (1)
Keywords
neural network optimization (5)
gradient descent (5)
generative adversarial network (4)
neural network (4)
convergence analysis (3)
local minimum (3)
large language model (3)
generalization bound (3)
stochastic gradient descent (3)
binary classification (2)
robust generalization (2)
adversarial learning (2)
supervised fine-tuning (2)
iteration complexity (2)
adversarial training (2)
loss landscape (2)
adam optimizer (2)
min-max optimization (2)
knowledge distillation (1)
optimal transport (1)
Papers
Rethinking Data Mixture for Large Language Models: A Comprehensive Survey and New Perspectives
EACL 2026
VCORE: Variance-Controlled Optimization-based Reweighting for Chain-of-Thought Supervision
ACL 2026
Adam-mini: Use Fewer Learning Rates To Gain More
ICLR 2025
Towards Explaining the Power of Constant-depth Graph Neural Networks for Structured Linear Programming
ICLR 2025
When GNNs meet symmetry in ILPs: an orbit-based feature augmentation approach
ICLR 2025
Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion
ACL 2025
A Middle Path for On-Premises LLM Deployment: Preserving Privacy Without Sacrificing Model Confidentiality
EMNLP 2025
Preserving Diversity in Supervised Fine-Tuning of Large Language Models
ICLR 2025
FinBPM: A Framework for Portfolio Management-based Financial Investor Behavior Perception Model
EACL 2024
SymILO: A Symmetry-Aware Learning Framework for Integer Linear Optimization
NIPS 2024
On the Power of Small-size Graph Neural Networks for Linear Programming
NIPS 2024
Why Transformers Need Adam: A Hessian Perspective
NIPS 2024
Bridging the Gap: Rademacher Complexity in Robust and Standard Generalization
COLT 2024
Unlocking Black-Box Prompt Tuning Efficiency via Zeroth-Order Optimization
EMNLP 2024
LEMON: Lossless model expansion
ICLR 2024
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
ICML 2024
PDHG-Unrolled Learning-to-Optimize Method for Large-Scale Linear Programming
ICML 2024
How Graph Neural Networks Learn: Lessons from Training Dynamics
ICML 2024
AceGPT, Localizing Large Language Models in Arabic
NAACL 2024
Empower Programmable Pipeline for Advanced Stateful Packet Processing
NSDI 2024
NTK-SAP: Improving neural network pruning by aligning training dynamics
ICLR 2023
A GNN-Guided Predict-and-Search Framework for Mixed-Integer Linear Programming
ICLR 2023
PAC-Bayesian Spectrally-Normalized Bounds for Adversarially Robust Generalization
NIPS 2023
Balanced Training for Sparse GANs
NIPS 2023
Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning
CVPR 2022
Does Momentum Change the Implicit Regularization on Separable Data?
NIPS 2022
Stability Analysis and Generalization Bounds of Adversarial Training
NIPS 2022
DigGAN: Discriminator gradIent Gap Regularization for GAN Training with Limited Data
NIPS 2022
Adam Can Converge Without Any Modification On Update Rules
NIPS 2022
PenDer: Incorporating Shape Constraints via Penalized Derivatives
AAAI 2021
RMSprop converges with proper hyper-parameter
ICLR 2021
When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
NIPS 2021
Faster Directional Convergence of Linear Neural Networks under Spherically Symmetric Data
NIPS 2021
A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems
NIPS 2020
Towards a Better Global Loss Landscape of GANs
NIPS 2020
Max-Sliced Wasserstein Distance and Its Use for GANs
CVPR 2019
On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization
ICLR 2019
Adding One Neuron Can Eliminate All Bad Local Minima
NIPS 2018
Understanding the Loss Surface of Neural Networks for Binary Classification
ICML 2018
Improved Iteration Complexity Bounds of Cyclic Block Coordinate Descent for Convex Problems
NIPS 2015