Miao Lu
13 papers · 2021–2026 · 6 top CS/AI conferences
Achievements
Renaissance Researcher (7)
Cross-Pollinator (13)
Interdisciplinary Bridge
Conference Polyglot (5)
Taxonomy Completionist (26)
Grand Slam
Unstoppable (5)
Century Club (12)
The Questioner
Conferences
NIPS (5)
ICLR (4)
AAAI (1)
ACL (1)
CVPR (1)
ICML (1)
Keywords
sublinear regret (2)
reinforcement learning (2)
policy gradient (1)
offline reinforcement learning (1)
language model alignment (1)
object detection (1)
knowledge transfer (1)
robot control (1)
sample complexity (1)
human-object interaction (1)
direct preference optimization (1)
reinforcement learning from human feedback (1)
markov decision process (1)
model-based reinforcement learning (1)
sample efficiency (1)
distributionally robust (1)
distributional robustness (1)
interaction detection (1)
regret bound (1)
multi-agent reinforcement learning (1)
Papers
Beyond the Context Window: Scaling Agentic RL via End-to-end Optimized Context Compression
ACL 2026
Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model
ICLR 2025
Benign Oscillation of Stochastic Gradient Descent with Large Learning Rate
ICLR 2024
Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithms
NIPS 2024
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer
NIPS 2024
Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage
NIPS 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
NIPS 2023
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes
ICLR 2023
GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection
CVPR 2022
Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization
AAAI 2022
Learning Pruning-Friendly Networks via Frank-Wolfe: One-Shot, Any-Sparsity, And No Retraining
ICLR 2022
Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy
ICML 2022
Mining the Benefits of Two-stage and One-stage HOI Detection
NIPS 2021