Mengdi Wang
106 papers · 2016–2026 · 17 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (24)
Keyword Pioneer
Interdisciplinary Bridge
Renaissance Researcher (6)
Hot Topic Early Bird
Conference Loyalist (27)
Keyword Champion (2)
Triple Crown
Dynamic Duo (15)
Deep Specialist (22)
Grand Slam
Trend Setter
The Questioner (2)
Prolific Year (7)
Conference Pioneer
Keyword Collector (97)
Century Club (104)
Unstoppable (11)
Conferences
NIPS (27)
ICML (26)
ICLR (14)
AISTATS (8)
JMLR (7)
AAAI (4)
EMNLP (3)
IJCAI (3)
EACL (2)
CVPR (2)
ACL (2)
L4DC (2)
UAI (2)
ICCV (1)
COLT (1)
NAACL (1)
WACV (1)
Keywords
reinforcement learning (14)
sample complexity (13)
regret bound (9)
markov decision process (8)
stochastic optimization (7)
policy gradient (6)
markov chain (6)
representation learning (5)
large language model (5)
off-policy evaluation (5)
kernel methods (4)
inference-time alignment (4)
model-based reinforcement learning (4)
linear function approximation (3)
curse of dimensionality (3)
stochastic gradient (3)
off-policy learning (3)
bayesian regret (3)
dimensionality reduction (3)
online learning (3)
Papers
ChipSeek: Optimizing Verilog Generation via EDA-Integrated Reinforcement Learning
ACL 2026
Jailbreaks as Inference-Time Alignment: A Framework for Understanding Safety Failures in LLMs
EACL 2026
CycleSL: Server-Client Cyclical Update Driven Scalable Split Learning
WACV 2026
Deep Reinforcement Learning for Efficient and Fair Allocation of Healthcare Resources
IJCAI 2025
WenyanGPT: A Large Language Model for Classical Chinese Tasks
IJCAI 2025
Shallow Preference Signals: Large Language Model Aligns Even Better with Truncated Data?
ACL 2025
Emergent Symbolic Mechanisms Support Abstract Reasoning in Large Language Models
ICML 2025
A First-order Generative Bilevel Optimization Framework for Diffusion Models
ICML 2025
BaWA: Automatic Optimizing Pruning Metric for Large Language Models with Balanced Weight and Activation
ICML 2025
Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow
ICLR 2025
Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias
ICLR 2025
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
ICLR 2025
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
EMNLP 2025
Temporal Consistency for LLM Reasoning Process Error Identification
EMNLP 2025
Diffusion Transformer Captures Spatial-Temporal Dependencies: A Theory for Gaussian Process Data
ICLR 2025
Collab: Controlled Decoding using Mixture of Agents for LLM Alignment
ICLR 2025
Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment
CVPR 2025
MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations
ICML 2025
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
ICLR 2025
Preacher: Paper-to-Video Agentic System
ICCV 2025
TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling
EMNLP 2025
MaxMin-RLHF: Alignment with Diverse Human Preferences
ICML 2024
Policy Evaluation for Reinforcement Learning from Human Feedback: A Sample Complexity Analysis
AISTATS 2024
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
JMLR 2024
Sample-Efficient Learning of POMDPs with Multiple Observations In Hindsight
ICLR 2024
PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback
ICLR 2024
Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning? A Theoretical Perspective
ICML 2024
Theoretical insights for diffusion guidance: A case study for Gaussian mixture models
ICML 2024
Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds
JMLR 2024
Global Convergence in Training Large-Scale Transformers
NIPS 2024
Fast Best-of-N Decoding via Speculative Rejection
NIPS 2024
FlexSBDD: Structure-Based Drug Design with Flexible Protein Modeling
NIPS 2024
Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks
NIPS 2024
Offline Multitask Representation Learning for Reinforcement Learning
NIPS 2024
One-Layer Transformer Provably Learns One-Nearest Neighbor In Context
NIPS 2024
Gradient Guidance for Diffusion Models: An Optimization Perspective
NIPS 2024
Transfer Q-star: Principled Decoding for LLM Alignment
NIPS 2024
A Theoretical Perspective for Speculative Decoding Algorithm
NIPS 2024
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
ICML 2024
Tree Search-Based Evolutionary Bandits for Protein Sequence Optimization
AAAI 2024
TurboSVM-FL: Boosting Federated Learning through SVM Aggregation for Lazy Clients
AAAI 2024
Visual Adversarial Examples Jailbreak Aligned Large Language Models
AAAI 2024
Information-Directed Pessimism for Offline Reinforcement Learning
ICML 2024
Theory of Consistency Diffusion Models: Distribution Estimation Meets Fast Sampling
ICML 2024
Provable Benefits of Representational Transfer in Reinforcement Learning
COLT 2023
Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation
NIPS 2023
Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective
NIPS 2023
Efficient RL with Impaired Observability: Learning to Act with Delayed and Missing State Observations
NIPS 2023
Reward-Directed Conditional Diffusion: Provable Distribution Estimation and Reward Improvement
NIPS 2023
Byzantine-Robust Online and Offline Distributed Reinforcement Learning
AISTATS 2023
Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient
ICLR 2023
Learning Kernelized Contextual Bandits in a Distributed and Asynchronous Environment
ICLR 2023
Deep Reinforcement Learning for Cost-Effective Medical Diagnosis
ICLR 2023
Sample Complexity of Nonparametric Off-Policy Evaluation on Low-Dimensional Manifolds using Deep Networks
ICLR 2023
Representation Learning for Low-rank General-sum Markov Games
ICLR 2023
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning
ICML 2023
Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data
ICML 2023
Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP
ICML 2023
Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories
ICML 2023
Learning Good State and Action Representations for Markov Decision Process via Tensor Decomposition
JMLR 2023
Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning
JMLR 2023
Optimal Estimation of Policy Gradient via Double Fitted Iteration
ICML 2022
Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning approach
ICML 2022
Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory
ICML 2022
Decentralized Gossip-Based Stochastic Bilevel Optimization over Communication Networks
NIPS 2022
Communication Efficient Distributed Learning for Kernelized Contextual Bandits
NIPS 2022
Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization
NIPS 2022
Multi-Agent Reinforcement Learning with General Utilities via Decentralized Shadow Reward Actor-Critic
AAAI 2022
Offline stochastic shortest path: Learning, evaluation and towards optimality
UAI 2022
Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism
ICLR 2022
Parameter-Efficient Sparsity for Large Language Models Fine-Tuning
IJCAI 2022
Online Sparse Reinforcement Learning
AISTATS 2021
Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient
ICML 2021
Bootstrapping Fitted Q-Evaluation for Off-Policy Inference
ICML 2021
On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method
NIPS 2021
Contrastive Multi-document Question Generation
EACL 2021
Towards Compact CNNs via Collaborative Compression
CVPR 2021
Generalization Bounds for Stochastic Saddle Point Problems
AISTATS 2021
High-Dimensional Sparse Linear Bandits
NIPS 2020
Variational Policy Gradient Method for Reinforcement Learning with General Utilities
NIPS 2020
A Duality Approach for Regret Minimization in Average-Award Ergodic Markov Decision Processes
L4DC 2020
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
ICML 2020
Model-Based Reinforcement Learning with Value-Targeted Regression
ICML 2020
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
ICML 2020
Model-Based Reinforcement Learning with Value-Targeted Regression
L4DC 2020
Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity
AISTATS 2020
Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations
NIPS 2020
Generalized Leverage Score Sampling for Neural Networks
NIPS 2020
Sketching Transformed Matrices with Applications to Natural Language Processing
AISTATS 2020
Approximation Hardness for A Class of Sparse Optimization Problems
JMLR 2019
Sample-Optimal Parametric Q-Learning Using Linearly Additive Features
ICML 2019
Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python
JMLR 2019
Learning low-dimensional state embeddings and metastable clusters from time series data
NIPS 2019
State Aggregation Learning from Markov Transition Data
NIPS 2019
Towards Coherent and Cohesive Long-form Text Generation
NAACL 2019
Online Factorization and Partition of Complex Networks by Random Walk
UAI 2019
Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization
NIPS 2018
Scalable Bilinear Pi Learning Using State and Action Features
ICML 2018
Minimax-Optimal Privacy-Preserving Sparse PCA in Distributed Systems
AISTATS 2018
Near-Optimal Time and Sample Complexities for Solving Markov Decision Processes with a Generative Model
NIPS 2018
Estimation of Markov Chain via Rank-Constrained Likelihood
ICML 2018
Diffusion Approximations for Online Principal Component Estimation and Global Convergence
NIPS 2017
Finite-sum Composition Optimization via Variance Reduced Gradient Descent
AISTATS 2017
Strong NP-Hardness for Sparse Optimization with Concave Penalty Functions
ICML 2017
Accelerating Stochastic Composition Optimization
JMLR 2017
Accelerating Stochastic Composition Optimization
NIPS 2016