Chen-Yu Wei

43 papers · 2016–2025 · 6 conferences · across top CS/AI conferences

Achievements

🏃 Academic Marathon (9) 🐝 Cross-Pollinator (10) 🌍 Conference Polyglot (6) 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (26)

🧭 Keyword Pioneer 🤝 Dynamic Duo (24) 👑 Triple Crown 🔬 Deep Specialist (21) 🗃️ Keyword Collector (119) 💎 Century Club (43) ⚡ Prolific Year (7) 🔥 Unstoppable (10) ❓ The Questioner (2)

Conferences

NIPS (14) COLT (13) ICML (8) ALT (4) AISTATS (2) ICLR (2)

Top co-authors

Haipeng Luo (24) Julian Zimmert (9) Chung-Wei Lee (7) Haolin Liu (5) Mengxiao Zhang (4) Weiqiang Zheng (3) Christoph Dann (3) Yang Cai (3) Yi-Te Hong (2) Chi-Jen Lu (2)

Research topics

Reinforcement Learning (1)

Keywords

regret bound (23) online learning (10) multi-armed bandit (8) contextual bandit (8) stochastic optimization (5) linear bandit (5) dynamic regret (5) online algorithm (4) adversarial learning (4) bandit feedback (4) markov decision process (4) minimax regret (4) markov game (3) non-stationary environment (3) policy optimization (3) adversarial mdp (3) nash equilibrium (3) linear function approximation (3) online mirror descent (3) multi-agent system (3)

Papers

Decision Making in Hybrid Environments: A Model Aggregation Approach COLT 2025
Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games AISTATS 2024
Corruption-Robust Linear Bandits: Minimax Optimality and Gap-Dependent Misspecification NIPS 2024
How Does Variance Shape the Regret in Contextual Bandits? NIPS 2024
Offline Reinforcement Learning: Role of State Aggregation and Trajectory Data COLT 2024
On Tractable $\Phi$-Equilibria in Non-Concave Games NIPS 2024
Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback ICLR 2024
Beating Adversarial Low-Rank MDPs with Unknown Transition and Bandit Feedback NIPS 2024
Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback NIPS 2023
No-Regret Online Reinforcement Learning with Adversarial Losses and Transitions NIPS 2023
Bypassing the Simulator: Near-Optimal Adversarial Linear Contextual Bandits NIPS 2023
First- and Second-Order Bounds for Adversarial Linear Contextual Bandits NIPS 2023
Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs NIPS 2023
A Unified Algorithm for Stochastic Path Problems ALT 2023
A Blackbox Approach to Best of Both Worlds in Bandits and Beyond COLT 2023
Refined Regret for Adversarial MDPs with Linear Function Approximation ICML 2023
Best of Both Worlds Policy Optimization ICML 2023
Decentralized Cooperative Reinforcement Learning with Hierarchical Information Structure ALT 2022
A Model Selection Approach for Corruption Robust Reinforcement Learning ALT 2022
Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning ICML 2022
Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence ICML 2022
Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously ICML 2021
Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation AISTATS 2021
Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition COLT 2021
Impossible Tuning Made Possible: A New Expert Algorithm and Its Applications COLT 2021
Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games COLT 2021
Non-stationary Reinforcement Learning without Prior Knowledge: an Optimal Black-box Approach COLT 2021
Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds ALT 2021
Linear Last-iterate Convergence in Constrained Saddle-point Optimization ICLR 2021
Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses NIPS 2021
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes ICML 2020
Taking a hint: How to leverage loss predictors in contextual bandits? COLT 2020
Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs NIPS 2020
Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously ICML 2019
Bandit Multiclass Linear Classification: Efficient Algorithms for the Separable Case ICML 2019
Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information COLT 2019
Improved Path-length Regret Bounds for Bandits COLT 2019
A New Algorithm for Non-stationary Contextual Bandits: Efficient, Optimal and Parameter-free COLT 2019
More Adaptive Algorithms for Adversarial Bandits COLT 2018
Efficient Online Portfolio with Logarithmic Regret NIPS 2018
Efficient Contextual Bandits in Non-stationary Worlds COLT 2018
Online Reinforcement Learning in Stochastic Games NIPS 2017
Tracking the Best Expert in Non-stationary Stochastic Environments NIPS 2016