Nadav Merlis

17 papers · 2018–2026 · 6 top CS/AI conferences

Achievements

🌍 Conference Polyglot (6) 🐣 Hot Topic Early Bird 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (7)

🐝 Cross-Pollinator (5) 🌈 Renaissance Researcher (5) 🗺️ Taxonomy Completionist (30) 🏆 Grand Slam 🧬 Topic Evolution 💎 Century Club (16) 🗃️ Keyword Collector (69) 🔥 Unstoppable (8)

Conferences

NIPS (6) ICML (4) AAAI (3) COLT (2) AISTATS (1) ICLR (1)

Top co-authors

Shie Mannor (9) Vianney Perchet (4) Yonathan Efroni (3) Dorian Baudry (2) Hugo Richard (2) Lior Shani (2) Guy Tennenholtz (2) Daniel J Mankowitz (1) Aadirupa Saha (1) Corentin Odic (1)

Keywords

regret bound (12) online learning (7) reinforcement learning (5) multi-armed bandit (4) markov decision process (3) thompson sampling (3) contextual bandit (3) stochastic optimization (2) deep q-network (2) online algorithm (2) regret analysis (2) model-based reinforcement learning (2) constraint satisfaction (1) dynamic programming (1) exploration-exploitation tradeoff (1) offline reinforcement learning (1) deep reinforcement learning (1) policy learning (1) autonomous driving (1) greedy policy (1)

Papers

Online Linear Regression with Paid Stochastic Features AAAI 2026
On Bits and Bandits: Quantifying the Regret-Information Trade-off ICLR 2025
The Value of Reward Lookahead in Reinforcement Learning NIPS 2024
Improved Algorithms for Contextual Dynamic Pricing NIPS 2024
Multi-armed bandits with guaranteed revenue per arm AISTATS 2024
Reinforcement Learning with Lookahead Information NIPS 2024
On Preemption and Learning in Stochastic Scheduling ICML 2023
Reinforcement Learning with History Dependent Dynamic Contexts ICML 2023
Reinforcement Learning with a Terminator NIPS 2022
Reinforcement Learning with Trajectory Feedback AAAI 2021
Lenient Regret for Multi-Armed Bandits AAAI 2021
Confidence-Budget Matching for Sequential Budgeted Learning ICML 2021
Ensemble Bootstrapping for Q-Learning ICML 2021
Tight Lower Bounds for Combinatorial Multi-Armed Bandits COLT 2020
Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies NIPS 2019
Batch-Size Independent Regret Bounds for the Combinatorial Multi-Armed Bandit Problem COLT 2019
Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning NIPS 2018