conftrace_

Bilal Piot

30 papers · 2012–2025 · 5 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer · 🐣 Hot Topic Early Bird · 🗺️ Taxonomy Completionist (11) · 🌉 Interdisciplinary Bridge · 🌍 Conference Polyglot (5) · 🌈 Renaissance Researcher (6) · 🤝 Dynamic Duo (12) · 👑 Triple Crown · 🧬 Topic Evolution · 🏆 Keyword Champion · 📈 Trend Setter · 🚀 Conference Pioneer · 🔥 Unstoppable (7) · ⚑ Prolific Year (6) · ❓ The Questioner · 🗃️ Keyword Collector (93) · 💎 Century Club (30)

Conferences

ICML (9) ICLR (8) NIPS (7) AISTATS (4) IJCAI (2)

Papers

RRM: Robust Reward Model Training Mitigates Reward Hacking · ICLR 2025
Building Math Agents with Multi-Turn Iterative Preference Learning · ICLR 2025
Learning from negative feedback, or positive feedback or both · ICLR 2025
Multi-turn Reinforcement Learning with Preference Human Feedback · NIPS 2024
Nash Learning from Human Feedback · ICML 2024
Unlocking the Power of Representations in Long-term Novelty-based Exploration · ICLR 2024
Generalized Preference Optimization: A Unified Approach to Offline Alignment · ICML 2024
A General Theoretical Paradigm to Understand Learning from Human Preferences · AISTATS 2024
Human Alignment of Large Language Models through Online Preference Optimisation · ICML 2024
Understanding Self-Predictive Learning for Reinforcement Learning · ICML 2023
The Edge of Orthogonality: A Simple View of What Makes BYOL Tick · ICML 2023
BYOL-Explore: Exploration by Bootstrapped Prediction · NIPS 2022
Emergent Communication at Scale · ICLR 2022
Agent57: Outperforming the Atari Human Benchmark · ICML 2020
Never Give Up: Learning Directed Exploration Strategies · ICLR 2020
Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning · NIPS 2020
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning · ICML 2020
Hindsight Credit Assignment · NIPS 2019
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning · ICLR 2018
Actor-Critic Fictitious Play in Simultaneous Move Multistage Games · AISTATS 2018
Noisy Networks For Exploration · ICLR 2018
End-to-end optimization of goal-driven and visually grounded dialogue systems · IJCAI 2017
Is the Bellman residual a bad proxy? · NIPS 2017
Learning Nash Equilibrium for General-Sum Markov Games from Batch Data · AISTATS 2017
Softened Approximate Policy Iteration for Markov Games · ICML 2016
On the Use of Non-Stationary Strategies for Solving Two-Player Zero-Sum Markov Games · AISTATS 2016
Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games · ICML 2015
Inverse Reinforcement Learning in Relational Domains · IJCAI 2015
Difference of Convex Functions Programming for Reinforcement Learning · NIPS 2014
Inverse Reinforcement Learning through Structured Classification · NIPS 2012