conftrace_

Dale Schuurmans

124 papers · 2002–2025 · 15 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (43) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (7) 🐣 Hot Topic Early Bird 🏃 Academic Marathon (23) 🏠 Conference Loyalist (45) 🌟 Keyword Trendsetter Combo (9) 📛 The Namer 🏆 Keyword Champion 👑 Triple Crown 🌱 Topic Pioneer 🔬 Deep Specialist (17) 🤝 Dynamic Duo (39) 🏆 Grand Slam 🗃️ Keyword Collector (204) ❓ The Questioner 📈 Trend Setter 🚀 Conference Pioneer 🔥 Unstoppable (14) ⚡ Prolific Year (14) 💎 Century Club (124)

Conferences

NIPS (45) ICML (29) ICLR (21) AISTATS (11) IJCAI (5) ACL (2) COLING (2) CONLL (2) AAAI (1) ACML (1) EACL (1) ICCV (1) JMLR (1) NAACL (1) UAI (1)

Top co-authors

Bo Dai (39) Hanjun Dai (21) Jincheng Mei (16) Csaba Szepesvári (15) Chenjun Xiao (12) Ofir Nachum (9) Yaoliang Yu (9) Sherry Yang (9) Xinhua Zhang (7) Pieter Abbeel (7)

Research topics

Reinforcement Learning (1) Applications (1)

Keywords

convex optimization (10) reinforcement learning (9) neural network (6) representation learning (5) variational inference (5) policy learning (5) stochastic optimization (4) markov chain monte carlo (4) function approximation (4) policy optimization (4) value iteration (4) stationary distribution (4) markov decision process (4) sample efficiency (4) policy gradient (4) value function (4) off-policy evaluation (4) generative model (4) neural network optimization (3) deep learning (3)

Papers

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training · ICML 2025
Plastic Learning with Deep Fourier Features · ICLR 2025
Toward Understanding In-context vs. In-weight Learning · ICLR 2025
Faster WIND: Accelerating Iterative Best-of-$N$ Distillation for LLM Alignment · AISTATS 2025
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF · ICLR 2025
Improving Large Language Model Planning with Action Sequence Similarity · ICLR 2025
Learning Continually by Spectral Regularization · ICLR 2025
Position: Video as the New Language for Real-World Decision Making · ICML 2024
UQE: A Query Engine for Unstructured Databases · NIPS 2024
Generative Hierarchical Materials Search · NIPS 2024
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates · NIPS 2024
Learning Interactive Real-World Simulators · ICLR 2024
Scalable Diffusion for Materials Generation · ICLR 2024
Probabilistic Adaptation of Black-Box Text-to-Video Models · ICLR 2024
Provable Representation with Efficient Planning for Partially Observable Reinforcement Learning · ICML 2024
Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation · ICML 2024
Energy-based Predictive Representations for Partially Observed Reinforcement Learning · UAI 2023
Any-scale Balanced Samplers for Discrete Space · ICLR 2023
Latent Variable Representation for Reinforcement Learning · ICLR 2023
Revisiting Sampling for Combinatorial Optimization · ICML 2023
TEMPERA: Test-Time Prompt Editing via Reinforcement Learning · ICLR 2023
Dichotomy of Control: Separating What You Can Control from What You Cannot · ICLR 2023
What learning algorithm is in-context learning? Investigations with linear models · ICLR 2023
Least-to-Most Prompting Enables Complex Reasoning in Large Language Models · ICLR 2023
Self-Consistency Improves Chain of Thought Reasoning in Language Models · ICLR 2023
DISCS: A Benchmark for Discrete Sampling · NIPS 2023
Managing Temporal Resolution in Continuous Value Estimation: A Fundamental Trade-off · NIPS 2023
Ordering-based Conditions for Global Convergence of Policy Gradient Methods · NIPS 2023
Learning Universal Policies via Text-Guided Video Generation · NIPS 2023
Discrete Langevin Samplers via Wasserstein Gradient Flow · AISTATS 2023
Learning to Optimize with Stochastic Dominance Constraints · AISTATS 2023
Gradient-Free Structured Pruning with Unlabeled Data · ICML 2023
Stochastic Gradient Succeeds for Bandits · ICML 2023
Spectral Decomposition Representation for Reinforcement Learning · ICLR 2023
Score-based Continuous-time Discrete Diffusion Models · ICLR 2023
Offline Policy Selection under Uncertainty · AISTATS 2022
Neural Stochastic Dual Dynamic Programming · ICLR 2022
Understanding and Leveraging Overparameterization in Recursive Value Estimation · ICLR 2022
Making Linear MDPs Practical via Contrastive Representation Learning · ICML 2022
A Parametric Class of Approximate Gradient Updates for Policy Optimization · ICML 2022
On the Global Convergence Rates of Decentralized Softmax Gradient Play in Markov Potential Games · NIPS 2022
The Role of Baselines in Policy Gradient Optimization · NIPS 2022
Optimal Scaling for Locally Balanced Proposals in Discrete Spaces · NIPS 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models · NIPS 2022
Chain of Thought Imitation with Procedure Cloning · NIPS 2022
A Simple Decentralized Cross-Entropy Method · NIPS 2022
Marginal Distribution Adaptation for Discrete Sets via Module-Oriented Divergence Minimization · ICML 2022
The Curse of Passive Data Collection in Batch Reinforcement Learning · AISTATS 2022
Characterizing the Gap Between Actor-Critic and Policy Gradient · ICML 2021
LEGO: Latent Execution-Guided Reasoning for Multi-Hop Question Answering on Knowledge Graphs · ICML 2021
Leveraging Non-uniformity in First-order Non-convex Optimization · ICML 2021
Deep Probabilistic Canonical Correlation Analysis · AAAI 2021
Combiner: Full Attention Transformer with Sparse Computation Cost · NIPS 2021
Understanding the Effect of Stochasticity in Policy Optimization · NIPS 2021
EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL · ICML 2021
On the Optimality of Batch Policy Optimization Algorithms · ICML 2021
GenDICE: Generalized Offline Estimation of Stationary Values · ICLR 2020
An Optimistic Perspective on Offline Reinforcement Learning · ICML 2020
Off-Policy Evaluation via the Regularized Lagrangian · NIPS 2020
CoinDICE: Off-Policy Confidence Interval Estimation · NIPS 2020
Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration · NIPS 2020
A Maximum-Entropy Approach to Off-Policy Evaluation in Average-Reward MDPs · NIPS 2020
Escaping the Gravitational Pull of Softmax · NIPS 2020
Go Wide, Then Narrow: Efficient Training of Deep Thin Networks · ICML 2020
Energy-Based Processes for Exchangeable Data · ICML 2020
Domain Aggregation Networks for Multi-Source Domain Adaptation · ICML 2020
Batch Stationary Distribution Estimation · ICML 2020
ConQUR: Mitigating Delusional Bias in Deep Q-Learning · ICML 2020
On the Global Convergence Rates of Softmax Policy Gradient Methods · ICML 2020
Scalable Deep Generative Modeling for Sparse Graphs · ICML 2020
Understanding the Impact of Entropy on Policy Optimization · ICML 2019
The Value Function Polytope in Reinforcement Learning · ICML 2019
Kernel Exponential Family Estimation via Doubly Dual Embedding · AISTATS 2019
On Principled Entropy Exploration in Policy Optimization · IJCAI 2019
Invertible Convolutional Flow · NIPS 2019
Surrogate Objectives for Batch Policy Optimization in One-step Decision Making · NIPS 2019
Maximum Entropy Monte-Carlo Planning · NIPS 2019
Exponential Family Estimation via Adversarial Dynamics Embedding · NIPS 2019
A Geometric Perspective on Optimal Representations for Reinforcement Learning · NIPS 2019
Advantage Amplification in Slowly Evolving Latent-State Environments · IJCAI 2019
Learning to Generalize from Sparse and Underspecified Rewards · ICML 2019
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control · ICLR 2018
Variational Rejection Sampling · AISTATS 2018
Non-delusional Q-learning and value-iteration · NIPS 2018
Planning and Learning with Stochastic Action Sets · IJCAI 2018
Smoothed Action Value Functions for Learning Gaussian Policies · ICML 2018
Multi-view Matrix Factorization for Linear Dynamical System Estimation · NIPS 2017
Generalized Conditional Gradient for Sparse Estimation · JMLR 2017
Bridging the Gap Between Value and Policy Based Reinforcement Learning · NIPS 2017
Logistic Markov Decision Processes · IJCAI 2017
Reward Augmented Maximum Likelihood for Neural Structured Prediction · NIPS 2016
Deep Learning Games · NIPS 2016
Stochastic Neural Networks with Monotonic Activation Functions · AISTATS 2016
Scalable and Sound Low-Rank Tensor Learning · AISTATS 2016
Variance Reduction via Antithetic Markov Chains · AISTATS 2015
Correcting Covariate Shift with the Frank-Wolfe Algorithm · IJCAI 2015
Semi-Supervised Zero-Shot Classification With Label Representation Learning · ICCV 2015
Embedding Inference for Structured Multilabel Prediction · NIPS 2015
Adaptive Monte Carlo via Bandit Allocation · ICML 2014
Convex Deep Learning via Normalized Kernels · NIPS 2014
Characterizing the Representer Theorem · ICML 2013
Learning a Metric Space for Neighbourhood Topology Estimation: Application to Manifold Learning · ACML 2013
Polar Operators for Structured Sparse Estimation · NIPS 2013
Convex Two-Layer Modeling · NIPS 2013
A Polynomial-time Form of Robust Regression · NIPS 2012
Generalized Optimal Reverse Prediction · AISTATS 2012
Convex Multi-view Subspace Learning · NIPS 2012
Accelerated Training for Matrix-norm Regularization: A Boosting Approach · NIPS 2012
Relaxed Clipping: A Global Training Method for Robust Regression and Classification · NIPS 2010
Improved Natural Language Learning via Variance-Regularization Support Vector Machines · CONLL 2010
Convex Relaxation of Mixture Regression with Efficient Algorithms · NIPS 2009
A General Projection Property for Distribution Families · NIPS 2009
Semi-Supervised Convex Training for Dependency Parsing · ACL 2008
Convex Relaxations of Latent Variable Training · NIPS 2007
Stable Dual Dynamic Programming · NIPS 2007
Discriminative Batch Mode Active Learning · NIPS 2007
Improved Large Margin Dependency Parsing via Local Constraints and Laplacian Regularization · CONLL 2006
Semi-Supervised Conditional Random Fields for Improved Sequence Segmentation and Labeling · ACL 2006
Learning to Model Spatial Dependency: Semi-Supervised Discriminative Random Fields · NIPS 2006
Implicit Online Learning with Kernels · NIPS 2006
Semi-Supervised Conditional Random Fields for Improved Sequence Segmentation and Labeling · COLING 2006
Language and Task Independent Text Categorization with Simple Language Models · NAACL 2003
Language Independent Authorship Attribution with Character Level N-Grams · EACL 2003
Investigating the Relationship between Word Segmentation Performance and Retrieval Performance in Chinese IR · COLING 2002