Deep RL

3861 directly classified papers

Papers

Bit_numeval at SemEval-2024 Task 7: Enhance Numerical Sensitivity and Reasoning Completeness for Quantitative Understanding SEMEVAL 2024

CrystalBox: Future-Based Explanations for Input-Driven Deep RL Systems AAAI 2024

Exploring Gradient Explosion in Generative Adversarial Imitation Learning: A Probabilistic Perspective AAAI 2024

VLMPC: Vision-Language Model Predictive Control for Robotic Manipulation RSS 2024

EXPLORER: Exploration-guided Reasoning for Textual Reinforcement Learning EACL 2024

Optimal Attack and Defense for Reinforcement Learning AAAI 2024

Improve Robustness of Reinforcement Learning against Observation Perturbations via l∞ Lipschitz Policy Networks AAAI 2024

Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning AAAI 2024

OCEAN-MBRL: Offline Conservative Exploration for Model-Based Offline Reinforcement Learning AAAI 2024

Finding a Needle in the Adversarial Haystack: A Targeted Paraphrasing Approach For Uncovering Edge Cases with Minimal Distribution Distortion EACL 2024

Open Problem: Order Optimal Regret Bounds for Kernel-Based Reinforcement Learning COLT 2024

ERL-TD: Evolutionary Reinforcement Learning Enhanced with Truncated Variance and Distillation Mutation AAAI 2024

Episodic Return Decomposition by Difference of Implicitly Assigned Sub-trajectory Reward AAAI 2024

Settling the sample complexity of online reinforcement learning COLT 2024

Scale-free Adversarial Reinforcement Learning COLT 2024

Exploration via linearly perturbed loss minimisation AISTATS 2024

OVD-Explorer: Optimism Should Not Be the Sole Pursuit of Exploration in Noisy Environments AAAI 2024

Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement EMNLP 2024

Sampling-based Safe Reinforcement Learning for Nonlinear Dynamical Systems AISTATS 2024

Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation AISTATS 2024

Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs COLT 2024

Linear Bellman Completeness Suffices for Efficient Online Reinforcement Learning with Few Actions COLT 2024

On learning history-based policies for controlling Markov decision processes AISTATS 2024

Model-based Policy Optimization under Approximate Bayesian Inference AISTATS 2024

Learning Uncertainty-Aware Temporally-Extended Actions AAAI 2024