Xiaoteng Ma
18 papers · 2021–2026 · 6 conferences · across top CS/AI conferences
Achievements
🌍 Conference Polyglot (5)
🐝 Cross-Pollinator (8)
🧭 Keyword Pioneer
🐣 Hot Topic Early Bird
🌉 Interdisciplinary Bridge
🏆 Keyword Champion (3)
👑 Triple Crown
🏆 Grand Slam
🔥 Unstoppable (5)
💎 Century Club (17)
⚡ Prolific Year (5)
❓ The Questioner
Conferences
NIPS (6)
ICLR (5)
AAAI (2)
ICML (2)
IJCAI (2)
ACL (1)
Top co-authors
Keywords
offline reinforcement learning (4)
value estimation (3)
reinforcement learning (3)
continuous control (2)
deep reinforcement learning (2)
distribution shift (2)
value function (2)
policy optimization (2)
ood generalization (1)
competitive game (1)
domain generalization (1)
policy transfer (1)
extrapolation error (1)
world model (1)
trust region (1)
reward shaping (1)
policy gradient (1)
out-of-distribution action (1)
adversarial robustness (1)
sample efficiency (1)
Papers
From Word to World: Can Large Language Models be Implicit Text-based World Models?
ACL 2026
Cross-Domain Offline Policy Adaptation with Optimal Transport and Dataset Constraint
ICLR 2025
Episodic Novelty Through Temporal Distance
ICLR 2025
Efficient Multi-agent Reinforcement Learning by Planning
ICLR 2024
NeuralPlane: An Efficiently Parallelizable Platform for Fixed-wing Aircraft Control with Reinforcement Learning
NIPS 2024
Learning Diverse Risk Preferences in Population-Based Self-Play
AAAI 2024
SEABO: A Simple Search-Based Method for Offline Imitation Learning
ICLR 2024
Single-Trajectory Distributionally Robust Reinforcement Learning
ICML 2024
Cross-Domain Policy Adaptation via Value-Guided Data Filtering
NIPS 2023
Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning (Extended Abstract)
IJCAI 2023
What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL?
ICML 2023
Offline Reinforcement Learning with Value-based Episodic Memory
ICLR 2022
Exploit Reward Shifting in Value-Based Deep-RL: Optimistic Curiosity-Based Exploration and Conservative Exploitation via Linear Reward Shaping
NIPS 2022
RORL: Robust Offline Reinforcement Learning via Conservative Smoothing
NIPS 2022
Mildly Conservative Q-Learning for Offline Reinforcement Learning
NIPS 2022
Efficient Continuous Control with Double Actors and Regularized Critics
AAAI 2022
Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning
NIPS 2021
Average-Reward Reinforcement Learning with Trust Region Methods
IJCAI 2021