conftrace_

Tengyang Xie

21 papers · 2018–2025 · 6 conferences · across top CS/AI conferences

Achievements

🐝 Cross-Pollinator (7)
🏃 Academic Marathon (7)
🌉 Interdisciplinary Bridge
🌍 Conference Polyglot (6)
🌈 Renaissance Researcher (6)
🤝 Dynamic Duo (10)
🏆 Keyword Champion
💎 Century Club (21)
The Questioner
Prolific Year (5)
🗃️ Keyword Collector (60)
🔥 Unstoppable (8)

Conferences

NIPS (7) ICML (6) ICLR (5) ACL (1) EMNLP (1) UAI (1)

Papers

Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization (ICLR 2025)
Reinforce LLM Reasoning through Multi-Agent Reflection (ICML 2025)
Do We Need to Verify Step by Step? Rethinking Process Supervision from a Theoretical Perspective (ICML 2025)
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF (ICLR 2025)
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data (ICML 2024)
CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples (ACL 2024)
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts (EMNLP 2024)
Harnessing Density Ratios for Online Reinforcement Learning (ICLR 2024)
Towards Principled Representation Learning from Videos for Reinforcement Learning (ICLR 2024)
Adversarial Model for Offline Reinforcement Learning (NIPS 2023)
The Role of Coverage in Online Reinforcement Learning (ICLR 2023)
Interaction-Grounded Learning with Action-Inclusive Feedback (NIPS 2022)
Adversarially Trained Actor Critic for Offline Reinforcement Learning (ICML 2022)
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning (NIPS 2021)
Interaction-Grounded Learning (ICML 2021)
Bellman-consistent Pessimism for Offline Reinforcement Learning (NIPS 2021)
Batch Value-function Approximation with Only Realizability (ICML 2021)
Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison (UAI 2020)
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling (NIPS 2019)
Provably Efficient Q-Learning with Low Switching Cost (NIPS 2019)
A Block Coordinate Ascent Algorithm for Mean-Variance Optimization (NIPS 2018)