Juntao Dai
16 papers · 2022–2026 · 7 conferences · across top CS/AI conferences
Achievements
Cross-Pollinator (13)
Interdisciplinary Bridge
Conference Polyglot (6)
Keyword Pioneer
Renaissance Researcher (6)
Taxonomy Completionist (28)
Hot Topic Early Bird
Grand Slam
The Questioner
Century Club (10)
Keyword Collector (53)
Unstoppable (5)
Conferences
ACL (6)
NIPS (3)
AAAI (2)
EMNLP (2)
ICLR (1)
ICML (1)
JMLR (1)
Top co-authors
Keywords
large language model (4)
safe reinforcement learning (3)
constraint satisfaction (3)
policy optimization (3)
multimodal large language model (2)
human preference alignment (2)
preference learning (1)
theorem proving (1)
transfer learning (1)
preference optimization (1)
policy learning (1)
visual reasoning (1)
dialogue safety (1)
ai safety (1)
chain-of-thought reasoning (1)
mathematical reasoning (1)
formal verification (1)
reinforcement learning from human feedback (1)
error detection (1)
question answering (1)
Papers
A Game-Theoretic Negotiation Framework for Cross-Cultural Consensus
ACL 2026
SafeMCP: Proactive Power Regulation for LLM Agent Defense via Environment-Grounded Look-Ahead Reasoning
ACL 2026
Omni-RewardBench: Toward a Comprehensive Evaluation of Generative Reward Models Across Modalities
ACL 2026
SafeMT: Multi-turn Safety for Multimodal Language Models
ACL 2026
Perception, Understanding and Reasoning: A Multimodal Benchmark for Video Fake News Detection
ACL 2026
Benchmarking Fine-Grained Error Detection in Multimodal Reasoning
ACL 2026
What, Whether and How? Unveiling Process Reward Models for Thinking with Images Reasoning
AAAI 2026
Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving
EMNLP 2025
Automate Strategy Finding with LLM in Quant Investment
EMNLP 2025
Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization
ICLR 2025
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research
JMLR 2024
SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset
NIPS 2024
Aligner: Efficient Alignment by Learning to Correct
NIPS 2024
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation
ICML 2024
Augmented Proximal Policy Optimization for Safe Reinforcement Learning
AAAI 2023
Constrained Update Projection Approach to Safe Policy Optimization
NIPS 2022