Jiaming Ji

31 papers · 2022–2026 · 6 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (6) 🐝 Cross-Pollinator (13) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌈 Renaissance Researcher (8)

🗺️ Taxonomy Completionist (49) 🔬 Deep Specialist (11) 🤝 Dynamic Duo (19) 🏆 Grand Slam ⚡ Prolific Year (7) 🔥 Unstoppable (5) 🗃️ Keyword Collector (114) 💎 Century Club (24) ❓ The Questioner

Conferences

ACL (15) NIPS (7) AAAI (4) ICLR (2) JMLR (2) ICML (1)

Top co-authors

Yaodong Yang (22) Juntao Dai (11) Sirui Han (10) Yike Guo (9) Josef Dai (8) Xuehai Pan (6) Jiayi Zhou (6) Chi-Min Chan (5) Han Zhu (5) Boyuan Chen (5)

Keywords

large language model (11) reinforcement learning from human feedback (6) safe reinforcement learning (5) constraint satisfaction (4) policy optimization (4) human preference (3) safety alignment (3) multimodal large language model (3) human preference alignment (2) preference learning (2) value alignment (2) benchmark evaluation (2) nash equilibrium (2) reward model (2) responsible ai (2) reward modeling (2) preference optimization (2) ai safety (2) language model alignment (2) helpful assistant (2)

Papers

Benchmarking Fine-Grained Error Detection in Multimodal Reasoning · ACL 2026
AgentGym2: Benchmarking Large Language Model Agents in De-Idealized Real-World Environments · ACL 2026
SafeMT: Multi-turn Safety for Multimodal Language Models · ACL 2026
Omni-RewardBench: Toward a Comprehensive Evaluation of Generative Reward Models Across Modalities · ACL 2026
SafeMCP: Proactive Power Regulation for LLM Agent Defense via Environment-Grounded Look-Ahead Reasoning · ACL 2026
A Game-Theoretic Negotiation Framework for Cross-Cultural Consensus · ACL 2026
What, Whether and How? Unveiling Process Reward Models for Thinking with Images Reasoning · AAAI 2026
Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA · ACL 2025
A Survey of LLM-based Agents in Medicine: How far are we from Baymax? · ACL 2025
Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction · AAAI 2025
Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback · AAAI 2025
LegalReasoner: Step-wised Verification-Correction for Legal Judgment Reasoning · ACL 2025
Language Models Resist Alignment: Evidence From Data Compression · ACL 2025
FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation · ACL 2025
PKU-SafeRLHF: Towards Multi-Level Safety Alignment for LLMs with Human Preference · ACL 2025
SafeLawBench: Towards Safe Alignment of Large Language Models · ACL 2025
Reward Generalization in RLHF: A Topological Perspective · ACL 2025
Benchmarking Multi-National Value Alignment for Large Language Models · ACL 2025
SAE-V: Interpreting Multimodal Models for Enhanced Alignment · ICML 2025
Safe RLHF: Safe Reinforcement Learning from Human Feedback · ICLR 2024
SafeDreamer: Safe Reinforcement Learning with World Models · ICLR 2024
Heterogeneous-Agent Reinforcement Learning · JMLR 2024
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research · JMLR 2024
Aligner: Efficient Alignment by Learning to Correct · NIPS 2024
SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset · NIPS 2024
ProgressGym: Alignment with a Millennium of Moral Progress · NIPS 2024
Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark · NIPS 2023
VOCE: Variational Optimization with Conservative Estimation for Offline Safe Reinforcement Learning · NIPS 2023
Augmented Proximal Policy Optimization for Safe Reinforcement Learning · AAAI 2023
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset · NIPS 2023
Constrained Update Projection Approach to Safe Policy Optimization · NIPS 2022