Jiaming Ji
31 papers · 2022–2026 · 6 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (6)
Cross-Pollinator (13)
Interdisciplinary Bridge
Keyword Pioneer
Renaissance Researcher (8)
Taxonomy Completionist (49)
Deep Specialist (11)
Dynamic Duo (19)
Grand Slam
Prolific Year (7)
Unstoppable (5)
Keyword Collector (114)
Century Club (24)
The Questioner
Conferences
ACL (15)
NIPS (7)
AAAI (4)
ICLR (2)
JMLR (2)
ICML (1)
Top co-authors
Keywords
large language model (11)
reinforcement learning from human feedback (6)
safe reinforcement learning (5)
constraint satisfaction (4)
policy optimization (4)
human preference (3)
safety alignment (3)
multimodal large language model (3)
human preference alignment (2)
preference learning (2)
value alignment (2)
benchmark evaluation (2)
nash equilibrium (2)
reward model (2)
responsible ai (2)
reward modeling (2)
preference optimization (2)
ai safety (2)
language model alignment (2)
helpful assistant (2)
Papers
Benchmarking Fine-Grained Error Detection in Multimodal Reasoning
ACL 2026
AgentGym2: Benchmarking Large Language Model Agents in De-Idealized Real-World Environments
ACL 2026
SafeMT: Multi-turn Safety for Multimodal Language Models
ACL 2026
Omni-RewardBench: Toward a Comprehensive Evaluation of Generative Reward Models Across Modalities
ACL 2026
SafeMCP: Proactive Power Regulation for LLM Agent Defense via Environment-Grounded Look-Ahead Reasoning
ACL 2026
A Game-Theoretic Negotiation Framework for Cross-Cultural Consensus
ACL 2026
What, Whether and How? Unveiling Process Reward Models for Thinking with Images Reasoning
AAAI 2026
Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA
ACL 2025
A Survey of LLM-based Agents in Medicine: How far are we from Baymax?
ACL 2025
Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction
AAAI 2025
Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback
AAAI 2025
LegalReasoner: Step-wised Verification-Correction for Legal Judgment Reasoning
ACL 2025
Language Models Resist Alignment: Evidence From Data Compression
ACL 2025
FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation
ACL 2025
PKU-SafeRLHF: Towards Multi-Level Safety Alignment for LLMs with Human Preference
ACL 2025
SafeLawBench: Towards Safe Alignment of Large Language Models
ACL 2025
Reward Generalization in RLHF: A Topological Perspective
ACL 2025
Benchmarking Multi-National Value Alignment for Large Language Models
ACL 2025
SAE-V: Interpreting Multimodal Models for Enhanced Alignment
ICML 2025
Safe RLHF: Safe Reinforcement Learning from Human Feedback
ICLR 2024
SafeDreamer: Safe Reinforcement Learning with World Models
ICLR 2024
Heterogeneous-Agent Reinforcement Learning
JMLR 2024
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research
JMLR 2024
Aligner: Efficient Alignment by Learning to Correct
NIPS 2024
SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset
NIPS 2024
ProgressGym: Alignment with a Millennium of Moral Progress
NIPS 2024
Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark
NIPS 2023
VOCE: Variational Optimization with Conservative Estimation for Offline Safe Reinforcement Learning
NIPS 2023
Augmented Proximal Policy Optimization for Safe Reinforcement Learning
AAAI 2023
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset
NIPS 2023
Constrained Update Projection Approach to Safe Policy Optimization
NIPS 2022