Pinjia He
14 papers · 2021–2026 · 6 conferences · across top CS/AI conferences
Achievements
Cross-Pollinator (13)
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (5)
Academic Marathon (5)
Keyword Champion (2)
Keyword Collector (55)
Prolific Year (6)
Century Club (11)
The Questioner (2)
Conferences
ACL (6)
EMNLP (3)
ICLR (2)
AAAI (1)
COLING (1)
OSDI (1)
Keywords
large language model (7)
benchmark evaluation (4)
multimodal large language model (3)
visual question answering (2)
software testing (2)
prompt engineering (1)
in-context learning (1)
logical reasoning (1)
model safety (1)
code generation (1)
ai safety (1)
automated reasoning (1)
harmful content (1)
model alignment (1)
adversarial defense (1)
multimodal learning (1)
software engineering (1)
reward hacking (1)
static analysis (1)
commonsense knowledge (1)
Papers
SHAPE: Unifying Safety, Helpfulness and Pedagogy for Educational LLMs
ACL 2026
MicLog: Towards Accurate and Efficient LLM-based Log Parsing via Progressive Meta In-Context Learning
AAAI 2026
Curing Miracle Steps in LLM Mathematical Reasoning with Rubric Rewards
ACL 2026
Insight Over Sight: Exploring the Vision-Knowledge Conflicts in Multimodal LLMs
ACL 2025
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
ACL 2025
ToolSafety: A Comprehensive Dataset for Enhancing Safety in LLM-Based Agent Tool Invocations
EMNLP 2025
UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench
ACL 2025
Can't See the Forest for the Trees: Benchmarking Multimodal Safety Awareness for Multimodal LLMs
ACL 2025
OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?
ICLR 2025
Does ChatGPT Know That It Does Not Know? Evaluating the Black-Box Calibration of ChatGPT
COLING 2024
Difficult Task Yes but Simple Task No: Unveiling the Laziness in Multimodal LLMs
EMNLP 2024
LogicAsker: Evaluating and Improving the Logical Reasoning Ability of Large Language Models
EMNLP 2024
GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher
ICLR 2024
SANRAZOR: Reducing Redundant Sanitizer Checks in C/C++ Programs
OSDI 2021