Minghe Gao
9 papers · 2023–2026 · 5 conferences · across top CS/AI conferences
Achievements
🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (15) 🌍 Conference Polyglot (4) 🧭 Keyword Pioneer 👥 Mega-Team (32) ❓ The Questioner ⚡ Prolific Year (6)
Conferences
ICCV (3)
ICML (3)
ACL (1)
CVPR (1)
ICLR (1)
Keywords
few-shot learning (1)
domain generalization (1)
chain-of-thought reasoning (1)
multimodal learning (1)
scene graph (1)
evaluation framework (1)
vision-language model (1)
multimodal large language model (1)
prompt tuning (1)
gui automation (1)
real-world environment (1)
language agent (1)
spatio-temporal reasoning (1)
video large language model (1)
visual agent (1)
graphical user interface automation (1)
adaptive cropping (1)
self-refining learning (1)
large language model (1)
visual program (1)
Papers
AgentGym2: Benchmarking Large Language Model Agents in De-Idealized Real-World Environments
ACL 2026
Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining
ICCV 2025
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program
ICCV 2025
STEP: Enhancing Video-LLMs' Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training
CVPR 2025
On Path to Multimodal Generalist: General-Level and General-Bench
ICML 2025
Boosting Virtual Agent Learning and Reasoning: A Step-Wise, Multi-Dimensional, and Generalist Reward Model with Benchmark
ICML 2025
What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensional Benchmark for Essential Virtual Agent Capabilities
ICML 2025
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions
ICLR 2024
Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models
ICCV 2023