Agent Systems

3885 directly classified papers

Papers

TANGO: Training-free Embodied AI Agents for Open-world Tasks CVPR 2025

ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark CVPR 2025

RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins CVPR 2025

PEACE: Empowering Geologic Map Holistic Understanding with MLLMs CVPR 2025

SketchAgent: Language-Driven Sequential Sketch Generation CVPR 2025

SpiritSight Agent: Advanced GUI Agent with One Look CVPR 2025

From Words to Structured Visuals: A Benchmark and Framework for Text-to-Diagram Generation and Editing CVPR 2025

Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models CVPR 2025

GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill CVPR 2025

Scene Map-based Prompt Tuning for Navigation Instruction Generation CVPR 2025

Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents CVPR 2025

From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons CVPR 2025

SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model CVPR 2025

DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation CVPR 2025

R2C: Mapping Room to Chessboard to Unlock LLM As Low-Level Action Planner CVPR 2025

V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents CVPR 2025

Towards Autonomous Micromobility through Scalable Urban Simulation CVPR 2025

CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos CVPR 2025

GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration CVPR 2025

COSMO: Combination of Selective Memorization for Low-cost Vision-and-Language Navigation ICCV 2025

PathFinder: A Multi-Modal Multi-Agent System for Medical Diagnostic Decision-Making Applied to Histopathology ICCV 2025

Simulating Human-like Daily Activities with Desire-driven Autonomy ICLR 2025

Learning Precise Affordances from Egocentric Videos for Robotic Manipulation ICCV 2025

SketchAgent: Generating Structured Diagrams from Hand-Drawn Sketches IJCAI 2025

NovPhy: A Physical Reasoning Benchmark for Open-World AI Systems (Abstract Reprint) IJCAI 2025