Zhenfei Yin

19 papers · 2020–2026 · 8 top CS/AI conferences

Achievements

🏃 Academic Marathon (5) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (8) 🐝 Cross-Pollinator (13)

🗺️ Taxonomy Completionist (25) 🤝 Dynamic Duo (10) ⚡ Prolific Year (8) 📈 Trend Setter 💎 Century Club (16) 🗃️ Keyword Collector (58)

Conferences

ACL (5) ECCV (4) ICCV (3) CVPR (2) ICML (2) EMNLP (1) ICLR (1) NIPS (1)

Top co-authors

Jing Shao (10) Lei Bai (6) Lu Sheng (5) Zhiyong Wang (4) Yu Qiao (4) Yiran Qin (4) Ziwei Liu (3) Wanli Ouyang (3) Ruimao Zhang (3) Li Kang (2)

Keywords

large language model (7) multi-agent system (3) vision language model (2) multi-modal learning (2) temporal dynamics (1) video generation (1) image generation (1) chain-of-thought reasoning (1) robotic manipulation (1) preference alignment (1) active perception (1) embodied ai (1) instruction tuning (1) safety alignment (1) frame selection (1) mutual information (1) entropy reduction (1) mathematical reasoning (1) reward model (1) imitation learning (1)

Papers

Rethinking the Role of Entropy in Optimizing Tool-Use Behaviors for Large Language Model Agents · ACL 2026
Scaling Behaviors of LLM Reinforcement Learning Post-Training: An Empirical Study in Mathematical Reasoning · ACL 2026
From Word to World: Can Large Language Models be Implicit Text-based World Models? · ACL 2026
ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks · EMNLP 2025
Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System · ACL 2025
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Models · CVPR 2025
B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens · ICCV 2025
VLIPP: Towards Physically Plausible Video Generation with Vision and Language Informed Physical Prior · ICCV 2025
RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints · ICCV 2025
WorldSimBench: Towards Video Generation Models as World Simulators · ICML 2025
MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems · ICML 2025
Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models · ACL 2024
Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE · ICLR 2024
Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models · ECCV 2024
MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception · CVPR 2024
LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark · NIPS 2023
X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation · ECCV 2022
Benchmarking Omni-Vision Representation through the Lens of Visual Realms · ECCV 2022
CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations · ECCV 2020