conftrace_

Jun Song

18 papers · 2024–2026 · 7 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (7) 🗺️ Taxonomy Completionist (32) 🧭 Keyword Pioneer 🌈 Renaissance Researcher (7) 🌉 Interdisciplinary Bridge
🐝 Cross-Pollinator (11) 🔬 Deep Specialist (10) 🗃️ Keyword Collector (74) 💎 Century Club (15) ⚡ Prolific Year (6)

Conferences

AAAI (7) ACL (5) EMNLP (2) CVPR (1) ICCV (1) ICML (1) NIPS (1)

Papers

InquireMobile: Teaching VLM-based Mobile Agent to Request Human Assistance via Reinforcement Fine-Tuning · ACL 2026
Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models · AAAI 2026
How Foundational Skills Influence VLM-based Embodied Agents: A Native Perspective · AAAI 2026
LLaVA-UHD v2: Exploiting Hierarchical Vision Granularity in MLLMs via Inverse Semantic Pyramid · AAAI 2026
MMG-Vid: Maximizing Marginal Gains at Segment-level and Token-level for Efficient Video LLMs · AAAI 2026
Contribution-aware Token Compression for Efficient Video Understanding via Reinforcement Learning · AAAI 2026
DeepPhy: Benchmarking Agentic VLMs on Physical Reasoning · AAAI 2026
Unified Thinker: A General Reasoning Core for Image Generation · ACL 2026
Mobile-R1: Towards Interactive Capability for VLM-Based Mobile Agent via Systematic Training · ACL 2026
Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination Mitigation · EMNLP 2025
CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games · ICCV 2025
LongDocURL: a Comprehensive Multimodal Long Document Benchmark Integrating Understanding, Reasoning, and Locating · ACL 2025
See the World, Discover Knowledge: A Chinese Factuality Evaluation for Large Vision Language Models · ACL 2025
RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness · CVPR 2025
POI Recommendation via Multi-Objective Adversarial Imitation Learning · AAAI 2025
Enhancing Sufficient Dimension Reduction via Hellinger Correlation · ICML 2024
GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation · EMNLP 2024
Demystify Mamba in Vision: A Linear Attention Perspective · NIPS 2024