Shengyu Zhang

42 papers · 2016–2026 · 9 top CS/AI conferences

Achievements


🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (13) 🌈 Renaissance Researcher (6) 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird 🤝 Dynamic Duo (18) 👥 Mega-Team (29) ⚡ Prolific Year (11) 🔥 Unstoppable (10) 🗃️ Keyword Collector (208) ❓ The Questioner 💎 Century Club (35)

Conferences

AAAI (14) ACL (7) ICML (5) IJCAI (5) CVPR (4) EMNLP (3) ICLR (2) EACL (1) ECCV (1)

Top co-authors

Fei Wu (21) Zhou Zhao (12) Mengze Li (8) Wenqiao Zhang (8) Zheqi Lv (6) Kun Kuang (6) Jiwei Li (5) Xueyu Hu (5) Fan Wu (5) Chengfei Lv (5)

Research topics

Digital Humanities (1)

Keywords

video understanding (4) recommendation system (4) multimodal learning (4) graphical user interface (3) recommender system (3) multimodal large language model (3) reinforcement learning (3) video grounding (3) large language model (3) task automation (2) diffusion model (2) online algorithm (2) active learning (2) visual grounding (2) model merging (2) knowledge distillation (2) contrastive learning (2) causal inference (2) representation learning (2) self-supervised learning (2)

Papers

Measure Twice, Click Once: Co-evolving Proposer and Visual Critic via Reinforcement Learning for GUI Grounding · ACL 2026
A Rolling Stone Gathers No Moss: Adaptive Policy Optimization for Stable Self-Evaluation in Large Multimodal Models · AAAI 2026
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection · EACL 2026
InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization · AAAI 2026
EcoAgent: An Efficient Device-Cloud Collaborative Multi-Agent Framework for Mobile Automation · AAAI 2026
AccKV: Towards Efficient Audio-Video LLMs Inference via Adaptive-Focusing and Cross-Calibration KV Cache Optimization · AAAI 2026
DAC-Bench: A Decision-Aware Benchmark for Compositional Mobile GUI Tasks · ACL 2026
Optimize Incompatible Parameters Through Compatibility-aware Knowledge Integration · AAAI 2025
Preliminary Evaluation of the Test-Time Training Layers in Recommendation System (Student Abstract) · AAAI 2025
OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use · ACL 2025
MadaKV: Adaptive Modality-Perception KV Cache Eviction for Efficient Multimodal Long-Context Inference · ACL 2025
Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving · EMNLP 2025
EcoFace: Audio-Visual Emotional Co-Disentanglement Speech-Driven 3D Talking Face Generation · ICLR 2025
Device-Cloud Collaborative Correction for On-Device Recommendation · IJCAI 2025
ExpTalk: Diverse Emotional Expression via Adaptive Disentanglement and Refined Alignment for Speech-Driven 3D Facial Animation · IJCAI 2025
Quantum Algorithms for Finite-horizon Markov Decision Processes · ICML 2025
MergeNet: Knowledge Migration Across Heterogeneous Models, Tasks, and Modalities · AAAI 2025
FedCFA: Alleviating Simpson’s Paradox in Model Aggregation with Counterfactual Federated Learning · AAAI 2025
MPOD123: One Image to 3D Content Generation Using Mask-enhanced Progressive Outline-to-Detail Optimization · CVPR 2024
LLMCO4MR: LLMs-aided Neural Combinatorial Optimization for Ancient Manuscript Restoration from Fragments with Case Studies on Dunhuang · ECCV 2024
CoreRec: A Counterfactual Correlation Inference for Next Set Recommendation · AAAI 2024
PhiloGPT: A Philology-Oriented Large Language Model for Ancient Chinese Manuscripts with Dunhuang as Case Study · EMNLP 2024
AuG-KD: Anchor-Based Mixup Generation for Out-of-Domain Knowledge Distillation · ICLR 2024
WINNER: Weakly-Supervised hIerarchical decompositioN and aligNment for Spatio-tEmporal Video gRounding · CVPR 2023
Video-Audio Domain Generalization via Confounder Disentanglement · AAAI 2023
Multi-modal Action Chain Abductive Reasoning · ACL 2023
Weakly-Supervised Spoken Video Grounding via Semantic Interaction Learning · ACL 2023
Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-Based Active Learning · CVPR 2023
ART: rule bAsed futuRe-inference deducTion · EMNLP 2023
BoostMIS: Boosting Medical Image Semi-Supervised Learning With Adaptive Pseudo Labeling and Informative Active Annotation · CVPR 2022
Retroformer: Pushing the Limits of End-to-end Retrosynthesis Transformer · ICML 2022
MAGIC: Multimodal relAtional Graph adversarIal inferenCe for Diverse and Unpaired Text-Based Image Captioning · AAAI 2022
The Secretary Problem with Competing Employers on Random Edge Arrivals · AAAI 2022
End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding · ACL 2022
Modeling High-order Interactions across Multi-interests for Micro-video Reommendation (Student Abstract) · AAAI 2021
Adaptive Double-Exploration Tradeoff for Outlier Detection · AAAI 2020
Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels · ICML 2019
Policy Optimization with Second-Order Advantage Information · IJCAI 2018
Learning to Aggregate Ordinal Labels by Maximizing Separating Width · ICML 2017
Networked Fairness in Cake Cutting · IJCAI 2017
Online Roommate Allocation Problem · IJCAI 2017
Contextual Combinatorial Cascading Bandits · ICML 2016