Shengyu Zhang
42 papers · 2016–2026 · 9 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Taxonomy Completionist (13)
Renaissance Researcher (6)
Interdisciplinary Bridge
Hot Topic Early Bird
Dynamic Duo (18)
Mega-Team (29)
Prolific Year (11)
Unstoppable (10)
Keyword Collector (208)
The Questioner
Century Club (35)
Conferences
AAAI (14)
ACL (7)
ICML (5)
IJCAI (5)
CVPR (4)
EMNLP (3)
ICLR (2)
EACL (1)
ECCV (1)
Research topics
Keywords
video understanding (4)
recommendation system (4)
multimodal learning (4)
graphical user interface (3)
recommender system (3)
multimodal large language model (3)
reinforcement learning (3)
video grounding (3)
large language model (3)
task automation (2)
diffusion model (2)
online algorithm (2)
active learning (2)
visual grounding (2)
model merging (2)
knowledge distillation (2)
contrastive learning (2)
causal inference (2)
representation learning (2)
self-supervised learning (2)
Papers
Measure Twice, Click Once: Co-evolving Proposer and Visual Critic via Reinforcement Learning for GUI Grounding
ACL 2026
A Rolling Stone Gathers No Moss: Adaptive Policy Optimization for Stable Self-Evaluation in Large Multimodal Models
AAAI 2026
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection
EACL 2026
InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization
AAAI 2026
EcoAgent: An Efficient Device-Cloud Collaborative Multi-Agent Framework for Mobile Automation
AAAI 2026
AccKV: Towards Efficient Audio-Video LLMs Inference via Adaptive-Focusing and Cross-Calibration KV Cache Optimization
AAAI 2026
DAC-Bench: A Decision-Aware Benchmark for Compositional Mobile GUI Tasks
ACL 2026
Optimize Incompatible Parameters Through Compatibility-aware Knowledge Integration
AAAI 2025
Preliminary Evaluation of the Test-Time Training Layers in Recommendation System (Student Abstract)
AAAI 2025
OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use
ACL 2025
MadaKV: Adaptive Modality-Perception KV Cache Eviction for Efficient Multimodal Long-Context Inference
ACL 2025
Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving
EMNLP 2025
EcoFace: Audio-Visual Emotional Co-Disentanglement Speech-Driven 3D Talking Face Generation
ICLR 2025
Device-Cloud Collaborative Correction for On-Device Recommendation
IJCAI 2025
ExpTalk: Diverse Emotional Expression via Adaptive Disentanglement and Refined Alignment for Speech-Driven 3D Facial Animation
IJCAI 2025
Quantum Algorithms for Finite-horizon Markov Decision Processes
ICML 2025
MergeNet: Knowledge Migration Across Heterogeneous Models, Tasks, and Modalities
AAAI 2025
FedCFA: Alleviating Simpson's Paradox in Model Aggregation with Counterfactual Federated Learning
AAAI 2025
MPOD123: One Image to 3D Content Generation Using Mask-enhanced Progressive Outline-to-Detail Optimization
CVPR 2024
LLMCO4MR: LLMs-aided Neural Combinatorial Optimization for Ancient Manuscript Restoration from Fragments with Case Studies on Dunhuang
ECCV 2024
CoreRec: A Counterfactual Correlation Inference for Next Set Recommendation
AAAI 2024
PhiloGPT: A Philology-Oriented Large Language Model for Ancient Chinese Manuscripts with Dunhuang as Case Study
EMNLP 2024
AuG-KD: Anchor-Based Mixup Generation for Out-of-Domain Knowledge Distillation
ICLR 2024
WINNER: Weakly-Supervised hIerarchical decompositioN and aligNment for Spatio-tEmporal Video gRounding
CVPR 2023
Video-Audio Domain Generalization via Confounder Disentanglement
AAAI 2023
Multi-modal Action Chain Abductive Reasoning
ACL 2023
Weakly-Supervised Spoken Video Grounding via Semantic Interaction Learning
ACL 2023
Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-Based Active Learning
CVPR 2023
ART: rule bAsed futuRe-inference deducTion
EMNLP 2023
BoostMIS: Boosting Medical Image Semi-Supervised Learning With Adaptive Pseudo Labeling and Informative Active Annotation
CVPR 2022
Retroformer: Pushing the Limits of End-to-end Retrosynthesis Transformer
ICML 2022
MAGIC: Multimodal relAtional Graph adversarIal inferenCe for Diverse and Unpaired Text-Based Image Captioning
AAAI 2022
The Secretary Problem with Competing Employers on Random Edge Arrivals
AAAI 2022
End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding
ACL 2022
Modeling High-order Interactions across Multi-interests for Micro-video Reommendation (Student Abstract)
AAAI 2021
Adaptive Double-Exploration Tradeoff for Outlier Detection
AAAI 2020
Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels
ICML 2019
Policy Optimization with Second-Order Advantage Information
IJCAI 2018
Learning to Aggregate Ordinal Labels by Maximizing Separating Width
ICML 2017
Networked Fairness in Cake Cutting
IJCAI 2017
Online Roommate Allocation Problem
IJCAI 2017
Contextual Combinatorial Cascading Bandits
ICML 2016