Yueting Zhuang
109 papers · 2015–2026 · 13 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Hot Topic Early Bird
Taxonomy Completionist (15)
Interdisciplinary Bridge
Conference Polyglot (13)
Keyword Trendsetter Combo (5)
Grand Slam
Deep Specialist (19)
Triple Crown
Topic Evolution
Keyword Champion
Topic Pioneer
Dynamic Duo (47)
The Questioner
Conference Pioneer
Prolific Year (7)
Unstoppable (11)
Century Club (103)
Trend Setter
Keyword Collector (423)
Conferences
ACL (20)
IJCAI (17)
CVPR (14)
AAAI (13)
ICCV (11)
EMNLP (9)
ICML (7)
NIPS (7)
IJCNLP (4)
NAACL (3)
SEMEVAL (2)
AACL (1)
ICLR (1)
Keywords
large language model (10)
named entity recognition (8)
representation learning (7)
multimodal learning (7)
reinforcement learning (6)
attention mechanism (6)
knowledge base (6)
multimodal large language model (5)
knowledge distillation (5)
few-shot learning (5)
domain adaptation (5)
object detection (4)
vision-language model (4)
distant supervision (4)
multi-instance learning (4)
relation extraction (4)
unsupervised learning (4)
instruction tuning (4)
visual reasoning (3)
model compression (3)
Papers
Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks
ACL 2026
Seeing but Not Thinking: Routing Distraction in Multimodal Mixture-of-Experts
ACL 2026
MoA: Heterogeneous Mixture of Adapters for Parameter-Efficient Fine-Tuning of Large Language Models
ACL 2026
GUI-G²: Gaussian Reward Modeling for GUI Grounding
AAAI 2026
UI-Copilot: Advancing Long-Horizon GUI Automation via Tool-Integrated Policy Optimization
ACL 2026
Experience-driven Multi-turn Reinforcement Learning for GUI Agents
ACL 2026
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
ICCV 2025
Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining
ICCV 2025
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program
ICCV 2025
What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensional Benchmark for Essential Virtual Agent Capabilities
ICML 2025
HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation
ICML 2025
Logic Distillation: Learning from Code Function by Function for Decision-making Tasks
IJCAI 2025
Image Regeneration: Evaluating Text-to-Image Model via Generating Identical Image with Multimodal Large Language Models
AAAI 2025
ProSwitch: Knowledge-Guided Instruction Tuning to Switch Between Professional and Non-Professional Responses
AACL 2025
ProSwitch: Knowledge-Guided Instruction Tuning to Switch Between Professional and Non-Professional Responses
IJCNLP 2025
TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition
ACL 2025
Align2LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation
ACL 2025
Meta-Reflection: A Feedback-Free Reflection Learning Framework
ACL 2025
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
CVPR 2025
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
CVPR 2025
STEP: Enhancing Video-LLMs' Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training
CVPR 2025
Mastering Collaborative Multi-modal Data Selection: A Focus on Informativeness, Uniqueness, and Representativeness
ICCV 2025
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data
CVPR 2024
Data Shunt: Collaboration of Small and Large Models for Lower Costs and Better Performance
AAAI 2024
T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text
ACL 2024
Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives
ACL 2024
Learning Global Controller in Latent Space for Parameter-Efficient Fine-Tuning
ACL 2024
Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization
ACL 2024
Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning
ICML 2024
Revisiting the Domain Shift and Sample Uncertainty in Multi-source Active Domain Transfer
CVPR 2024
Auto-Encoding Morph-Tokens for Multimodal LLM
ICML 2024
TaskBench: Benchmarking Large Language Models for Task Automation
NIPS 2024
Triad: A Framework Leveraging a Multi-Role LLM-based Agent to Solve Knowledge Base Question Answering
EMNLP 2024
Bridging Local Details and Global Context in Text-Attributed Graphs
EMNLP 2024
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model
EMNLP 2024
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions
ICLR 2024
DiffusionNER: Boundary Diffusion for Named Entity Recognition
ACL 2023
PromptNER: Prompt Locating and Typing for Named Entity Recognition
ACL 2023
DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition
ACL 2023
Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models
ICCV 2023
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
NIPS 2023
Continual Vision-Language Representation Learning with Off-Diagonal Information
ICML 2023
Unsupervised Prompt Tuning for Text-Driven Object Detection
ICCV 2023
Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models
NIPS 2023
Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World
ICCV 2023
Learning in Imperfect Environment: Multi-Label Classification with Long-Tailed Distribution and Partial Labels
ICCV 2023
DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition
SEMEVAL 2023
DAMO-NLP at SemEval-2022 Task 11: A Knowledge-based System for Multilingual Named Entity Recognition
NAACL 2022
DAMO-NLP at SemEval-2022 Task 11: A Knowledge-based System for Multilingual Named Entity Recognition
SEMEVAL 2022
Parallel Instance Query Network for Named Entity Recognition
ACL 2022
On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals
AAAI 2022
Learning Domain Adaptive Object Detection with Probabilistic Teacher
ICML 2022
Label Matching Semi-Supervised Object Detection
CVPR 2022
Slimmable Domain Adaptation
CVPR 2022
Learning To Learn by Jointly Optimizing Neural Architecture and Weights
CVPR 2022
Compositional Temporal Grounding With Structured Variational Cross-Graph Correspondence Learning
CVPR 2022
Fine-Grained Semantically Aligned Vision-Language Pre-Training
NIPS 2022
MAGIC: Multimodal relAtional Graph adversarIal inferenCe for Diverse and Unpaired Text-Based Image Captioning
AAAI 2022
Query-based Instance Discrimination Network for Relational Triple Extraction
EMNLP 2022
Robust Meta-learning with Sampling Noise and Label Noise via Eigen-Reptile
ICML 2022
Empower Distantly Supervised Relation Extraction with Collaborative Adversarial Training
AAAI 2021
Learning to Generate Visual Questions with Noisy Supervision
NIPS 2021
Consensus Graph Representation Learning for Better Grounded Image Captioning
AAAI 2021
A Free Lunch for Unsupervised Domain Adaptive Object Detection without Source Data
AAAI 2021
Disentangled Motif-aware Graph Learning for Phrase Grounding
AAAI 2021
CIL: Contrastive Instance Learning Framework for Distantly Supervised Relation Extraction
ACL 2021
Natural Language Video Localization with Learnable Moment Proposals
EMNLP 2021
Semi-Supervised Active Learning for Semi-Supervised Models: Exploit Adversarial Examples With Graph-Based Virtual Labels
ICCV 2021
Adaptive Hierarchical Graph Reasoning With Semantic Coherence for Video-and-Language Inference
ICCV 2021
A Sequence-to-Set Network for Nested Named Entity Recognition
IJCAI 2021
CIL: Contrastive Instance Learning Framework for Distantly Supervised Relation Extraction
IJCNLP 2021
Neural-DINF: A Neural Network based Framework for Measuring Document Influence
ACL 2020
Pixel-Level Cycle Association: A New Perspective for Domain Adaptive Semantic Segmentation
NIPS 2020
Hierarchical Attention Based Spatial-Temporal Graph-to-Sequence Learning for Grounded Video Description
IJCAI 2020
Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation
CVPR 2020
Counterfactual Samples Synthesizing for Robust Visual Question Answering
CVPR 2020
De-Biased Court’s View Generation with Causality
EMNLP 2020
Time2Graph: Revisiting Time Series Modeling with Dynamic Shapelets
AAAI 2020
Weak Supervision Enhanced Generative Network for Question Generation
IJCAI 2019
Learning Dynamic Context Augmentation for Global Entity Linking
EMNLP 2019
Self-Supervised Spatiotemporal Learning via Video Clip Order Prediction
CVPR 2019
Video Dialog via Progressive Inference and Cross-Transformer
EMNLP 2019
Cross-Relation Cross-Bag Attention for Distantly-Supervised Relation Extraction
AAAI 2019
Learning Dynamic Context Augmentation for Global Entity Linking
IJCNLP 2019
Video Dialog via Progressive Inference and Cross-Transformer
IJCNLP 2019
Posterior-regularized REINFORCE for Instance Selection in Distant Supervision
NAACL 2019
Improving Distantly-supervised Entity Typing with Compact Latent Space Clustering
NAACL 2019
KCAT: A Knowledge-Constraint Typing Annotation Tool
ACL 2019
Heterogeneous Attributed Network Embedding with Graph Convolutional Networks
AAAI 2019
ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering
AAAI 2019
Attentional Image Retweet Modeling via Multi-Faceted Ranking Network Learning
IJCAI 2018
Deep Convolutional Neural Networks with Merge-and-Run Mappings
IJCAI 2018
Semantic Locality-Aware Deformable Network for Clothing Segmentation
IJCAI 2018
Feature Enhancement in Attention for Visual Question Answering
IJCAI 2018
Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference
ACL 2018
MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models
NIPS 2018
Open-Ended Long-form Video Question Answering via Adaptive Hierarchical Reinforced Networks
IJCAI 2018
Deeply-Learned Part-Aligned Representations for Person Re-Identification
ICCV 2017
NITE: A Neural Inductive Teaching Framework for Domain Specific NER
EMNLP 2017
Zero-Shot Recognition Using Dual Visual-Semantic Mapping Paths
CVPR 2017
Video Question Answering via Hierarchical Spatio-Temporal Attention Networks
IJCAI 2017
Link Prediction via Ranking Metric Dual-Level Attention Network Learning
IJCAI 2017
Microblog Sentiment Classification via Recurrent Random Walk Network Learning
IJCAI 2017
Diverse Image Captioning via GroupTalk
IJCAI 2016
Expert Finding for Community-Based Question Answering via Ranking Metric Network Learning
IJCAI 2016
Hierarchical Recurrent Neural Encoder for Video Representation With Application to Captioning
CVPR 2016
Self-Paced Boost Learning for Classification
IJCAI 2016
Mobile Query Recommendation via Tensor Function Learning
IJCAI 2015
Sketch the Storyline with CHARCOAL: A Non-Parametric Approach
IJCAI 2015