Yong Xu
87 papers · 2014–2026 · 14 top CS/AI conferences
Achievements
Keyword Pioneer
Taxonomy Completionist (21)
Renaissance Researcher (6)
Interdisciplinary Bridge
Conference Polyglot (14)
Academic Marathon (11)
Cross-Pollinator (11)
Conference Loyalist (22)
Keyword Champion (2)
Dynamic Duo (17)
Deep Specialist (15)
Grand Slam
Topic Evolution
Keyword Collector (57)
Unstoppable (12)
Prolific Year (18)
Conference Pioneer
Trend Setter
Century Club (83)
Conferences
AAAI (22)
INTERSPEECH (17)
CVPR (12)
NIPS (7)
ICML (6)
IJCAI (6)
ACL (5)
ICCV (4)
COLING (2)
ECCV (2)
EMNLP (1)
ICLR (1)
MICCAI (1)
NAACL (1)
Keywords
speech separation (9)
attention mechanism (8)
graph neural network (7)
large language model (7)
representation learning (6)
multi-view learning (5)
diabetic retinopathy (4)
convolutional neural network (4)
multimodal learning (4)
multi-label classification (4)
zero-shot learning (4)
multi-view clustering (4)
incomplete multi-view (3)
generative adversarial network (3)
image restoration (3)
domain adaptation (3)
speech enhancement (3)
prompt engineering (3)
contrastive learning (3)
speech recognition (3)
Papers
Towards Zero-Shot Diabetic Retinopathy Grading: Learning Generalized Knowledge via Prompt-Driven Matching and Emulating
AAAI 2026
Vision-Language Models Guided Graph Concept Reasoning for Interpretable Diabetic Retinopathy Diagnosis
AAAI 2026
PA-FAS: Towards Interpretable and Generalizable Multimodal Face Anti-Spoofing via Path-Augmented Reinforcement Learning
AAAI 2026
Frequency-Aligned Cross-Modal Learning with Top-K Wavelet Fusion and Dynamic Expert Routing for Enhanced Retinal Disease Diagnosis
AAAI 2026
RetouchGPT: LLM-based Interactive High-Fidelity Face Retouching via Imperfection Prompting
AAAI 2025
Federated Weakly Supervised Video Anomaly Detection with Multimodal Prompt
AAAI 2025
Self-Correcting Robot Manipulation via Gaussian-Splatted Foresight
AAAI 2025
Hard Sample Mining-based Tongue Diagnosis for Fatty Liver Disease Severity Classification
MICCAI 2025
Ex-VAD: Explainable Fine-grained Video Anomaly Detection Based on Visual-Language Models
ICML 2025
Mutual Learning for SAM Adaptation: A Dual Collaborative Network Framework for Source-Free Domain Transfer
ICML 2025
Base-Detail Feature Learning Framework for Visible-Infrared Person Re-Identification
IJCAI 2025
LogiGraph: Logical Reasoning with Contrastive Learning and Lightweight Graph Networks
COLING 2025
EducationQ: Evaluating LLMs’ Teaching Capabilities Through Multi-Agent Dialogue Framework
ACL 2025
Spectral Compressive Imaging via Unmixing-driven Subspace Diffusion Refinement
ICLR 2025
Generator-Assistant Stepwise Rollback Framework for Large Language Model Agent
EMNLP 2025
Zero-Shot Low-Light Image Enhancement via Latent Diffusion Models
AAAI 2025
Deep Hierarchies and Invariant Disease-Indicative Feature Learning for Computer Aided Diagnosis of Multiple Fundus Diseases
AAAI 2025
OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision
AAAI 2025
FlashST: A Simple and Universal Prompt-Tuning Framework for Traffic Prediction
ICML 2024
Zero-Shot Event-Intensity Asymmetric Stereo via Visual Prompting from Image Domain
NIPS 2024
MambaSCI: Efficient Mamba-UNet for Quad-Bayer Patterned Video Snapshot Compressive Imaging
NIPS 2024
Multi-Channel Multi-Speaker ASR Using Target Speaker’s Solo Segment
INTERSPEECH 2024
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization
INTERSPEECH 2024
HACDR-Net: Heterogeneous-Aware Convolutional Network for Diabetic Retinopathy Multi-Lesion Segmentation
AAAI 2024
Attention-Induced Embedding Imputation for Incomplete Multi-View Partial Multi-Label Classification
AAAI 2024
QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback based Self-Correction
ACL 2024
Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation
ACL 2024
Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments
ACL 2024
Unsupervised Sign Language Translation and Generation
ACL 2024
AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework
COLING 2024
Diffusion-based Missing-view Generation With the Application on Incomplete Multi-view Clustering
ICML 2024
Partial Multi-View Multi-Label Classification via Semantic Invariance Learning and Prototype Modeling
ICML 2024
Language-Driven Cross-Modal Classifier for Zero-Shot Multi-Label Image Recognition
ICML 2024
VRetouchEr: Learning Cross-frame Feature Interdependence with Imperfection Flow for Face Retouching in Videos
CVPR 2024
Text-conditional Attribute Alignment across Latent Spaces for 3D Controllable Face Image Synthesis
CVPR 2024
Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance
ECCV 2024
Highly Confident Local Structure Based Consensus Graph Learning for Incomplete Multi-View Clustering
CVPR 2023
CIGAR: Cross-Modality Graph Reasoning for Domain Adaptive Object Detection
CVPR 2023
MVCINN: Multi-View Diabetic Retinopathy Detection Using a Deep Cross-Interaction Neural Network
AAAI 2023
Coherent Event Guided Low-Light Video Enhancement
ICCV 2023
Incomplete Multi-View Multi-Label Learning via Label-Guided Masked View- and Category-Aware Transformers
AAAI 2023
DICNet: Deep Instance-Level Contrastive Network for Double Incomplete Multi-View Multi-Label Classification
AAAI 2023
Zoneformer: On-device Neural Beamformer For In-car Multi-zone Speech Separation, Enhancement and Echo Cancellation
INTERSPEECH 2023
GPT-ST: Generative Pre-Training of Spatio-Temporal Graph Neural Networks
NIPS 2023
Masked Two-channel Decoupling Framework for Incomplete Multi-view Weak Multi-label Learning
NIPS 2023
Streamable Speech Representation Disentanglement and Multi-Level Prosody Modeling for Live One-Shot Voice Conversion
INTERSPEECH 2022
SwinTrack: A Simple and Strong Baseline for Transformer Tracking
NIPS 2022
Fine-Grained Object Classification via Self-Supervised Pose Alignment
CVPR 2022
SphericGAN: Semi-Supervised Hyper-Spherical Generative Adversarial Networks for Fine-Grained Image Synthesis
CVPR 2022
CTL-MTNet: A Novel CapsNet and Transfer Learning-Based Mixed Task Net for Single-Corpus and Cross-Corpus Speech Emotion Recognition
IJCAI 2022
Joint Neural AEC and Beamforming with Double-Talk Detection
INTERSPEECH 2022
Audio Visual Multi-Speaker Tracking with Improved GCF and PMBM Filter
INTERSPEECH 2022
Encoding Spatial Distribution of Convolutional Features for Texture Representation
NIPS 2021
Traffic Flow Forecasting with Spatial-Temporal Graph Diffusion Network
AAAI 2021
Knowledge-aware Coupled Graph Neural Network for Social Recommendation
AAAI 2021
Spatial-Temporal Sequential Hypergraph Network for Crime Prediction with Dynamic Multiplex Relation Learning
IJCAI 2021
Dual-Octave Convolution for Accelerated Parallel MR Image Reconstruction
AAAI 2021
Graph-Enhanced Multi-Task Learning of Multi-Level Transition Dynamics for Session-based Recommendation
AAAI 2021
Knowledge-Enhanced Hierarchical Graph Transformer Network for Multi-Behavior Recommendation
AAAI 2021
Unified Tensor Framework for Incomplete Multi-view Clustering and Missing-view Inferring
AAAI 2021
TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation
INTERSPEECH 2021
MIMO Self-Attentive RNN Beamformer for Multi-Speaker Speech Separation
INTERSPEECH 2021
MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment
INTERSPEECH 2021
Generalized Spatio-Temporal RNN Beamformer for Target Speech Separation
INTERSPEECH 2021
Deep Texture Recognition via Exploiting Cross-Layer Statistical Self-Similarity
CVPR 2021
Semi-Supervised Single-Stage Controllable GANs for Conditional Fine-Grained Image Generation
ICCV 2021
Hypergraph Neural Networks for Hypergraph Matching
ICCV 2021
CDIMC-net: Cognitive Deep Incomplete Multi-view Clustering Network
IJCAI 2020
Neural Spatio-Temporal Beamformer for Target Speech Separation
INTERSPEECH 2020
Audio-Visual Multi-Channel Recognition of Overlapped Speech
INTERSPEECH 2020
Improved Speaker-Dependent Separation for CHiME-5 Challenge
INTERSPEECH 2019
Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information
INTERSPEECH 2019
LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking
CVPR 2019
A Comprehensive Study of Speech Separation: Spectrogram vs Waveform Separation
INTERSPEECH 2019
Single-Channel Signal Separation and Deconvolution with Generative Adversarial Networks
IJCAI 2019
Unified Embedding Alignment with Missing Views Inferring for Incomplete Multi-View Clustering
AAAI 2019
Adaptive GNN for Image Analysis and Editing
NIPS 2019
Highly-Economized Multi-View Binary Compression for Scalable Image Clustering
ECCV 2018
Bidirectional Attentive Fusion With Context Gating for Dense Video Captioning
CVPR 2018
Intelligibilities of Mandarin Chinese Sentences with Spectral “Holes”
INTERSPEECH 2017
Attention and Localization Based on a Deep Convolutional Recurrent Model for Weakly Supervised Audio Tagging
INTERSPEECH 2017
Mind the Class Weight Bias: Weighted Maximum Mean Discrepancy for Unsupervised Domain Adaptation
CVPR 2017
Beyond Object Recognition: Visual Sentiment Analysis with Deep Coupled Adjective and Noun Neural Networks
IJCAI 2016
TransRead: Designing a Bilingual Reading Experience with Machine Translation Technologies
NAACL 2016
Sparse Coding for Classification via Discrimination Ensemble
CVPR 2016
Removing Rain From a Single Image via Discriminative Sparse Coding
ICCV 2015
Lacunarity Analysis on Image Patterns for Texture Classification
CVPR 2014