Zhuo Chen
96 papers · 2013–2026 · 14 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (23)
Keyword Pioneer
Interdisciplinary Bridge
Renaissance Researcher (6)
Hot Topic Early Bird
Keyword Trendsetter Combo (3)
Conference Loyalist (24)
Topic Evolution
Dynamic Duo (16)
Grand Slam
Triple Crown
Deep Specialist (12)
Keyword Champion (2)
Unstoppable (10)
Prolific Year (9)
Trend Setter
Keyword Collector (72)
Conference Pioneer
Century Club (88)
The Questioner (3)
Conferences
INTERSPEECH (24)
AAAI (14)
ACL (10)
CVPR (9)
NIPS (7)
ICLR (6)
ICML (6)
IJCAI (5)
COLING (4)
ICCV (4)
EMNLP (3)
ECCV (2)
NSDI (1)
WACV (1)
Research topics
Keywords
large language model (13)
speech separation (8)
automatic speech recognition (7)
knowledge graph (6)
speaker diarization (6)
zero-shot learning (5)
depth estimation (5)
self-supervised learning (5)
knowledge graph completion (5)
neural network (4)
model compression (4)
knowledge distillation (4)
transformer architecture (4)
graph neural network (4)
multimodal learning (4)
contrastive learning (4)
speech enhancement (4)
diffusion model (4)
speech synthesis (3)
representation learning (3)
Papers
From Curated Data to Scalable Models: Continual Pre-training of Dense and MoE Large Language Models for Tibetan
ACL 2026
rMMEA: Robust Multi-Modal Entity Alignment with Missing and Noise Visual Modality
AAAI 2026
Multi-Modal Fact Knowledge Generation for Imbalanced Cross-Source Entity Alignment
AAAI 2026
UniHR: Hierarchical Representation Learning for Unified Knowledge Graph Link Prediction
AAAI 2026
Force-Aware 3D Contact Modeling for Stable Grasp Generation
AAAI 2026
HiSVD: Principled Low-Rank Approximation of LLMs via Hierarchical Modeling of Information Capacity and Spectral Structure
ACL 2026
Know the Known and the Unknown: Reasonable Answer Generation with Knowledge-Informed Citations
ACL 2026
PDTrim: Targeted Pruning for Prefill-Decode Disaggregation in Inference
ACL 2026
Multimodal Latent Diffusion Model for Complex Sewing Pattern Generation
ICCV 2025
Dataset Distillation as Data Compression: A Rate-Utility Perspective
ICCV 2025
AMDANet: Attention-Driven Multi-Perspective Discrepancy Alignment for RGB-Infrared Image Fusion and Segmentation
ICCV 2025
ELLA-V: Stable Neural Codec Language Modeling with Alignment-Guided Sequence Reordering
AAAI 2025
Which Tasks Should Be Compressed Together? A Causal Discovery Approach for Efficient Multi-Task Representation Compression
ICLR 2025
Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning
ICLR 2025
AniSDF: Fused-Granularity Neural Surfaces with Anisotropic Encoding for High-Fidelity 3D Reconstruction
ICLR 2025
Sounding that Object: Interactive Object-Aware Image to Audio Generation
ICML 2025
DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation
ICML 2025
Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking
ACL 2025
Advancing Zero-shot Text-to-Speech Intelligibility across Diverse Domains via Preference Alignment
ACL 2025
Towards Reliable Large Audio Language Model
ACL 2025
Graph-guided Cross-composition Feature Disentanglement for Compositional Zero-shot Learning
ACL 2025
Infer the Whole from a Glimpse of a Part: Keypoint-Based Knowledge Graph for Vehicle Re-Identification
AAAI 2025
Noise-powered Multi-modal Knowledge Graph Representation Framework
COLING 2025
K-ON: Stacking Knowledge on the Head Layer of Large Language Model
AAAI 2025
Tokenization, Fusion, and Augmentation: Towards Fine-grained Multi-modal Entity Representation
AAAI 2025
Scaling Mesh Generation via Compressive Tokenization
CVPR 2025
Theoretical Insights in Model Inversion Robustness and Conditional Entropy Maximization for Collaborative Inference Systems
CVPR 2025
One-for-More: Continual Diffusion Model for Anomaly Detection
CVPR 2025
DO-CoLM: Dynamic 3D Conformation Relationships Capture with Self-Adaptive Ordering Molecular Relational Modeling in Language Models
IJCAI 2025
Language Model Can Listen While Speaking
AAAI 2025
ExpTalk: Diverse Emotional Expression via Adaptive Disentanglement and Refined Alignment for Speech-Driven 3D Facial Animation
IJCAI 2025
Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference
EMNLP 2025
KBM: Delineating Knowledge Boundary for Adaptive Retrieval in Large Language Models
EMNLP 2025
FreeMesh: Boosting Mesh Generation with Coordinates Merging
ICML 2025
Self-Improvement Programming for Temporal Knowledge Graph Question Answering
COLING 2024
OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step
NIPS 2024
Dual-Diffusion for Binocular 3D Human Pose Estimation
NIPS 2024
QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation
NIPS 2024
Multi-times Monte Carlo Rendering for Inter-reflection Reconstruction
NIPS 2024
MKGL: Mastery of a Three-Word Language
NIPS 2024
Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-Modal Structured Representations
AAAI 2024
Dual Mapping of 2D StyleGAN for 3D-Aware Image Generation and Manipulation (Student Abstract)
AAAI 2024
STViT: Improving Self-Supervised Multi-Camera Depth Estimation with Spatial-Temporal Context and Adversarial Geometry Regularization (Student Abstract)
AAAI 2024
Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering
ACL 2024
Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts
ACL 2024
DET: A Dual-Encoding Transformer for Relational Graph Embedding
COLING 2024
Unleashing the Power of Imbalanced Modality Information for Multi-modal Knowledge Graph Completion
COLING 2024
3D-Aware Face Editing via Warping-Guided Latent Direction Learning
CVPR 2024
UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather
CVPR 2024
A Unified Image Compression Method for Human Perception and Multiple Vision Tasks
ECCV 2024
Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models
ICLR 2024
Revisit and Outstrip Entity Alignment: A Perspective of Generative Models
ICLR 2024
Domain-Agnostic Molecular Generation with Chemical Feedback
ICLR 2024
TENG: Time-Evolving Natural Gradient for Solving PDEs With Deep Neural Nets Toward Machine Precision
ICML 2024
Rethinking the Soft Conflict Pseudo Boolean Constraint on MaxSAT Local Search Solvers
IJCAI 2024
LLM-based Multi-Level Knowledge Generation for Few-shot Knowledge Graph Completion
IJCAI 2024
COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning
INTERSPEECH 2024
TacoLM: GaTed Attention Equipped Codec Language Model are Efficient Zero-Shot Text to Speech Synthesizers
INTERSPEECH 2024
ANTN: Bridging Autoregressive Neural Networks and Tensor Networks for Quantum Many-Body Simulation
NIPS 2023
Adaptive Patch Deformation for Textureless-Resilient Multi-View Stereo
CVPR 2023
Adapting Multi-Lingual ASR Models for Handling Multiple Talkers
INTERSPEECH 2023
Newton–Cotes Graph Neural Networks: On the Time Evolution of Dynamic Systems
NIPS 2023
Speaker Diarization for ASR Output with T-vectors: A Sequence Classification Approach
INTERSPEECH 2023
DUET: Cross-Modal Semantic Grounding for Contrastive Zero-Shot Learning
AAAI 2023
Using Interpretation Methods for Model Enhancement
EMNLP 2023
ELFNet: Evidential Local-global Fusion for Stereo Matching
ICCV 2023
BEATs: Audio Pre-Training with Acoustic Tokenizers
ICML 2023
Collaboration of Experts: Achieving 80% Top-1 Accuracy on ImageNet with 100M FLOPs
ICML 2022
Molecular Contrastive Learning with Chemical Element Knowledge Graph
AAAI 2022
Structural Triangulation: A Closed-Form Solution to Constrained 3D Human Pose Estimation
ECCV 2022
Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings
INTERSPEECH 2022
Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?
INTERSPEECH 2022
Streaming Multi-Talker ASR with Token-Level Serialized Output Training
INTERSPEECH 2022
Separating Long-Form Speech with Group-wise Permutation Invariant Training
INTERSPEECH 2022
Knowledge-aware Zero-Shot Learning: Survey and Perspective
IJCAI 2021
Human Listening and Live Captioning: Multi-Task Training for Speech Enhancement
INTERSPEECH 2021
Ultra Fast Speech Separation Model with Teacher Student Learning
INTERSPEECH 2021
Continuous Speech Separation Using Speaker Inventory for Long Recording
INTERSPEECH 2021
Investigation of Practical Aspects of Single Channel Speech Separation for ASR
INTERSPEECH 2021
Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone
INTERSPEECH 2021
Target-Speaker Voice Activity Detection with Improved i-Vector Estimation for Unknown Number of Speaker
INTERSPEECH 2021
AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario
INTERSPEECH 2021
End-to-End Speaker-Attributed ASR with Transformer
INTERSPEECH 2021
Neural Speech Separation Using Spatially Distributed Microphones
INTERSPEECH 2020
An End-to-End Architecture of Online Multi-Channel Speech Separation
INTERSPEECH 2020
Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of any Number of Speakers
INTERSPEECH 2020
ViP: Virtual Pooling for Accelerating CNN-based Image Classification and Object Detection
WACV 2020
PuppeteerGAN: Arbitrary Portrait Animation With Semantic-Aware Appearance Transformation
CVPR 2020
Mesh-Guided Multi-View Stereo With Pyramid Architecture
CVPR 2020
Attention-Aware Multi-View Stereo
CVPR 2020
Meeting Transcription Using Asynchronous Distant Microphones
INTERSPEECH 2019
Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks
INTERSPEECH 2018
Improving Mask Learning Based Speech Enhancement System with Restoration Layers and Residual Connection
INTERSPEECH 2017
Adaptation of Neural Networks Constrained by Prior Statistics of Node Co-Activations
INTERSPEECH 2016
Single-Channel Multi-Speaker Separation Using Deep Clustering
INTERSPEECH 2016
Walkie-Markie: Indoor Pathway Mapping Made Easy
NSDI 2013