conftrace_

Zhihong Zhu

58 papers · 2023–2026 · 13 conferences · across top CS/AI conferences

Achievements

🐝 Cross-Pollinator (4) 🌍 Conference Polyglot (13) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (10) 🗺️ Taxonomy Completionist (100) 🔬 Deep Specialist (14) 🤝 Dynamic Duo (31) ⚡ Prolific Year (11) 🗃️ Keyword Collector (213) 💎 Century Club (54) 🚀 Conference Pioneer ❓ The Questioner (2)

Conferences

EMNLP (17) ACL (10) INTERSPEECH (7) AAAI (6) COLING (6) ICLR (3) ICCV (2) MICCAI (2) CVPR (1) ECCV (1) IJCAI (1) NAACL (1) NIPS (1)

Top co-authors

Xuxin Cheng (31) Yuexian Zou (30) Zhiqi Huang (16) Hongxiang Li (16) Xian Wu (15) Xianwei Zhuang (14) Yaowei Li (11) Yefeng Zheng (10) Yunyan Zhang (10) Zhanpeng Chen (7)

Research topics

Understanding (1)

Keywords

spoken language understanding (14) contrastive learning (10) multimodal learning (8) large language model (8) slot filling (7) intent detection (6) task-oriented dialogue (5) vision-language model (4) intent classification (4) attention mechanism (4) optimal transport (4) zero-shot learning (4) automatic speech recognition (4) reinforcement learning (3) causal inference (3) audio-text retrieval (3) hallucination mitigation (3) multi-task learning (3) benchmark evaluation (3) cross-lingual transfer (3)

Papers

S³-MSD: Large Vision-Language Model for Explainable and Generalizable Multi-modal Sarcasm Detection · AAAI 2026
Beyond Surface Features: Advancing Medical Vision-Language Alignment via Dynamic Evidence-Guided Preference Optimization · ACL 2026
MMErroR: A Benchmark for Erroneous Reasoning in Vision-Language Models · ACL 2026
CMID: Towards Medical Visual Question Answering via Contrastive Mutual Information Decoding · AAAI 2026
Can We Trust AI Doctors? A Survey of Medical Hallucination in Large Language and Large Vision-Language Models · ACL 2025
HTML: Hierarchical Topology Multi-task Learning for Semantic Parsing in Knowledge Base Question Answering · ACL 2025
VASparse: Towards Efficient Visual Hallucination Mitigation via Visual-Aware Token Sparsification · CVPR 2025
Harnessing Large Language Models for Knowledge Graph Question Answering via Adaptive Multi-Aspect Retrieval-Augmentation · AAAI 2025
RTE-GMoE: A Model-agnostic Approach for Relation Triplet Extraction via Graph-based Mixture-of-Expert Mutual Learning · EMNLP 2025
CMedCalc-Bench: A Fine-Grained Benchmark for Chinese Medical Calculations in LLM · EMNLP 2025
D₂O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models · ICLR 2025
A Survey on Multi-modal Intent Recognition: Recent Advances and New Frontiers · EMNLP 2025
UniCoTT: A Unified Framework for Structural Chain-of-Thought Distillation · ICLR 2025
DisPose: Disentangling Pose Guidance for Controllable Human Image Animation · ICLR 2025
A Survey on Foundation Language Models for Single-cell Biology · ACL 2025
Relevance Is a Guiding Light: Relevance-aware Adaptive Learning for End-to-end Task-oriented Dialogue System · EMNLP 2024
What are the Generator Preferences for End-to-end Task-Oriented Dialog System? · EMNLP 2024
Dual-oriented Disentangled Network with Counterfactual Intervention for Multimodal Intent Detection · EMNLP 2024
Game on Tree: Visual Hallucination Mitigation via Coarse-to-Fine View Tree and Game Theory · EMNLP 2024
LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference · EMNLP 2024
UniMEEC: Towards Unified Multimodal Emotion Recognition and Emotion Cause · EMNLP 2024
Mitigating Hallucinations of Large Language Models in Medical Information Extraction via Contrastive Decoding · EMNLP 2024
Learning to Match Representations is Better for End-to-End Task-Oriented Dialog System · EMNLP 2024
GPA: Global and Prototype Alignment for Audio-Text Retrieval · INTERSPEECH 2024
Towards Multi-Intent Spoken Language Understanding via Hierarchical Attention and Optimal Transport · AAAI 2024
Exploiting Auxiliary Caption for Video Grounding · AAAI 2024
Aligner²: Enhancing Joint Multiple Intent Detection and Slot Filling via Adjustive and Forced Cross-Task Alignment · AAAI 2024
Code-Switching Can be Better Aligners: Advancing Cross-Lingual SLU through Representation-Level and Prediction-Level Alignment · ACL 2024
Cyclical Contrastive Learning Based on Geodesic for Zero-shot Cross-lingual Spoken Language Understanding · ACL 2024
MoE-SLU: Towards ASR-Robust Spoken Language Understanding via Mixture-of-Experts · ACL 2024
Alignment before Awareness: Towards Visual Question Localized-Answering in Robotic Surgery via Optimal Transport and Answer Semantics · COLING 2024
InfoEnh: Towards Multimodal Sentiment Analysis via Information Bottleneck Filter and Optimal Transport Alignment · COLING 2024
Knowledge-enhanced Prompt Tuning for Dialogue-based Relation Extraction with Trigger and Label Semantic · COLING 2024
Multi-perspective Improvement of Knowledge Graph Completion with Large Language Models · COLING 2024
Towards Multi-modal Sarcasm Detection via Disentangled Multi-grained Multi-modal Distilling · COLING 2024
Zero-Shot Spoken Language Understanding via Large Language Models: A Preliminary Study · COLING 2024
KDProR: A Knowledge-Decoupling Probabilistic Framework for Video-Text Retrieval · ECCV 2024
DGLF: A Dual Graph-based Learning Framework for Multi-modal Sarcasm Detection · EMNLP 2024
TFCD: Towards Multi-modal Sarcasm Detection via Training-Free Counterfactual Debiasing · IJCAI 2024
Audio-text Retrieval with Transformer-based Hierarchical Alignment and Disentangled Cross-modal Representation · INTERSPEECH 2024
DiffATR: Diffusion-based Generative Modeling for Audio-Text Retrieval · INTERSPEECH 2024
MedJourney: Benchmark and Evaluation of Large Language Models over Patient Clinical Journey · NIPS 2024
Multivariate Cooperative Game for Image-Report Pairs: Hierarchical Semantic Alignment for Medical Report Generation · MICCAI 2024
Textual Inversion and Self-supervised Refinement for Radiology Report Generation · MICCAI 2024
AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition · NAACL 2024
C²A-SLU: Cross and Contrastive Attention for Improving ASR Robustness in Spoken Language Understanding · INTERSPEECH 2023
GhostT5: Generate More Features with Cheap Operations to Improve Textless Spoken Question Answering · INTERSPEECH 2023
Mix before Align: Towards Zero-shot Cross-lingual Sentiment Analysis via Soft-Mix and Multi-View Learning · INTERSPEECH 2023
Towards Unified Spoken Language Understanding Decoding via Label-aware Compact Linguistics Representations · ACL 2023
ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding · ACL 2023
MCLF: A Multi-grained Contrastive Learning Framework for ASR-robust Spoken Language Understanding · EMNLP 2023
Unify, Align and Refine: Multi-Level Semantic Alignment for Radiology Report Generation · ICCV 2023
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory · ICCV 2023
Syntax Matters: Towards Spoken Language Understanding via Syntax-Aware Attention · EMNLP 2023
MRRL: Modifying the Reference via Reinforcement Learning for Non-Autoregressive Joint Multiple Intent Detection and Slot Filling · EMNLP 2023
Accelerating Multiple Intent Detection and Slot Filling via Targeted Knowledge Distillation · EMNLP 2023
Enhancing Code-Switching for Cross-lingual SLU: A Unified View of Semantic and Grammatical Coherence · EMNLP 2023
FC-MTLF: A Fine- and Coarse-grained Multi-Task Learning Framework for Cross-Lingual Spoken Language Understanding · INTERSPEECH 2023