conftrace_

Xiaoda Yang

13 papers · 2024–2026 · 6 conferences · across top CS/AI conferences

Achievements

🌈 Renaissance Researcher (6) 🗺️ Taxonomy Completionist (29) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (6)

🐝 Cross-Pollinator (14) 🗃️ Keyword Collector (64) ⚡ Prolific Year (9) 💎 Century Club (11) ❓ The Questioner

Conferences

AAAI (3), EMNLP (3), ICLR (3), ACL (2), COLING (1), WACV (1)

Top co-authors

Xize Cheng (7), Shengpeng Ji (6), Jialong Zuo (5), Minghui Fang (5), Zhou Zhao (5), Tao Jin (4), Ziyue Jiang (4), Zehan Wang (3), Weicai Yan (2), Donglin Huang (2)

Keywords

generative model (2), benchmark evaluation (1), human motion synthesis (1), pose estimation (1), voice conversion (1), speech synthesis (1), speech recognition (1), autoregressive generation (1), multimodal learning (1), temporal information (1), cross-modal retrieval (1), flow matching (1), deep learning (1), speaker recognition (1), image generation (1), visual speech recognition (1), zero-shot learning (1), diffusion model (1), feature fusion (1), neural decoding (1)

Papers

SpatialLogic-Bench: A Diagnostic Benchmark for Task-Oriented Spatiotemporal Reasoning · AAAI 2026
VividAnimator: An End-to-End Audio and Pose-driven Half-Body Human Animation Framework · WACV 2026
Towards Efficient and Robust Manipulation via Multi-Frame Vision-Language-Action Modeling · AAAI 2026
VoxpopuliTTS: a large-scale multilingual TTS corpus for zero-shot speech generation · COLING 2025
PACHAT: Persona-Aware Speech Assistant for Multi-party Dialogue · EMNLP 2025
Storynizor: Consistent Story Generation via Inter-Frame Synchronized and Shuffled ID Injection · AAAI 2025
VoxDialogue: Can Spoken Dialogue Systems Understand Information Beyond Words? · ICLR 2025
Diff-Prompt: Diffusion-Driven Prompt Generator with Mask Supervision · ICLR 2025
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling · ICLR 2025
BrainLoc: Brain Signal-Based Object Detection with Multi-modal Alignment · EMNLP 2025
CART: A Generative Cross-Modal Retrieval Framework With Coarse-To-Fine Semantic Modeling · ACL 2025
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching · ACL 2025
AudioVSR: Enhancing Video Speech Recognition with Audio Data · EMNLP 2024