Jialong Zuo
19 papers · 2023–2026 · 9 top CS/AI conferences
Achievements
Keyword Pioneer
Renaissance Researcher (5)
Interdisciplinary Bridge
Taxonomy Completionist (11)
Conference Polyglot (9)
Cross-Pollinator (11)
Dynamic Duo (11)
Keyword Champion (2)
Prolific Year (6)
Keyword Collector (100)
Century Club (18)
Conferences
ACL (6)
AAAI (3)
CVPR (2)
ICLR (2)
NIPS (2)
COLING (1)
EMNLP (1)
ICCV (1)
INTERSPEECH (1)
Keywords
zero-shot learning (4)
speech synthesis (4)
contrastive learning (2)
vector quantization (2)
person re-identification (2)
cross-modal retrieval (2)
speech generation (2)
speaker cloning (2)
video anomaly detection (2)
generative model (2)
image retrieval (1)
multimodal learning (1)
cross-modal learning (1)
autoregressive generation (1)
anomaly detection (1)
speech recognition (1)
flow matching (1)
domain generalization (1)
semantic alignment (1)
instruction following (1)
Papers
Learning to Tell Apart: Weakly Supervised Video Anomaly Detection via Disentangled Semantic Alignment
AAAI 2026
L-Man: A Large Multi-modal Model Unifying Human-centric Tasks
AAAI 2025
Speech Watermarking with Discrete Intermediate Representations
AAAI 2025
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching
ACL 2025
Language-Codec: Bridging Discrete Codec Representations and Speech Language Models
ACL 2025
CART: A Generative Cross-Modal Retrieval Framework With Coarse-To-Fine Semantic Modeling
ACL 2025
VoxpopuliTTS: a large-scale multilingual TTS corpus for zero-shot speech generation
COLING 2025
Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity
CVPR 2025
Partial Forward Blocking: A Novel Data Pruning Paradigm for Lossless Training Acceleration
ICCV 2025
OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup
ICLR 2025
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
ICLR 2025
ControlSpeech: Towards Simultaneous and Independent Zero-shot Speaker Cloning and Zero-shot Language Style Control
ACL 2025
MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis
INTERSPEECH 2024
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech
ACL 2024
UFineBench: Towards Text-based Person Retrieval with Ultra-fine Granularity
CVPR 2024
PLIP: Language-Image Pre-training for Person Representation Learning
NIPS 2024
AudioVSR: Enhancing Video Speech Recognition with Audio Data
EMNLP 2024
Cross-video Identity Correlating for Person Re-identification Pre-training
NIPS 2024
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models
ACL 2023