Ya Li
25 papers · 2015–2026 · 9 top CS/AI conferences
Achievements
Keyword Pioneer
Hot Topic Early Bird
Interdisciplinary Bridge
Renaissance Researcher
(6)
Taxonomy Completionist
(11)
Topic Evolution
Keyword Champion
(2)
Keyword Collector
(116)
Trend Setter
Century Club
(24)
Unstoppable
(11)
Conference Pioneer
Conferences
INTERSPEECH (13)
AAAI (2)
CVPR (2)
ICML (2)
IJCAI (2)
ACL (1)
ECCV (1)
EMNLP (1)
NIPS (1)
Keywords
diffusion model
(3)
text-to-speech synthesis
(3)
speech synthesis
(3)
knowledge distillation
(3)
phoneme embedding
(2)
large language model
(2)
prosodic boundary
(2)
audio codec
(2)
bidirectional lstm
(2)
word embedding
(2)
zero-shot learning
(2)
gradient boosting decision tree
(2)
singing voice conversion
(2)
attention mechanism
(2)
black-box attack
(2)
deep neural network
(2)
image classification
(2)
adversarial example
(2)
depression detection
(2)
neural network
(2)
Papers
HQ-SVC: Towards High-Quality Zero-Shot Singing Voice Conversion in Low-Resource Scenarios
AAAI 2026
OV-MER: Towards Open-Vocabulary Multimodal Emotion Recognition
ICML 2025
Controllable 3D Dance Generation Using Diffusion-Based Transformer U-Net
AAAI 2025
Beyond Surface Simplicity: Revealing Hidden Reasoning Attributes for Precise Commonsense Diagnosis
ACL 2025
Enhancing Modal Fusion by Alignment and Label Matching for Multimodal Emotion Recognition
INTERSPEECH 2024
SPA-SVC: Self-supervised Pitch Augmentation for Singing Voice Conversion
INTERSPEECH 2024
Retrieval Augmented Generation in Prompt-based Text-to-Speech Synthesis with Context-Aware Contrastive Language-Audio Pretraining
INTERSPEECH 2024
Improving Audio Codec-based Zero-Shot Text-to-Speech Synthesis with Multi-Modal Context and Large Language Model
INTERSPEECH 2024
FTA-net: A Frequency and Time Attention Network for Speech Depression Detection
INTERSPEECH 2023
Towards Lightweight Black-Box Attack Against Deep Neural Networks
NIPS 2022
ECAPA-TDNN Based Depression Detection from Clinical Speech
INTERSPEECH 2022
Cross Attention Augmented Transducer Networks for Simultaneous Translation
EMNLP 2021
Dual-Path Distillation: A Unified Framework to Improve Black-Box Attacks
ICML 2020
Transferable, Controllable, and Inconspicuous Adversarial Attacks on Person Re-identification With Deep Mis-Ranking
CVPR 2020
Compact Feature Learning for Multi-Domain Image Classification
CVPR 2019
Deep Domain Generalization via Conditional Invariant Adversarial Networks
ECCV 2018
BLSTM-CRF Based End-to-End Prosodic Boundary Prediction with Context Sensitive Embeddings in a Text-to-Speech Front-End
INTERSPEECH 2018
Speech Emotion Recognition from Variable-Length Inputs with Triplet Loss Function
INTERSPEECH 2018
Classification and Representation Joint Learning via Deep Networks
IJCAI 2017
Investigating Efficient Feature Representation Methods and Training Objective for BLSTM-Based Phone Duration Prediction
INTERSPEECH 2017
Distilling Knowledge from an Ensemble of Models for Punctuation Prediction
INTERSPEECH 2017
Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach
INTERSPEECH 2016
The Parameterized Phoneme Identity Feature as a Continuous Real-Valued Vector for Neural Network Based Speech Synthesis
INTERSPEECH 2016
The Rhythmic Constraint on Prosodic Boundaries in Mandarin Chinese Based on Corpora of Silent Reading and Speech Perception
INTERSPEECH 2016
Multi-Task Model and Feature Joint Learning
IJCAI 2015