Lei He
48 papers · 2014–2026 · 14 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (12)
Keyword Pioneer
Interdisciplinary Bridge
Renaissance Researcher (6)
Conference Polyglot (14)
Conference Loyalist (22)
Dynamic Duo (12)
Triple Crown
Grand Slam
Deep Specialist (12)
Topic Evolution
Keyword Champion
Prolific Year (5)
Keyword Collector (210)
Century Club (45)
Unstoppable (8)
Conference Pioneer
Conferences
INTERSPEECH (22)
AAAI (7)
NIPS (5)
COLING (2)
EMNLP (2)
ICLR (2)
ACL (1)
ACML (1)
CVPR (1)
ECCV (1)
ICML (1)
MICCAI (1)
NAACL (1)
WACV (1)
Keywords
speech synthesis (10)
text-to-speech synthesis (7)
neural vocoder (5)
domain adaptation (3)
contrastive learning (3)
recurrent neural network transducer (3)
automatic speech recognition (3)
autoregressive model (3)
end-to-end model (2)
cooperative perception (2)
transfer learning (2)
language model (2)
end-to-end learning (2)
speech generation (2)
attention mechanism (2)
flow matching (2)
representation learning (2)
self-supervised learning (2)
knowledge graph (2)
object detection (1)
Papers
Mixture-of-Trees: Learning to Select and Weigh Reasoning Paths for Efficient LLM Inference
AAAI 2026
Griffin: Aerial-Ground Cooperative Detection and Tracking Dataset and Benchmark
AAAI 2026
SparseCoop: Cooperative Perception with Kinematic-Grounded Queries
AAAI 2026
USDRL: Unified Skeleton-Based Dense Representation Learning with Multi-Grained Feature Decorrelation
AAAI 2025
SensorFlow: Sensor and Image Fused Video Stabilization
WACV 2025
Drop the Beat! Freestyler for Accompaniment Conditioned Rapping Voice Generation
AAAI 2025
PodAgent: A Comprehensive Framework for Podcast Generation
ACL 2025
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
ICLR 2024
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
ICML 2024
Masked Residual Diffusion Probabilistic Model with Regional Asymmetry Prior for Generating Perfusion Maps from Multi-phase CTA
MICCAI 2024
Temporal Co-Registration of Simultaneous Electromagnetic Articulography and Electroencephalography for Precise Articulatory and Neural Data Alignment
INTERSPEECH 2024
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations
NIPS 2024
PromptTTS 2: Describing and Generating Voices with Text Prompt
ICLR 2024
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading
INTERSPEECH 2023
KEPL: Knowledge Enhanced Prompt Learning for Chinese Hypernym-Hyponym Extraction
EMNLP 2023
AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models
NIPS 2023
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing
AAAI 2023
Large-Scale Automatic Audiobook Creation
INTERSPEECH 2023
SoftSpeech: Unsupervised Duration Model in FastSpeech 2
INTERSPEECH 2022
Idiosyncratic lingual articulation of American English /æ/ and /ɑ/ using network analysis
INTERSPEECH 2022
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge
INTERSPEECH 2022
BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis
NIPS 2022
ConCL: Concept Contrastive Learning for Dense Prediction Pre-training in Pathology Images
ECCV 2022
TreeMoCo: Contrastive Neuron Morphology Representation Learning
NIPS 2022
Self-supervised Context-aware Style Representation for Expressive Speech Synthesis
INTERSPEECH 2022
AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios
INTERSPEECH 2022
DelightfulTTS 2: End-to-End Speech Synthesis with Adversarial Vector-Quantized Auto-Encoders
INTERSPEECH 2022
Exploring Forensic Dental Identification with Deep Learning
NIPS 2021
Oral-3D: Reconstructing the 3D Structure of Oral Cavity from Panoramic X-ray
AAAI 2021
KLMo: Knowledge Graph Enhanced Pretrained Language Model with Fine-Grained Relationships
EMNLP 2021
Improving RNN-T for Domain Scaling Using Semi-Supervised Training with Neural TTS
INTERSPEECH 2021
Cross-Speaker Style Transfer with Prosody Bottleneck in Neural Speech Synthesis
INTERSPEECH 2021
An Efficient Subband Linear Prediction for LPCNet-Based Neural Synthesis
INTERSPEECH 2020
Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability
INTERSPEECH 2020
Towards Universal Text-to-Speech
INTERSPEECH 2020
Rapid RNN-T Adaptation Using Personalized Speech Synthesis and Neural Language Generator
INTERSPEECH 2020
Atlas-aware ConvNet for Accurate yet Robust Anatomical Segmentation
ACML 2020
Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS
INTERSPEECH 2019
A New GAN-Based End-to-End TTS Training Algorithm
INTERSPEECH 2019
Forward-Backward Decoding for Regularizing End-to-End TTS
INTERSPEECH 2019
Exploiting Syntactic Features in a Parsed Tree to Improve End-to-End TTS
INTERSPEECH 2019
Influences of Fundamental Oscillation on Speaker Identification in Vocalic Utterances by Humans and Computers
INTERSPEECH 2018
A New Glottal Neural Vocoder for Speech Synthesis
INTERSPEECH 2018
Learning Distributed Word Representations For Bidirectional LSTM Recurrent Neural Network
NAACL 2016
Exploring Differential Topic Models for Comparative Summarization of Scientific Papers
COLING 2016
A Praat-Based Algorithm to Extract the Amplitude Envelope and Temporal Fine Structure Using the Hilbert Transform
INTERSPEECH 2016
Abstractive News Summarization based on Event Semantic Link Network
COLING 2016
Preconditioning for Accelerated Iteratively Reweighted Least Squares in Structured Sparsity Reconstruction
CVPR 2014