Seong-Whan Lee
40 papers · 2014–2026 · 11 top CS/AI conferences
Achievements
Conference Polyglot (11)
Renaissance Researcher (5)
Interdisciplinary Bridge
Keyword Pioneer
Academic Marathon (11)
Cross-Pollinator (6)
Taxonomy Completionist (79)
Dynamic Duo (11)
Grand Slam
Topic Evolution
Keyword Collector (203)
Trend Setter
Century Club (39)
Prolific Year (6)
Unstoppable (6)
Conferences
AAAI (8)
INTERSPEECH (8)
CVPR (6)
NIPS (5)
WACV (4)
ACL (2)
EMNLP (2)
ICCV (2)
ICLR (1)
ICML (1)
IJCAI (1)
Keywords
self-supervised learning (5)
zero-shot learning (4)
generative adversarial network (4)
speech synthesis (4)
diffusion model (3)
temporal action detection (3)
voice conversion (3)
feature attribution (3)
video understanding (3)
attribution method (3)
deep neural network (3)
text-to-speech synthesis (3)
3d human pose estimation (2)
representation learning (2)
generative model (2)
brain-computer interface (2)
semantic segmentation (2)
3d pose estimation (2)
explainable ai (2)
human pose estimation (2)
Papers
ImmersiveTTS: Environment-Aware Text-to-Speech with Multimodal Diffusion Transformer and Domain-Specific Representation Alignment
ACL 2026
XLQA: A Benchmark for Locale-Aware Multilingual Open-Domain Question Answering
EMNLP 2025
ProPose: Probabilistic 3D Human Pose Estimation with Instance-Level Distribution and Normalizing Flow
AAAI 2025
MIRe: Enhancing Multimodal Queries Representation via Fusion-Free Modality Interaction for Multimodal Retrieval
ACL 2025
Comprehensive Information Bottleneck for Unveiling Universal Attribution to Interpret Vision Transformers
CVPR 2025
Towards Fine-Grained Interpretability: Counterfactual Explanations for Misclassification with Saliency Partition
CVPR 2025
DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for Temporal Action Detection Transformer
CVPR 2025
FillerSpeech: Towards Human-Like Text-to-Speech Synthesis with Filler Insertion and Filler Style Control
EMNLP 2025
PoseAnchor: Robust Root Position Estimation for 3D Human Pose Estimation
ICCV 2025
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation
ICLR 2025
TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression
CVPR 2024
EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech
INTERSPEECH 2024
Text-Infused Attention and Foreground-Aware Modeling for Zero-Shot Temporal Action Detection
NIPS 2024
DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion
AAAI 2024
Unknown-Aware Graph Regularization for Robust Semi-supervised Learning from Uncurated Data
AAAI 2024
Toward Approaches to Scalability in 3D Human Pose Estimation
NIPS 2024
Towards Better Visualizing the Decision Basis of Networks via Unfold and Conquer Attribution Guidance
AAAI 2023
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation
INTERSPEECH 2023
HierVST: Hierarchical Adaptive Zero-shot Voice Style Transfer
INTERSPEECH 2023
Pruning-Guided Curriculum Learning for Semi-Supervised Semantic Segmentation
WACV 2023
Action-Aware Masking Network With Group-Based Attention for Temporal Action Localization
WACV 2023
Kinematic-Aware Hierarchical Attention Network for Human Pose Estimation in Videos
WACV 2023
Diff-E: Diffusion-based Learning for Decoding Imagined Speech EEG
INTERSPEECH 2023
Towards Voice Reconstruction from EEG during Imagined Speech
AAAI 2023
Emergence of Hierarchical Layers in a Single Sheet of Self-Organizing Spiking Neurons
NIPS 2022
Evidence of Onset and Sustained Neural Responses to Isolated Phonemes from Intracranial Recordings in a Voice-based Cursor Control Task
INTERSPEECH 2022
HierSpeech: Bridging the Gap between Text and Speech by Hierarchical Variational Inference using Self-supervised Representations for Speech Synthesis
NIPS 2022
Complete Face Recovery GAN: Unsupervised Joint Face Rotation and De-Occlusion From a Single-View Image
WACV 2022
Reinforce-Aligner: Reinforcement Alignment Search for Robust End-to-End Text-to-Speech
INTERSPEECH 2021
Interpreting Deep Neural Networks with Relative Sectional Propagation by Analyzing Comparative Gradients and Hostile Activations
AAAI 2021
VoiceMixer: Adversarial Voice Style Mixup
NIPS 2021
Fre-GAN: Adversarial Frequency-Consistent Audio Synthesis
INTERSPEECH 2021
Uncertainty-Aware Human Mesh Recovery From Video by Learning Part-Based 3D Dynamics
ICCV 2021
Multi-SpectroGAN: High-Diversity and High-Fidelity Spectrogram Generation with Adversarial Style Combination for Speech Synthesis
AAAI 2021
Audio Dequantization for High Fidelity Audio Generation in Flow-Based Neural Vocoder
INTERSPEECH 2020
Uncertainty-Aware Mesh Decoder for High Fidelity 3D Face Reconstruction
CVPR 2020
Relative Attributing Propagation: Interpreting the Comparative Contributions of Individual Units in Deep Neural Networks
AAAI 2020
Deep Reinforcement Learning in Continuous Action Spaces: a Case Study in the Game of Simulated Curling
ICML 2018
Curly: An AI-based Curling Robot Successfully Competing in the Olympic Discipline of Curling
IJCAI 2018
The Role of Context for Object Detection and Semantic Segmentation in the Wild
CVPR 2014