Seong-Whan Lee

40 papers · 2014–2026 · 11 top CS/AI conferences

Achievements

🌍 Conference Polyglot (11) 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🏃 Academic Marathon (11)

🐝 Cross-Pollinator (6) 🗺️ Taxonomy Completionist (79) 🤝 Dynamic Duo (11) 🏆 Grand Slam 🧬 Topic Evolution 🗃️ Keyword Collector (203) 📈 Trend Setter 💎 Century Club (39) ⚡ Prolific Year (6) 🔥 Unstoppable (6)

Conferences

AAAI (8) INTERSPEECH (8) CVPR (6) NIPS (5) WACV (4) ACL (2) EMNLP (2) ICCV (2) ICLR (1) ICML (1) IJCAI (1)

Top co-authors

Sang-Hoon Lee (11) Ho-Joong Kim (6) Gun-Hee Lee (6) Jung-Ho Hong (5) Seung-Bin Kim (4) Ha-Yeong Choi (4) Jun-Hee Kim (3) Ji-Hoon Kim (3) Jaesik Choi (3) Seo-Hyun Lee (3)

Research topics

Cognitive Science (1)

Keywords

self-supervised learning (5) zero-shot learning (4) generative adversarial network (4) speech synthesis (4) diffusion model (3) temporal action detection (3) voice conversion (3) feature attribution (3) video understanding (3) attribution method (3) deep neural network (3) text-to-speech synthesis (3) 3d human pose estimation (2) representation learning (2) generative model (2) brain-computer interface (2) semantic segmentation (2) 3d pose estimation (2) explainable ai (2) human pose estimation (2)

Papers

ImmersiveTTS: Environment-Aware Text-to-Speech with Multimodal Diffusion Transformer and Domain-Specific Representation Alignment ACL 2026
XLQA: A Benchmark for Locale-Aware Multilingual Open-Domain Question Answering EMNLP 2025
ProPose: Probabilistic 3D Human Pose Estimation with Instance-Level Distribution and Normalizing Flow AAAI 2025
MIRe: Enhancing Multimodal Queries Representation via Fusion-Free Modality Interaction for Multimodal Retrieval ACL 2025
Comprehensive Information Bottleneck for Unveiling Universal Attribution to Interpret Vision Transformers CVPR 2025
Towards Fine-Grained Interpretability: Counterfactual Explanations for Misclassification with Saliency Partition CVPR 2025
DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for Temporal Action Detection Transformer CVPR 2025
FillerSpeech: Towards Human-Like Text-to-Speech Synthesis with Filler Insertion and Filler Style Control EMNLP 2025
PoseAnchor: Robust Root Position Estimation for 3D Human Pose Estimation ICCV 2025
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation ICLR 2025
TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression CVPR 2024
EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech INTERSPEECH 2024
Text-Infused Attention and Foreground-Aware Modeling for Zero-Shot Temporal Action Detection NIPS 2024
DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion AAAI 2024
Unknown-Aware Graph Regularization for Robust Semi-supervised Learning from Uncurated Data AAAI 2024
Toward Approaches to Scalability in 3D Human Pose Estimation NIPS 2024
Towards Better Visualizing the Decision Basis of Networks via Unfold and Conquer Attribution Guidance AAAI 2023
Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation INTERSPEECH 2023
HierVST: Hierarchical Adaptive Zero-shot Voice Style Transfer INTERSPEECH 2023
Pruning-Guided Curriculum Learning for Semi-Supervised Semantic Segmentation WACV 2023
Action-Aware Masking Network With Group-Based Attention for Temporal Action Localization WACV 2023
Kinematic-Aware Hierarchical Attention Network for Human Pose Estimation in Videos WACV 2023
Diff-E: Diffusion-based Learning for Decoding Imagined Speech EEG INTERSPEECH 2023
Towards Voice Reconstruction from EEG during Imagined Speech AAAI 2023
Emergence of Hierarchical Layers in a Single Sheet of Self-Organizing Spiking Neurons NIPS 2022
Evidence of Onset and Sustained Neural Responses to Isolated Phonemes from Intracranial Recordings in a Voice-based Cursor Control Task INTERSPEECH 2022
HierSpeech: Bridging the Gap between Text and Speech by Hierarchical Variational Inference using Self-supervised Representations for Speech Synthesis NIPS 2022
Complete Face Recovery GAN: Unsupervised Joint Face Rotation and De-Occlusion From a Single-View Image WACV 2022
Reinforce-Aligner: Reinforcement Alignment Search for Robust End-to-End Text-to-Speech INTERSPEECH 2021
Interpreting Deep Neural Networks with Relative Sectional Propagation by Analyzing Comparative Gradients and Hostile Activations AAAI 2021
VoiceMixer: Adversarial Voice Style Mixup NIPS 2021
Fre-GAN: Adversarial Frequency-Consistent Audio Synthesis INTERSPEECH 2021
Uncertainty-Aware Human Mesh Recovery From Video by Learning Part-Based 3D Dynamics ICCV 2021
Multi-SpectroGAN: High-Diversity and High-Fidelity Spectrogram Generation with Adversarial Style Combination for Speech Synthesis AAAI 2021
Audio Dequantization for High Fidelity Audio Generation in Flow-Based Neural Vocoder INTERSPEECH 2020
Uncertainty-Aware Mesh Decoder for High Fidelity 3D Face Reconstruction CVPR 2020
Relative Attributing Propagation: Interpreting the Comparative Contributions of Individual Units in Deep Neural Networks AAAI 2020
Deep Reinforcement Learning in Continuous Action Spaces: a Case Study in the Game of Simulated Curling ICML 2018
Curly: An AI-based Curling Robot Successfully Competing in the Olympic Discipline of Curling IJCAI 2018
The Role of Context for Object Detection and Semantic Segmentation in the Wild CVPR 2014