Tae-Hyun Oh

57 papers · 2013–2026 · 13 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (14) 🌍 Conference Polyglot (12) 🧭 Keyword Pioneer 🏃 Academic Marathon (13) 🌈 Renaissance Researcher (8)

🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (82) 🔬 Deep Specialist (12) 🤝 Dynamic Duo (13) 🏆 Grand Slam 🏆 Keyword Champion (2) 🧬 Topic Evolution 📈 Trend Setter 🚀 Conference Pioneer ⚡ Prolific Year (8) 🔥 Unstoppable (12) 🗃️ Keyword Collector (230) 💎 Century Club (56)

Conferences

CVPR (15) ICCV (12) WACV (7) ECCV (6) ICLR (5) AAAI (3) INTERSPEECH (3) ACL (1) EMNLP (1) ICML (1) IJCNLP (1) NAACL (1) NIPS (1)

Top co-authors

In So Kweon (13) Kim Sung-Bin (10) Nam Hyeon-Woo (6) Moon Ye-Bin (6) Oh Hyun-Bin (5) Junsik Kim (5) Kim Youwang (5) Arda Senocak (5) Suekyeong Nam (4) Kim Jun-Seong (4)

Keywords

audio-visual learning (4) video understanding (4) multimodal learning (3) semantic segmentation (3) zero-shot learning (3) image captioning (3) lip synchronization (3) attention mechanism (3) visual grounding (2) cross-modal learning (2) neural rendering (2) semi-supervised learning (2) self-supervised learning (2) action recognition (2) representation learning (2) image generation (2) variational inference (2) image retrieval (2) sound source localization (2) depth estimation (2)

Papers

Patch-wise Retrieval: A Bag of Practical Techniques for Instance-level Matching · WACV 2026
mEOL: Training-Free Instruction-Guided Multimodal Embedder for Vector Graphics and Image Retrieval · WACV 2026
Beyond the Highlights: Video Retrieval with Salient and Surrounding Contexts · WACV 2026
SMILE-Next: Teaching Large Language Models to Detect, Classify, and Reason about Laughter · ACL 2026
AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models · ICLR 2025
Zero-shot Depth Completion via Test-time Alignment with Affine-invariant Depth Prior · AAAI 2025
JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers · ICCV 2025
SoundBrush: Sound as a Brush for Visual Scene Editing · AAAI 2025
VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models · ICCV 2025
DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding · ICCV 2025
VSC: Visual Search Compositional Text-to-Image Diffusion Model · ICCV 2025
Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics · CVPR 2025
Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration · CVPR 2025
Robust 3D Shape Reconstruction in Zero-Shot from a Single Image in the Wild · CVPR 2025
BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models · ECCV 2024
Noise Map Guidance: Inversion with Spatial Context for Real Image Editing · ICLR 2024
CAS: A Probability-Based Approach for Universal Condition Alignment Score · ICLR 2024
LaughTalk: Expressive 3D Talking Head Generation With Laughter · WACV 2024
SMILE: Multimodal Dataset for Understanding Laughter in Video with Language Models · NAACL 2024
Enhancing Speech-Driven 3D Facial Animation with Audio-Visual Guidance from Lip Reading Expert · INTERSPEECH 2024
MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset · INTERSPEECH 2024
Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering · CVPR 2024
FPRF: Feed-Forward Photorealistic Style Transfer of Large-Scale 3D Neural Radiance Fields · AAAI 2024
Learning-based Axial Video Motion Magnification · ECCV 2024
Sound Source Localization is All about Cross-Modal Alignment · ICCV 2023
TextManiA: Enriching Visual Feature by Text-driven Manifold Augmentation · ICCV 2023
Learning Few-Shot Segmentation From Bounding Box Annotations · WACV 2023
Event-Specific Audio-Visual Fusion Layers: A Simple and New Perspective on Video Understanding · WACV 2023
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment · CVPR 2023
Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis · INTERSPEECH 2023
DFlow: Learning to Synthesize Better Optical Flow Datasets via a Differentiable Pipeline · ICLR 2023
Scratching Visual Transformer's Back with Uniform Attention · ICCV 2023
CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes · ECCV 2022
Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers · ECCV 2022
HDR-Plenoxels: Self-Calibrating High Dynamic Range Radiance Fields · ECCV 2022
FedPara: Low-rank Hadamard Product for Communication-Efficient Federated Learning · ICLR 2022
CDS: Cross-Domain Self-Supervised Pre-Training · ICCV 2021
Monocular Reconstruction of Neural Face Reflectance Fields · CVPR 2021
Supervoxel Attention Graphs for Long-Range Video Modeling · WACV 2021
Distilling Global and Local Logits With Densely Connected Relations · ICCV 2021
Listen to Look: Action Recognition by Previewing Audio · CVPR 2020
Image Captioning with Very Scarce Supervised Data: Adversarial Semi-Supervised Learning Approach · EMNLP 2019
Neural Inverse Knitting: From Images to Manufacturing Instructions · ICML 2019
Image Captioning with Very Scarce Supervised Data: Adversarial Semi-Supervised Learning Approach · IJCNLP 2019
Speech2Face: Learning the Face Behind a Voice · CVPR 2019
Variational Prototyping-Encoder: One-Shot Learning With Prototypical Images · CVPR 2019
Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning · CVPR 2019
Learning to Localize Sound Source in Visual Scenes · CVPR 2018
Learning-based Video Motion Magnification · ECCV 2018
Globally Optimal Inlier Set Maximization for Atlanta Frame Estimation · CVPR 2018
Weakly- and Self-Supervised Learning for Content-Aware Deep Image Retargeting · ICCV 2017
Personalized Cinemagraphs Using Semantic Understanding and Collaborative Learning · ICCV 2017
Video-Story Composition via Plot Analysis · CVPR 2016
A Pseudo-Bayesian Algorithm for Robust PCA · NIPS 2016
Globally Optimal Manhattan Frame Estimation in Real-Time · CVPR 2016
Fast Randomized Singular Value Thresholding for Nuclear Norm Minimization · CVPR 2015
Partial Sum Minimization of Singular Values in RPCA for Low-Level Vision · ICCV 2013