Gunhee Kim

104 papers · 2009–2026 · 11 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (22) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (7) 🐣 Hot Topic Early Bird 🌟 Keyword Trendsetter Combo (4) 🏆 Keyword Champion 🧬 Topic Evolution 🏆 Grand Slam 👑 Triple Crown 🌱 Topic Pioneer 🔬 Deep Specialist (14) 🤝 Dynamic Duo (16) 🔥 Unstoppable (9) ❓ The Questioner (8) 🚀 Conference Pioneer 💎 Century Club (102) 📈 Trend Setter ⚡ Prolific Year (17) 🗃️ Keyword Collector (382)

Conferences

CVPR (18) ICLR (15) ACL (12) EMNLP (11) ICCV (11) NIPS (10) ECCV (8) ICML (8) NAACL (6) AAAI (4) AISTATS (1)

Top co-authors

Youngjae Yu (16) · Heeseung Yun (11) · Jaewoo Ahn (9) · Sangho Lee (8) · Hyunwoo Kim (8) · Byeongchang Kim (7) · Soochan Lee (7) · Jaekyeom Kim (6) · Jongseok Kim (5) · Yejin Choi (5)

Keywords

large language model (9) · multimodal learning (8) · image captioning (6) · visual question answering (5) · dialogue system (4) · self-supervised learning (4) · data augmentation (4) · question answering (3) · text generation (3) · information bottleneck (3) · reinforcement learning (3) · object detection (3) · unsupervised learning (3) · variational inference (3) · mutual information (3) · benchmark evaluation (3) · transfer learning (3) · natural language generation (2) · federated learning (2) · continual learning (2)

Papers

MAVIS: A Benchmark for Multimodal Source Attribution in Long-form Visual Question Answering · AAAI 2026
Gaussian Blending: Rethinking Alpha Blending in 3D Gaussian Splatting · AAAI 2026
FedMeNF: Privacy-Preserving Federated Meta-Learning for Neural Fields · ICCV 2025
HalLoc: Token-level Localization of Hallucinations for Vision Language Models · CVPR 2025
Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates · ACL 2025
LPOI: Listwise Preference Optimization for Vision Language Models · ACL 2025
When Should Dense Retrievers Be Updated in Evolving Corpora? Detecting Out-of-Distribution Corpora Using GradNormIR · ACL 2025
ReSpec: Relevance and Specificity Grounded Online Filtering for Learning on Video-Text Data Streams · CVPR 2025
FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games · EMNLP 2025
Think, Verbalize, then Speak: Bridging Complex Thoughts and Comprehensible Speech · EMNLP 2025
Meta-Continual Learning of Neural Fields · ICLR 2025
ViSAGe: Video-to-Spatial Audio Generation · ICLR 2025
Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning · ICLR 2025
How to Move Your Dragon: Text-to-Motion Synthesis for Large-Vocabulary Objects · ICML 2025
Is a Peeled Apple Still Red? Evaluating LLMs’ Ability for Conceptual Combination with Property Type · NAACL 2025
Behavior-SD: Behaviorally Aware Spoken Dialogue Generation with Large Language Models · NAACL 2025
ChartCap: Mitigating Hallucination of Dense Chart Captioning · ICCV 2025
Spherical World-Locking for Audio-Visual Localization in Egocentric Videos · ECCV 2024
FedAvP: Augment Local Data via Shared Policy in Federated Learning · NIPS 2024
Sample Selection via Contrastive Fragmentation for Noisy Label Regression · NIPS 2024
GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge? · ACL 2024
Who Wrote this Code? Watermarking for Code Generation · ACL 2024
TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models · ACL 2024
See It All: Contextualized Late Aggregation for 3D Dense Captioning · ACL 2024
Learning to Continually Learn with the Bayesian Principle · ICML 2024
Compositional Conservatism: A Transductive Approach in Offline Reinforcement Learning · ICLR 2024
ESR-NeRF: Emissive Source Reconstruction Using LDR Multi-view Images · CVPR 2024
Bi-directional Contextual Attention for 3D Dense Captioning · ECCV 2024
Text2Chart31: Instruction Tuning for Chart Generation with Automatic Feedback · EMNLP 2024
DynamicER: Resolving Emerging Mentions to Dynamic Entities for RAG · EMNLP 2024
MPCHAT: Towards Multimodal Persona-Grounded Conversation · ACL 2023
Benchmark of Machine Learning Force Fields for Semiconductor Simulations: Datasets, Metrics, and Comparative Analysis · NIPS 2023
Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context Reasoning with Language Models · ACL 2023
KoSBI: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Applications · ACL 2023
Fusing Pre-Trained Language Models With Multimodal Prompts Through Reinforcement Learning · CVPR 2023
SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created through Human-Machine Collaboration · ACL 2023
Federated Learning via Meta-Variational Dropout · NIPS 2023
Recasting Continual Learning as Sequence Modeling · NIPS 2023
Can Language Models Laugh at YouTube Short-form Videos? · EMNLP 2023
mRedditSum: A Multimodal Abstractive Summarization Dataset of Reddit Threads with Images · EMNLP 2023
SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization · EMNLP 2023
FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions · EMNLP 2023
Dense 2D-3D Indoor Prediction with Sound via Aligned Cross-Modal Distillation · ICCV 2023
EP2P-Loc: End-to-End 3D Point to 2D Pixel Localization for Large-Scale Visual Localization · ICCV 2023
Neural Variational Dropout Processes · ICLR 2022
ProsocialDialog: A Prosocial Backbone for Conversational Agents · EMNLP 2022
Lipschitz-constrained Unsupervised Skill Discovery · ICLR 2022
Constrained GPI for Zero-Shot Transfer in Reinforcement Learning · NIPS 2022
Panoramic Vision Transformer for Saliency Detection in 360° Videos · ECCV 2022
On Convergence of Lookahead in Smooth Games · AISTATS 2022
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning · ICCV 2021
Unsupervised Skill Discovery with Bottleneck Option Learning · ICML 2021
IB-GAN: Disentangled Representation Learning with Information Bottleneck Generative Adversarial Networks · AAAI 2021
Dual Compositional Learning in Interactive Image Retrieval · AAAI 2021
SEDONA: Search for Decoupled Neural Networks toward Greedy Block-wise Learning · ICLR 2021
Transitional Adaptation of Pretrained Models for Visual Storytelling · CVPR 2021
StyleMix: Separating Content and Style for Enhanced Data Augmentation · CVPR 2021
Self-Supervised Learning of Compressed Video Representations · ICLR 2021
Drop-Bottleneck: Learning Discrete Compressed Representation for Noise-Robust Exploration · ICLR 2021
Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods · NIPS 2021
Parameter Efficient Multimodal Transformers for Video Representation Learning · ICLR 2021
Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos · ICCV 2021
Perspective-taking and Pragmatics for Generating Empathetic Responses Focused on Emotion Causes · EMNLP 2021
Continual Learning on Noisy Data Streams via Self-Purified Replay · ICCV 2021
Viewpoint-Agnostic Change Captioning With Cycle Consistency · ICCV 2021
How Robust are Fact Checking Systems on Colloquial Claims? · NAACL 2021
Unsupervised Representation Learning via Neural Activation Coding · ICML 2021
A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning · ICLR 2020
Augmenting Data for Sarcasm Detection with Unlabeled Conversation Context · ACL 2020
Model-Agnostic Boundary-Adversarial Sampling for Test-Time Generalization in Few-Shot learning · ECCV 2020
Character Grounding and Re-Identification in Story of Videos and Text Descriptions · ECCV 2020
Imbalanced Continual Learning with Partitioning Reservoir Sampling · ECCV 2020
Rethinking Class Activation Mapping for Weakly Supervised Object Localization · ECCV 2020
Will I Sound Like Me? Improving Persona Consistency in Dialogues through Pragmatic Self-Consciousness · EMNLP 2020
Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue · ICLR 2020
Harmonizing Maximum Likelihood with GANs for Multimodal Conditional Generation · ICLR 2019
Better to Follow, Follow to Be Better: Towards Precise Supervision of Feature Super-Resolution for Small Object Detection · ICCV 2019
Multi-Task Self-Supervised Object Detection via Recycling of Bounding Box Annotations · CVPR 2019
Curiosity-Bottleneck: Exploration By Distilling Task-Specific Novelty · ICML 2019
Variational Laplace Autoencoders · ICML 2019
AudioCaps: Generating Captions for Audios in The Wild · NAACL 2019
Abstractive Summarization of Reddit Posts with Multi-level Memory Networks · NAACL 2019
Self-Routing Capsule Networks · NIPS 2019
Discovery of Natural Language Concepts in Individual Units of CNNs · ICLR 2019
Improving Occlusion and Hard Negative Handling for Single-Stage Pedestrian Detectors · CVPR 2018
A Hierarchical Latent Structure for Variational Conversation Modeling · NAACL 2018
Memorization Precedes Generation: Learning Unsupervised GANs with Memory Networks · ICLR 2018
Video Prediction with Appearance and Motion Conditions · ICML 2018
A Joint Sequence Fusion Model for Video Question Answering and Retrieval · ECCV 2018
A Memory Network Approach for Story-Based Temporal Summarization of 360° Videos · CVPR 2018
End-To-End Concept Word Detection for Video Captioning, Retrieval, and Question Answering · CVPR 2017
A Read-Write Memory Network for Movie Story Understanding · ICCV 2017
Attend to You: Personalized Image Captioning With Context Sequence Memory Networks · CVPR 2017
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering · CVPR 2017
SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization · ICML 2017
Supervising Neural Attention Models for Video Captioning by Human Gaze Data · CVPR 2017
Joint Photo Stream and Blog Post Summarization and Exploration · CVPR 2015
Ranking and Retrieval of Image Sequences From Multiple Paragraph Queries · CVPR 2015
Expressing an Image Stream with a Sequence of Natural Sentences · NIPS 2015
Storyline Representation of Egocentric Videos With an Applications to Story-Based Search · ICCV 2015
Joint Summarization of Large-scale Collections of Web Images and Videos for Storyline Reconstruction · CVPR 2014
Reconstructing Storyline Graphs for Image Recommendation from Web Community Photos · CVPR 2014
Jointly Aligning and Segmenting Multiple Web Photo Streams for the Inference of Collective Photo Storylines · CVPR 2013
Unsupervised Detection of Regions of Interest Using Iterative Link Analysis · NIPS 2009