Minsu Kim
52 papers · 2020–2025 · 14 top CS/AI conferences
Achievements
Keyword Pioneer
Conference Polyglot (14)
Interdisciplinary Bridge
Renaissance Researcher (5)
Academic Marathon (5)
Cross-Pollinator (4)
Grand Slam
Triple Crown
Dynamic Duo (16)
Topic Evolution
Unstoppable (6)
Trend Setter
The Questioner
Prolific Year (18)
Keyword Collector (210)
Century Club (52)
Conferences
NIPS (9)
AAAI (6)
CVPR (6)
ICLR (6)
ICML (6)
ICCV (5)
ECCV (3)
WACV (3)
ACL (2)
INTERSPEECH (2)
AISTATS (1)
EMNLP (1)
IJCAI (1)
NAACL (1)
Keywords
lip reading (6)
large language model (4)
visual speech recognition (3)
audio-visual speech recognition (3)
transformer architecture (3)
combinatorial optimization (3)
diffusion model (3)
memory network (3)
multimodal learning (3)
generative flow network (2)
convolutional neural network (2)
deep reinforcement learning (2)
traveling salesman problem (2)
language model (2)
audio-visual speech (2)
visual context (2)
generative model (2)
reinforcement learning (2)
domain adaptation (2)
variational inference (2)
Papers
Improved Off-policy Reinforcement Learning in Biological Sequence Design
ICML 2025
Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations
ICCV 2025
ExploreGS: Explorable 3D Scene Reconstruction with Virtual Camera Samplings and Diffusion Priors
ICCV 2025
Generative Flows on Synthetic Pathway for Drug Design
ICLR 2025
Mixture of Submodules for Domain Adaptive Person Search
CVPR 2025
T-CIL: Temperature Scaling using Adversarial Perturbation for Calibration in Class-Incremental Learning
CVPR 2025
From Evidence to Belief: A Bayesian Epistemology Approach to Language Models
NAACL 2025
Adaptive teachers for amortized samplers
ICLR 2025
Outsourced Diffusion Sampling: Efficient Posterior Inference in Latent Spaces of Generative Models
ICML 2025
Ant Colony Sampling with GFlowNets for Combinatorial Optimization
AISTATS 2025
Learning Diverse Attacks on Large Language Models for Robust Red-Teaming and Safety Tuning
ICLR 2025
MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks
ICLR 2025
Implicit Neural Image Stitching With Enhanced and Blended Feature Reconstruction
WACV 2024
Genetic-guided GFlowNets for Sample Efficient Molecular Optimization
NIPS 2024
AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation
CVPR 2024
Symmetric Replay Training: Enhancing Sample Efficiency in Deep Reinforcement Learning for Combinatorial Optimization
ICML 2024
Hierarchically Structured Neural Bones for Reconstructing Animatable Objects from Casual Videos
ECCV 2024
Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing
EMNLP 2024
Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation
ACL 2024
Learning Energy Decompositions for Partial Inference in GFlowNets
ICLR 2024
Local Search GFlowNets
ICLR 2024
Improved off-policy training of diffusion samplers
NIPS 2024
SpaFL: Communication-Efficient Federated Learning With Sparse Models And Low Computational Overhead
NIPS 2024
Pessimistic Backward Policy for GFlowNets
NIPS 2024
Learning Residual Elastic Warps for Image Stitching Under Dirichlet Boundary Condition
WACV 2024
Equity-Transformer: Solving NP-Hard Min-Max Routing Problems as Sequential Generation with Equity Context
AAAI 2024
Quilt: Robust Data Segment Selection against Concept Drifts
AAAI 2024
Amortizing intractable inference in diffusion models for vision, language, and control
NIPS 2024
Epistemology of Language Models: Do Language Models Have Holistic Knowledge?
ACL 2024
Learning to Scale Logits for Temperature-Conditional GFlowNets
ICML 2024
DevFormer: A Symmetric Transformer for Context-Aware Device Placement
ICML 2023
Bootstrapped Training of Score-Conditioned Generator for Offline Design of Biological Sequences
NIPS 2023
Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video
AAAI 2023
Watch or Listen: Robust Audio-Visual Speech Recognition With Visual Corruption Modeling and Reliability Scoring
CVPR 2023
PartMix: Regularization Strategy To Learn Part Discovery for Visible-Infrared Person Re-Identification
CVPR 2023
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
ICCV 2023
Meta-SAGE: Scale Meta-Learning Scheduled Adaptation with Guided Exploration for Mitigating Scale Shift on Combinatorial Optimization
ICML 2023
Intelligible Lip-to-Speech Synthesis with Speech Units
INTERSPEECH 2023
Normality Guided Multiple Instance Learning for Weakly Supervised Video Anomaly Detection
WACV 2023
Speaker-Adaptive Lip Reading with User-Dependent Padding
ECCV 2022
VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection
ECCV 2022
SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory
AAAI 2022
Distinguishing Homophenes Using Multi-Head Visual-Audio Memory for Lip Reading
AAAI 2022
CERT: Continual Pre-training on Sketches for Library-oriented Code Generation
IJCAI 2022
Visual Context-driven Audio Feature Enhancement for Robust End-to-End Audio-Visual Speech Recognition
INTERSPEECH 2022
Sym-NCO: Leveraging Symmetricity for Neural Combinatorial Optimization
NIPS 2022
Learning Collaborative Policies to Solve NP-hard Routing Problems
NIPS 2021
Multi-Modality Associative Bridging Through Memory: Speech Sound Recollected From Face Video
ICCV 2021
Lip to Speech Synthesis with Visual Context Attentional GAN
NIPS 2021
Cross-Domain Grouping and Alignment for Domain Adaptive Semantic Segmentation
AAAI 2021
Learning Canonical 3D Object Representation for Fine-Grained Recognition
ICCV 2021
Cylindrical Convolutional Networks for Joint Object Detection and Viewpoint Estimation
CVPR 2020