Jae Sung Park

16 papers · 2017–2025 · 7 top CS/AI conferences

Achievements


🐝 Cross-Pollinator (8) 🧭 Keyword Pioneer 🏃 Academic Marathon (8) 🌍 Conference Polyglot (7) 🌈 Renaissance Researcher (7)

🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird 👥 Mega-Team (50) 🤝 Dynamic Duo (10) 🔥 Unstoppable (7) 💎 Century Club (16) 🗃️ Keyword Collector (76) 📈 Trend Setter

Conferences

NIPS (5) CVPR (4) ECCV (3) EMNLP (1) ICLR (1) NAACL (1) RSS (1)

Top co-authors

Yejin Choi (10) Ali Farhadi (9) Ximing Lu (5) Jack Hessel (5) Ranjay Krishna (4) Anna Rohrbach (4) Khyathi Chandu (3) Aditya Kusupati (3) Chandra Bhagavatula (3) Trevor Darrell (3)

Keywords

visual question answering (3) visual commonsense (2) vision-language model (2) multimodal learning (2) self-supervised learning (2) natural language generation (2) visual reasoning (2) image retrieval (1) action recognition (1) reinforcement learning (1) knowledge distillation (1) temporal reasoning (1) video captioning (1) text generation (1) adversarial learning (1) video understanding (1) multi-modal learning (1) model robustness (1) video retrieval (1) image captioning (1)

Papers

CertainlyUncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness · ICLR 2025
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models · CVPR 2025
Synthetic Visual Genome · CVPR 2025
Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass · NIPS 2024
ActionAtlas: A VideoQA Benchmark for Domain-specialized Action Recognition · NIPS 2024
Fusing Pre-Trained Language Models With Multimodal Prompts Through Reinforcement Learning · CVPR 2023
Localized Symbolic Knowledge Distillation for Visual Commonsense Models · NIPS 2023
Exposing the Limits of Video-Text Models through Contrast Sets · NAACL 2022
The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning · ECCV 2022
MERLOT: Multimodal Neural Script Knowledge Models · NIPS 2021
LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes · NIPS 2021
Identity-Aware Multi-Sentence Video Description · ECCV 2020
Natural Language Rationales with Full-Stack Visual Reasoning: From Pixels to Semantic Frames to Commonsense Graphs · EMNLP 2020
VisualCOMET: Reasoning about the Dynamic Context of a Still Image · ECCV 2020
Adversarial Inference for Multi-Sentence Video Description · CVPR 2019
Intention-Aware Motion Planning Using Learning Based Human Motion Prediction · RSS 2017