Jae Sung Park
16 papers · 2017–2025 · 7 conferences · across top CS/AI conferences
Achievements
🐝 Cross-Pollinator (8)
🧭 Keyword Pioneer
🏃 Academic Marathon (8)
🌍 Conference Polyglot (7)
🌈 Renaissance Researcher (7)
🌉 Interdisciplinary Bridge
🐣 Hot Topic Early Bird
👥 Mega-Team (50)
🤝 Dynamic Duo (10)
🔥 Unstoppable (7)
💎 Century Club (16)
🗃️ Keyword Collector (76)
📈 Trend Setter
Conferences
NIPS (5)
CVPR (4)
ECCV (3)
EMNLP (1)
ICLR (1)
NAACL (1)
RSS (1)
Keywords
visual question answering (3)
visual commonsense (2)
vision-language model (2)
multimodal learning (2)
self-supervised learning (2)
natural language generation (2)
visual reasoning (2)
image retrieval (1)
action recognition (1)
reinforcement learning (1)
knowledge distillation (1)
temporal reasoning (1)
video captioning (1)
text generation (1)
adversarial learning (1)
video understanding (1)
multi-modal learning (1)
model robustness (1)
video retrieval (1)
image captioning (1)
Papers
CertainlyUncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness
ICLR 2025
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
CVPR 2025
Synthetic Visual Genome
CVPR 2025
Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass
NIPS 2024
ActionAtlas: A VideoQA Benchmark for Domain-specialized Action Recognition
NIPS 2024
Fusing Pre-Trained Language Models With Multimodal Prompts Through Reinforcement Learning
CVPR 2023
Localized Symbolic Knowledge Distillation for Visual Commonsense Models
NIPS 2023
Exposing the Limits of Video-Text Models through Contrast Sets
NAACL 2022
The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning
ECCV 2022
MERLOT: Multimodal Neural Script Knowledge Models
NIPS 2021
LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes
NIPS 2021
Identity-Aware Multi-Sentence Video Description
ECCV 2020
Natural Language Rationales with Full-Stack Visual Reasoning: From Pixels to Semantic Frames to Commonsense Graphs
EMNLP 2020
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
ECCV 2020
Adversarial Inference for Multi-Sentence Video Description
CVPR 2019
Intention-Aware Motion Planning Using Learning Based Human Motion Prediction
RSS 2017