Soravit Changpinyo

24 papers · 2013–2024 · 9 conferences · across top CS/AI conferences

Achievements

🐝 Cross-Pollinator (14) 🏃 Academic Marathon (11) 🧭 Keyword Pioneer 🌍 Conference Polyglot (9) 🌈 Renaissance Researcher (5)

🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (51) 👥 Mega-Team (43) 🧬 Topic Evolution 🤝 Dynamic Duo (14) 🔬 Deep Specialist (10) ⚡ Prolific Year (6) 🔥 Unstoppable (9) 💎 Century Club (24) 🗃️ Keyword Collector (105) 📈 Trend Setter ❓ The Questioner (2)

Conferences

CVPR (5) EMNLP (4) ICCV (4) NIPS (4) COLING (2) ECCV (2) ICLR (1) IJCNLP (1) NAACL (1)

Top co-authors

Radu Soricut (14) Hexiang Hu (6) Xi Chen (6) Piyush Sharma (5) Wei-Lun Chao (5) Nan Ding (4) Fei Sha (4) Boqing Gong (4) Vittorio Ferrari (3) Idan Szpektor (3)

Keywords

visual question answering (12) image captioning (8) multimodal learning (6) object detection (5) vision language (2) transfer learning (2) vision language model (2) long-tailed distribution (2) faster r-cnn (2) zero-shot learning (2) manifold learning (1) model calibration (1) domain generalization (1) text-to-image synthesis (1) metric learning (1) similarity learning (1) video understanding (1) image retrieval (1) few-shot learning (1) expectation maximization (1)

Papers

On Scaling Up a Multilingual Vision and Language Model CVPR 2024
MetaCLUE: Towards Comprehensive Visual Metaphors Research CVPR 2023
PreSTU: Pre-Training for Scene-Text Understanding ICCV 2023
What You See is What You Read? Improving Text-Image Alignment Evaluation NIPS 2023
Connecting Vision and Language With Video Localized Narratives CVPR 2023
PaLI: A Jointly-Scaled Multilingual Language-Image Model ICLR 2023
Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions? EMNLP 2023
MaXM: Towards Multilingual Visual Question Answering EMNLP 2023
All You May Need for VQA are Image Captions NAACL 2022
Denoising Large-Scale Image Captioning from Alt-text Data Using Content Selection Models COLING 2022
PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks ECCV 2022
Telling the What While Pointing to the Where: Multimodal Queries for Image Retrieval ICCV 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts CVPR 2021
On Model Calibration for Long-Tailed Object Detection and Instance Segmentation NIPS 2021
CrossVQA: Scalably Generating Benchmarks for Systematically Testing VQA Generalization EMNLP 2021
Robust Visual Reasoning via Language Guided Neural Module Networks NIPS 2021
MosaicOS: A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection ICCV 2021
Connecting Vision and Language with Localized Narratives ECCV 2020
Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering EMNLP 2019
Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering IJCNLP 2019
Multi-Task Learning for Sequence Tagging: An Empirical Study COLING 2018
Predicting Visual Exemplars of Unseen Classes for Zero-Shot Learning ICCV 2017
Synthesized Classifiers for Zero-Shot Learning CVPR 2016
Similarity Component Analysis NIPS 2013