Radu Soricut

54 papers · 2003–2024 · 14 top CS/AI conferences

Achievements

🌈 Renaissance Researcher (7) 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (21) 🌍 Conference Polyglot (14) 🗺️ Taxonomy Completionist (71)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🌱 Topic Pioneer 🏆 Keyword Champion (17) 🔬 Deep Specialist (18) 🤝 Dynamic Duo (14) 👥 Mega-Team (55) 🚀 Conference Pioneer 💎 Century Club (54) 📈 Trend Setter 🗃️ Keyword Collector (178) 🔥 Unstoppable (8) ⚡ Prolific Year (5)

Conferences

EMNLP (11) ACL (10) NAACL (7) CVPR (6) COLING (4) ICLR (3) CONLL (2) ECCV (2) ICCV (2) IJCNLP (2) NIPS (2) AAAI (1) AACL (1) CORL (1)

Top co-authors

Soravit Changpinyo (14) Piyush Sharma (14) Nan Ding (12) Bo Pang (11) Xi Chen (11) Sebastian Goodman (8) Tomer Levinboim (8) Daniel Marcu (6) Zhenhai Zhu (5) Ashish V. Thapliyal (5)

Research topics

Reinforcement Learning (1)

Keywords

image captioning (17) multimodal learning (12) visual question answering (9) transfer learning (4) video understanding (4) transformer architecture (4) vision-language model (3) domain generalization (3) object detection (3) conceptual caption (3) sequence generation (3) instructional video (2) end-to-end learning (2) faster r-cnn (2) reinforcement learning (2) linear complexity (2) natural language generation (2) policy gradient (2) machine translation (2) vision language model (2)

Papers

On Scaling Up a Multilingual Vision and Language Model CVPR 2024
CausalLM is not optimal for in-context learning ICLR 2024
ImageInWords: Unlocking Hyper-Detailed Image Descriptions EMNLP 2024
Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts CVPR 2024
Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting CVPR 2023
PaLI: A Jointly-Scaled Multilingual Language-Image Model ICLR 2023
Connecting Vision and Language With Video Localized Narratives CVPR 2023
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control CORL 2023
PreSTU: Pre-Training for Scene-Text Understanding ICCV 2023
MaXM: Towards Multilingual Visual Question Answering EMNLP 2023
Improving Robust Generalization by Direct PAC-Bayesian Bound Minimization CVPR 2023
End-to-end Dense Video Captioning as Sequence Generation COLING 2022
Denoising Large-Scale Image Captioning from Alt-text Data Using Content Selection Models COLING 2022
Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset EMNLP 2022
PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks ECCV 2022
All You May Need for VQA are Image Captions NAACL 2022
Understanding Guided Image Captioning Performance across Domains EMNLP 2021
Quality Estimation for Image Captions Based on Large-scale Human Evaluations NAACL 2021
H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequences IJCNLP 2021
CrossVQA: Scalably Generating Benchmarks for Systematically Testing VQA Generalization EMNLP 2021
Telling the What While Pointing to the Where: Multimodal Queries for Image Retrieval ICCV 2021
Bridging the Gap Between Practice and PAC-Bayes Theory in Few-Shot Meta-Learning NIPS 2021
H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequences ACL 2021
Understanding Guided Image Captioning Performance across Domains CONLL 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts CVPR 2021
COSMic: A Coherence-Aware Generation Metric for Image Descriptions EMNLP 2021
Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance EMNLP 2020
Reinforcing an Image Caption Generator Using Off-Line Human Feedback AAAI 2020
Multimodal Pretraining for Dense Video Captioning AACL 2020
Cross-modal Language Generation using Pivot Stabilization for Web-scale Language Coverage ACL 2020
Cross-modal Coherence Modeling for Caption Generation ACL 2020
Connecting Vision and Language with Localized Narratives ECCV 2020
TeaForN: Teacher-Forcing with N-grams EMNLP 2020
Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube EMNLP 2020
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations ICLR 2020
Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering EMNLP 2019
Informative Image Captioning with External Sources of Information ACL 2019
A Case Study on Combining ASR and Visual Features for Generating Instructional Video Captions CONLL 2019
Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering IJCNLP 2019
Points, Paths, and Playscapes: Large-scale Spatial Language Understanding Tasks Set in the Real World NAACL 2018
SHAPED: Shared-Private Encoder-Decoder for Text Style Adaptation NAACL 2018
Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning ACL 2018
Cold-Start Reinforcement Learning with Softmax Policy Gradient NIPS 2017
Unsupervised Morphology Induction Using Word Embeddings NAACL 2015
TrustRank: Inducing Trust in Automatic Translations via Ranking ACL 2010
Automatic Prediction of Parser Accuracy EMNLP 2008
Discourse Generation Using Utility-Trained Coherence Models ACL 2006
Stochastic Language Generation Using WIDL-Expressions and its Application in Machine Translation and Summarization COLING 2006
Discourse Generation Using Utility-Trained Coherence Models COLING 2006
Stochastic Language Generation Using WIDL-Expressions and its Application in Machine Translation and Summarization ACL 2006
Towards Developing Generation Algorithms for Text-to-Text Applications ACL 2005
Automatic Question Answering: Beyond the Factoid NAACL 2004
A Unified Framework For Automatic Evaluation Using 4-Gram Co-occurrence Statistics ACL 2004
Sentence Level Discourse Parsing using Syntactic and Lexical Information NAACL 2003