Sai Rajeswar

17 papers · 2021–2026 · 9 conferences · across top CS/AI conferences

Achievements

🐝 Cross-Pollinator (11) 🌍 Conference Polyglot (7) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (5)

🌈 Renaissance Researcher (7) 🏆 Grand Slam 🧬 Topic Evolution 🤝 Dynamic Duo (10) 👥 Mega-Team (39) 🗃️ Keyword Collector (59) 📈 Trend Setter 💎 Century Club (15) 🔥 Unstoppable (5) ⚡ Prolific Year (8)

Conferences

ICLR (5) CVPR (2) EMNLP (2) ICML (2) NIPS (2) AAAI (1) ACL (1) CORL (1) EACL (1)

Top co-authors

David Vázquez (10) Christopher Pal (8) Juan A. Rodriguez (7) Perouz Taslakian (7) Spandana Gella (5) Aaron Courville (4) Abhay Puri (4) Issam H. Laradji (4) Nicolas Chapados (4) Marco Pedersoli (3)

Keywords

multimodal learning (3) code generation (3) large language model (2) vision-language model (2) image vectorization (2) multimodal large language model (2) multi-label learning (1) transfer learning (1) contrastive learning (1) language modeling (1) autoregressive generation (1) document understanding (1) language model evaluation (1) benchmark evaluation (1) document retrieval (1) question answering (1) policy learning (1) task generalization (1) model predictive control (1) cross-modal learning (1)

Papers

StarFlow: Generating Structured Workflow Outputs From Sketch Images · EACL 2026
Grammar Search for Multi-Agent Systems · ACL 2026
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks · ICLR 2025
StarVector: Generating Scalable Vector Graphics Code from Images and Text · AAAI 2025
InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation · ICLR 2025
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation · EMNLP 2025
ColMate: Contrastive Late Interaction and Masked Text for Multimodal Document Retrieval · EMNLP 2025
UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction · ICML 2025
StarVector: Generating Scalable Vector Graphics Code from Images and Text · CVPR 2025
VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded Text · ICLR 2025
RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content · NIPS 2024
GenRL: Multimodal-foundation world models for generalization in embodied agents · NIPS 2024
Efficient Dynamics Modeling in Interactive Environments with Koopman Theory · ICLR 2024
Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels · ICML 2023
Choreographer: Learning and Adapting Skills in Imagination · ICLR 2023
Multi-Label Iterated Learning for Image Classification With Label Ambiguity · CVPR 2022
Haptics-based Curiosity for Sparse-reward Tasks · CORL 2021