Srikar Appalaraju

15 papers · 2021–2025 · 8 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (10) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (8) 🌈 Renaissance Researcher (6) 🤝 Dynamic Duo (10) 🔥 Unstoppable (5) 💎 Century Club (15) 🗃️ Keyword Collector (71) ⚡ Prolific Year (7)

Conferences

NAACL (3) AAAI (2) ACL (2) CVPR (2) EMNLP (2) WACV (2) ECCV (1) ICCV (1)

Top co-authors

R. Manmatha (10) Peng Tang (8) Vijay Mahadevan (5) Kunwar Yashraj Singh (3) Yusheng Xie (3) Shabnam Ghadar (2) Ying Nian Wu (2) Nishant Sankaran (2) Stefano Soatto (2) Tianyang Zhao (2)

Keywords

multimodal learning (4) visual document understanding (3) knowledge distillation (3) encoder-decoder transformer (3) unsupervised pretraining (2) visual question answering (2) multi-modal transformer (2) representation learning (2) document understanding (2) multi-task learning (2) vision-language model (2) text recognition (2) information retrieval (1) object detection (1) question answering (1) cross-lingual transfer (1) zero-shot learning (1) spatial localization (1) information extraction (1) weak supervision (1)

Papers

On the Analysis and Distillation of Emergent Outlier Properties in Pre-trained Language Models · NAACL 2025
Turbocharging Web Automation: The Impact of Compressed History States · ACL 2025
R-VLM: Region-Aware Vision Language Model for Precise GUI Grounding · ACL 2025
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding · ECCV 2024
DocFormerv2: Local Features for Document Understanding · AAAI 2024
DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models · EMNLP 2024
Multiple-Question Multiple-Answer Text-VQA · NAACL 2024
DEED: Dynamic Early Exit on Decoder for Accelerating Encoder-Decoder Transformer Models · NAACL 2024
No Head Left Behind – Multi-Head Alignment Distillation for Transformers · AAAI 2024
Enhancing Vision-Language Pre-training with Rich Supervisions · CVPR 2024
A Multi-Modal Multilingual Benchmark for Document Image Classification · EMNLP 2023
LaTr: Layout-Aware Transformer for Scene-Text VQA · CVPR 2022
SeeTek: Very Large-Scale Open-Set Logo Recognition With Text-Aware Metric Learning · WACV 2022
DocFormer: End-to-End Transformer for Document Understanding · ICCV 2021
Saliency Driven Perceptual Image Compression · WACV 2021