R. Manmatha
21 papers · 2016–2025 · 8 conferences · across top CS/AI conferences
Achievements
Academic Marathon (9)
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (8)
Cross-Pollinator (10)
Renaissance Researcher (6)
Taxonomy Completionist (46)
Topic Evolution
Dynamic Duo (10)
Keyword Champion (2)
Trend Setter
Century Club (21)
Unstoppable (6)
Prolific Year (7)
Keyword Collector (96)
Conferences
CVPR (9)
ICCV (3)
AAAI (2)
ECCV (2)
NAACL (2)
ACL (1)
EMNLP (1)
WACV (1)
Keywords
visual document understanding (3)
encoder-decoder transformer (3)
text recognition (3)
multimodal learning (3)
knowledge distillation (2)
attention mechanism (2)
visual question answering (2)
semantic segmentation (2)
multi-modal transformer (2)
scene text recognition (2)
unsupervised pretraining (2)
multi-task learning (1)
object detection (1)
image segmentation (1)
image classification (1)
document understanding (1)
hierarchical classification (1)
named entity recognition (1)
image generation (1)
question answering (1)
Papers
R-VLM: Region-Aware Vision Language Model for Precise GUI Grounding
ACL 2025
Scaling up Image Segmentation across Data and Tasks
CVPR 2025
DocFormerv2: Local Features for Document Understanding
AAAI 2024
No Head Left Behind – Multi-Head Alignment Distillation for Transformers
AAAI 2024
On the Scalability of Diffusion-based Text-to-Image Generation
CVPR 2024
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding
ECCV 2024
DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models
EMNLP 2024
Multiple-Question Multiple-Answer Text-VQA
NAACL 2024
DEED: Dynamic Early Exit on Decoder for Accelerating Encoder-Decoder Transformer Models
NAACL 2024
DocTr: Document Transformer for Structured Information Extraction in Documents
ICCV 2023
PolyFormer: Referring Image Segmentation As Sequential Polygon Generation
CVPR 2023
Towards Weakly-Supervised Text Spotting Using a Multi-Task Transformer
CVPR 2022
GLASS: Global to Local Attention for Scene-Text Spotting
ECCV 2022
LaTr: Layout-Aware Transformer for Scene-Text VQA
CVPR 2022
Saliency Driven Perceptual Image Compression
WACV 2021
DocFormer: End-to-End Transformer for Document Understanding
ICCV 2021
Sequence-to-Sequence Contrastive Learning for Text Recognition
CVPR 2021
SCATTER: Selective Context Attentional Scene Text Recognizer
CVPR 2020
Compressed Video Action Recognition
CVPR 2018
Sampling Matters in Deep Embedding Learning
ICCV 2017
Deep Decision Network for Multi-Class Image Classification
CVPR 2016