Chen Duan
5 papers · 2024–2025 · 4 top CS/AI conferences
Achievements
Conference Polyglot (4)
Renaissance Researcher (5)
Interdisciplinary Bridge
Taxonomy Completionist (20)
Keyword Pioneer
Cross-Pollinator (15)
Conferences
CVPR (2)
AAAI (1)
ACL (1)
ICCV (1)
Keywords
document understanding (3)
multimodal large language model (2)
multi-modal large language model (2)
visual question answering (2)
vision language model (1)
vision-language model (1)
multimodal representation (1)
visual foundation model (1)
visual-language alignment (1)
mask generation (1)
token-level prediction (1)
visual language model (1)
optical character recognition (1)
visual-text alignment (1)
text attribute (1)
text recognition (1)
text-image alignment (1)
text spotting (1)
scene text spotting (1)
visual language alignment (1)
Papers
InstructOCR: Instruction Boosting Scene Text Spotting
AAAI 2025
Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review
ACL 2025
Marten: Visual Question Answering with Mask Generation for Multi-modal Document Understanding
CVPR 2025
A Token-level Text Image Foundation Model for Document Understanding
ICCV 2025
ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting
CVPR 2024