Zirui Shao
8 papers · 2023–2025 · 3 conferences · across top CS/AI conferences
Achievements
Cross-Pollinator (15)
Taxonomy Completionist (22)
Keyword Pioneer
Conference Polyglot (3)
Renaissance Researcher (5)
Interdisciplinary Bridge
Prolific Year (5)
The Questioner
Conferences
EMNLP (5)
CVPR (2)
ECCV (1)
Keywords
document understanding (3)
visual information (2)
large language model (2)
multimodal learning (2)
multimodal large language model (2)
information extraction (1)
question answering (1)
document parsing (1)
instruction tuning (1)
spatial structure (1)
hierarchical structure (1)
low-resource language (1)
language model (1)
multi-modal large language model (1)
positional encoding (1)
cross-modality learning (1)
gui automation (1)
graphical user interface (1)
knowledge conflict (1)
fusion gate (1)
Papers
MP-GUI: Modality Perception with MLLMs for GUI Understanding
CVPR 2025
A Simple yet Effective Layout Token in Large Language Models for Document Understanding
CVPR 2025
BrailleLLM: Braille Instruction Tuning with Large Language Models for Braille Domain Tasks
EMNLP 2025
Is Cognition Consistent with Perception? Assessing and Mitigating Multimodal Knowledge Conflicts in Document Understanding
EMNLP 2025
Intelligent Document Parsing: Towards End-to-end Document Parsing via Decoupled Content Parsing and Layout Grounding
EMNLP 2025
WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation
ECCV 2024
DocHieNet: A Large and Diverse Dataset for Document Hierarchy Parsing
EMNLP 2024
GEM: Gestalt Enhanced Markup Language Model for Web Understanding via Render Tree
EMNLP 2023