Cha Zhang

15 papers · 2007–2025 · 8 top CS/AI conferences

Achievements

🌍 Conference Polyglot (8) 🏃 Academic Marathon (18) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (12)

🌈 Renaissance Researcher (6) 🌟 Keyword Trendsetter Combo (4) 🌱 Topic Pioneer 🧬 Topic Evolution 🏆 Keyword Champion 📈 Trend Setter 🚀 Conference Pioneer 🗃️ Keyword Collector (87) 🔥 Unstoppable (5) 💎 Century Club (15)

Conferences

CVPR (6) ACL (3) AAAI (1) AACL (1) EMNLP (1) ICML (1) IJCNLP (1) NIPS (1)

Top co-authors

Dinei Florencio (9) Yijuan Lu (7) Tengchao Lv (6) Lei Cui (6) Furu Wei (5) Guoxin Wang (5) Yiheng Xu (3) Jingye Chen (2) Min Zhang (2) Zhengyuan Yang (2)

Keywords

document understanding (5) multimodal learning (4) transformer architecture (3) model compression (2) convolutional neural network (2) document analysis (2) visual-language modeling (2) optical character recognition (2) neural network optimization (2) filter pruning (2) image restoration (1) multilingual nlp (1) self-supervised learning (1) image captioning (1) domain adaptation (1) video enhancement (1) visual question answering (1) transfer learning (1) network pruning (1) multi-modal learning (1)

Papers

ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding · ICML 2025
Unifying Vision, Text, and Layout for Universal Document Processing · CVPR 2023
From Characters to Words: Hierarchical Pre-trained Language Model for Open-vocabulary Language Understanding · ACL 2023
TrOCR: Transformer-Based Optical Character Recognition with Pre-trained Models · AAAI 2023
XDoc: Unified Pre-training for Cross-Format Document Understanding · EMNLP 2022
XFUND: A Benchmark Dataset for Multilingual Visually Rich Form Understanding · ACL 2022
A Simple yet Effective Learnable Positional Encoding Method for Improving Document Transformer Model · AACL 2022
LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding · ACL 2021
TAP: Text-Aware Pre-Training for Text-VQA and Text-Caption · CVPR 2021
LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding · IJCNLP 2021
Towards Efficient Model Compression via Learned Global Ranking · CVPR 2020
RePr: Improved Training of Convolutional Filters · CVPR 2019
Video Enhancement of People Wearing Polarized Glasses: Darkening Reversal and Reflection Reduction · CVPR 2013
Wide-Baseline Hair Capture Using Strand-Based Refinement · CVPR 2013
Multiple-Instance Pruning For Learning Efficient Cascade Detectors · NIPS 2007