Jianshu Zhang

20 papers · 2016–2026 · 8 top CS/AI conferences

Achievements

🏃 Academic Marathon (9) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (8) 🐝 Cross-Pollinator (10)

🌈 Renaissance Researcher (7) 🗺️ Taxonomy Completionist (48) 🏆 Grand Slam 🧬 Topic Evolution 🏆 Keyword Champion (2) ⚡ Prolific Year (5) 💎 Century Club (17) 🗃️ Keyword Collector (92) 🚀 Conference Pioneer

Conferences

ACL (5) EMNLP (4) AAAI (3) ICCV (2) ICML (2) NIPS (2) ICLR (1) INTERSPEECH (1)

Top co-authors

Jun Du (6) Jipeng Zhang (5) Renjie Pi (5) Jiefeng Ma (4) Pengfei Hu (4) Rui Pan (4) Tong Zhang (4) Zhenrong Zhang (4) Haitao Mi (2) YU HU (2)

Keywords

vision-language model (4) hierarchical structure (3) document analysis (3) in-context learning (2) document parsing (2) multimodal large language model (2) benchmark evaluation (2) mathematical reasoning (1) multimodal learning (1) model calibration (1) conversational ai (1) document understanding (1) speaker verification (1) code generation (1) transfer learning (1) efficient computing (1) text-to-image generation (1) knowledge distillation (1) visual reasoning (1) multi-modal learning (1)

Papers

WebAggregator: Enhancing Compositional Reasoning Capabilities of Deep Research Agent Foundation Models · ACL 2026
ProgressLM: Towards Progress Reasoning in Vision-Language Models · ACL 2026
EquivPruner: Boosting Efficiency and Quality in LLM-Based Search via Action Pruning · ACL 2026
Bridge-Coder: Transferring Model Capabilities from High-Resource to Low-Resource Programming Language · ACL 2025
WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback · EMNLP 2025
DocMamba: Efficient Document Pre-training with State Space Model · AAAI 2025
VLM2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues · ACL 2025
MultiVerse: A Multi-Turn Conversation Benchmark for Evaluating Large Vision and Language Models · ICCV 2025
Personalized Visual Instruction Tuning · ICLR 2025
CAN: Leveraging Clients As Navigators for Generative Replay in Federated Continual Learning · ICML 2025
UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure Recognition · EMNLP 2024
SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding · NIPS 2024
FIRST: Teach A Reliable Large Language Model Through Efficient Trustworthy Distillation · EMNLP 2024
MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance · EMNLP 2024
Image Textualization: An Automatic Framework for Generating Rich and Detailed Image Descriptions · NIPS 2024
HRDoc: Dataset and Baseline Method toward Hierarchical Reconstruction of Document Structures · AAAI 2023
TDv2: A Novel Tree-Structured Decoder for Offline Mathematical Expression Recognition · AAAI 2022
A Tree-Structured Decoder for Image-to-Markup Generation · ICML 2020
Episodic Training for Domain Generalization · ICCV 2019
RNN-BLSTM Based Multi-Pitch Estimation · INTERSPEECH 2016