Yuanxing Zhang

20 papers · 2018–2026 · 6 conferences · across top CS/AI conferences

Achievements

🏃 Academic Marathon (7) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (6) 🐝 Cross-Pollinator (13)

🌈 Renaissance Researcher (6) 🗺️ Taxonomy Completionist (43) 🧬 Topic Evolution 💎 Century Club (17) 🗃️ Keyword Collector (100) ⚡ Prolific Year (7)

Conferences

ACL (8) EMNLP (3) IJCAI (3) NIPS (3) AAAI (2) ECCV (1)

Top co-authors

Jiaheng Liu (7) Kaigui Bian (5) Bo Zheng (5) Chenchen Zhang (5) Wenbo Su (5) Pengyu Zhao (4) Ge Zhang (4) Fuzheng Zhang (4) Haoran Que (4) Zhiqi Bai (4)

Keywords

multimodal large language model (5) large language model (4) multimodal learning (3) video captioning (2) foundation model (2) video large language model (2) hallucination mitigation (2) domain adaptation (2) vision-language model (2) text-to-video generation (2) benchmark evaluation (2) attention mechanism (1) transfer learning (1) knowledge distillation (1) direct preference optimization (1) image generation (1) preference learning (1) image captioning (1) mathematical reasoning (1) action recognition (1)

Papers

A Multistage Extraction Pipeline for Long Scanned Financial Documents: An Empirical Study in Industrial KYC Workflows · ACL 2026
TEMPLE: Incentivizing Temporal Understanding of Video Large Language Models via Progressive Pre-SFT Alignment · AAAI 2026
ReLook: Vision-Grounded RL with a Multimodal LLM Critic for Agentic Web Coding · ACL 2026
RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction · EMNLP 2025
VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation · ACL 2025
Generative Frame Sampler for Long Video Understanding · ACL 2025
SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs · EMNLP 2025
MIO: A Foundation Model on Multimodal Tokens · EMNLP 2025
HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models · ACL 2025
Mixture of Decoding: An Attention-Inspired Adaptive Decoding Strategy to Mitigate Hallucinations in Large Vision-Language Models · ACL 2025
E2-LLM: Efficient and Extreme Length Extension of Large Language Models · ACL 2024
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models · ACL 2024
DDK: Distilling Domain Knowledge for Efficient Large Language Models · NIPS 2024
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models · NIPS 2024
GBA: A Tuning-free Approach to Switch between Synchronous and Asynchronous Training for Recommendation Models · NIPS 2022
AMEIR: Automatic Behavior Modeling, Interaction Exploration and MLP Investigation in the Recommender System · IJCAI 2021
Adversarial Oracular Seq2seq Learning for Sequential Recommendation · IJCAI 2020
Differentiable Feature Aggregation Search for Knowledge Distillation · ECCV 2020
Spherical Criteria for Fast and Accurate 360° Object Detection · AAAI 2020
Towards Reading Comprehension for Long Documents · IJCAI 2018