Kaifu Zhang
18 papers · 2024–2026 · 8 top CS/AI conferences
Achievements
Cross-Pollinator (12)
Taxonomy Completionist (35)
Interdisciplinary Bridge
Keyword Pioneer
Renaissance Researcher (6)
Conference Polyglot (7)
Dynamic Duo (13)
Century Club (14)
Prolific Year (12)
Keyword Collector (75)
Conferences
ACL (8)
ICCV (2)
ICML (2)
NeurIPS (2)
AAAI (1)
CVPR (1)
EMNLP (1)
NAACL (1)
Top co-authors
Keywords
large language model (6)
multimodal large language model (4)
reinforcement learning (2)
supervised fine-tuning (2)
knowledge distillation (2)
direct preference optimization (2)
agent system (2)
preference learning (1)
visual question answering (1)
catastrophic forgetting (1)
machine translation (1)
preference alignment (1)
cross-lingual transfer (1)
chain-of-thought reasoning (1)
text-to-image synthesis (1)
image generation (1)
image synthesis (1)
language model evaluation (1)
text-to-image generation (1)
document understanding (1)
Papers
MirrorCAPTCHA: Wild CAPTCHA, Wild Distribution, Wild Web-based Platform Meet Multimodal LLM Agents
ACL 2026
USB: A Comprehensive and Unified Safety Evaluation Benchmark for Multimodal Large Language Models
ACL 2026
Finding the Translation Switch: Discovering and Exploiting the Task-Initiation Features in LLMs
AAAI 2026
Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding
ACL 2026
Marco-Bench-MIF: On Multilingual Instruction-Following Capability of Large Language Models
ACL 2025
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development
ACL 2025
UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation
CVPR 2025
Marco Large Translation Model at WMT2025: Transforming Translation Capability in LLMs via Quality-Aware Training and Decoding
EMNLP 2025
TeEFusion: Blending Text Embeddings to Distill Classifier-Free Guidance
ICCV 2025
CHATS: Combining Human-Aligned Optimization and Test-Time Sampling for Text-to-Image Generation
ICML 2025
Parrot: Multilingual Visual Instruction Tuning
ICML 2025
LayAlign: Enhancing Multilingual Reasoning in Large Language Models via Layer-Wise Adaptive Fusion and Alignment Strategy
NAACL 2025
MDP3: A Training-free Approach for List-wise Frame Selection in Video-LLMs
ICCV 2025
A Unified Agentic Framework for Evaluating Conditional Image Generation
ACL 2025
Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models
ACL 2025
Marco-o1 v2: Towards Widening The Distillation Bottleneck for Reasoning Models
ACL 2025
Wings: Learning Multimodal LLMs without Text-only Forgetting
NeurIPS 2024
Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees
NeurIPS 2024