Jian Luan

41 papers · 2019–2026 · 8 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (8) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (20) 🏃 Academic Marathon (6)

🐝 Cross-Pollinator (14) 🌈 Renaissance Researcher (8) 🏆 Keyword Champion (2) 🤝 Dynamic Duo (23) 🧬 Topic Evolution ⚡ Prolific Year (16) 🔥 Unstoppable (5) 💎 Century Club (36) 📈 Trend Setter 🗃️ Keyword Collector (198) ❓ The Questioner

Conferences

ACL (17) EMNLP (7) INTERSPEECH (7) AAAI (3) NAACL (3) COLING (2) ICCV (1) IJCAI (1)

Top co-authors

Wei Liu (26) Bin Wang (21) Xiang Li (5) Qinzhuo Wu (5) Weikai Xu (4) Shuo Shang (4) Rui Yan (4) Jie Wu (4) Wen Zhang (3) Zheng Lin (3)

Research topics

Architectures (1)

Keywords

large language model (13) model compression (6) reinforcement learning (4) attention mechanism (3) language model (3) multimodal learning (3) in-context learning (3) data augmentation (3) task completion (2) transformer architecture (2) mobile agent (2) simultaneous translation (2) few-shot learning (2) polyphonic music (2) speech synthesis (2) vision-language model (2) visual reasoning (2) knowledge distillation (2) instruction tuning (2) multimodal large language model (2)

Papers

AV-Edit: Multimodal Generative Sound Effect Editing via Audio-Visual Semantic Joint Control · AAAI 2026
End-to-End Optimization of LLM-Driven Multi-Agent Search Systems via Heterogeneous-Group-Based Reinforcement Learning · ACL 2026
VecInfer: Efficient LLM Inference with Low-Bit KV Cache via Outlier-Suppressed Vector Quantization · ACL 2026
Doc-V*: Coarse-to-Fine Interactive Visual Reasoning for Multi-Page Document VQA · ACL 2026
Attention Basin: Why Contextual Position Matters in Large Language Models · ACL 2026
Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study · NAACL 2025
ReachAgent: Enhancing Mobile Agent via Page Reaching and Operation · NAACL 2025
Browsing Like Human: A Multimodal Web Agent with Experiential Fast-and-Slow Thinking · ACL 2025
Demystifying Small Language Models for Edge Deployment · ACL 2025
BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking Mechanism · EMNLP 2025
HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation · ACL 2025
Weaving Context Across Images: Improving Vision-Language Models through Focus-Centric Visual Chains · ACL 2025
More is not always better? Enhancing Many-Shot In-Context Learning with Differentiated and Reweighting Objectives · ACL 2025
TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization · ACL 2025
PMSS: Pretrained Matrices Skeleton Selection for LLM Fine-tuning · COLING 2025
Stability and Generalization of Zeroth-Order Decentralized Stochastic Gradient Descent with Changing Topology · AAAI 2025
Global Eye: Breaking the “Fixed Thinking Pattern” during the Instruction Expansion Process · ACL 2025
MAKAR: a Multi-Agent framework based Knowledge-Augmented Reasoning for Grounded Multimodal Named Entity Recognition · EMNLP 2025
SPO: Self Preference Optimization with Self Regularization · EMNLP 2025
Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs · ICCV 2025
Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization · IJCAI 2025
MobileVLM: A Vision-Language Model for Better Intra- and Inter-UI Understanding · EMNLP 2024
Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents · ACL 2024
DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to Determinacy · ACL 2024
Pruning Large Language Models to Intra-module Low-rank Architecture with Transitional Activations · ACL 2024
A Comprehensive Evaluation of Quantization Strategies for Large Language Models · ACL 2024
ToolRerank: Adaptive and Hierarchy-Aware Reranking for Tool Retrieval · COLING 2024
ToolPlanner: A Tool Augmented LLM for Multi Granularity Instructions with Path Planning and Feedback · EMNLP 2024
Mixture of Diverse Size Experts · EMNLP 2024
The Xiaomi AI Lab’s Speech Translation Systems for IWSLT 2023 Offline Task, Simultaneous Task and Speech-to-Speech Task · ACL 2023
BERT-ERC: Fine-Tuning BERT Is Enough for Emotion Recognition in Conversation · AAAI 2023
Exploring All-In-One Knowledge Distillation Framework for Neural Machine Translation · EMNLP 2023
Exploring Better Text Image Translation with Multimodal Codebook · ACL 2023
Improving Bilingual TTS Using Language And Phonology Embedding With Embedding Strength Modulator · INTERSPEECH 2023
LightClone: Speaker-guided Parallel Subnet Selection for Few-shot Voice Cloning · INTERSPEECH 2023
BIT-Xiaomi’s System for AutoSimTrans 2022 · NAACL 2022
Transfer Learning for Improving Singing-Voice Detection in Polyphonic Instrumental Music · INTERSPEECH 2020
Re-Weighted Interval Loss for Handling Data Imbalance Problem of End-to-End Keyword Spotting · INTERSPEECH 2020
Adversarially Trained Multi-Singer Sequence-to-Sequence Singing Synthesizer · INTERSPEECH 2020
XiaoiceSing: A High-Quality and Integrated Singing Voice Synthesis System · INTERSPEECH 2020
Vocal Pitch Extraction in Polyphonic Music Using Convolutional Residual Network · INTERSPEECH 2019