Juncheng Li
53 papers · 2018–2026 · 14 top CS/AI conferences
Achievements
🌍 Conference Polyglot (14)
🏃 Academic Marathon (7)
🧭 Keyword Pioneer
🌉 Interdisciplinary Bridge
🐝 Cross-Pollinator (12)
🤝 Dynamic Duo (31)
👑 Triple Crown
🏆 Grand Slam
👥 Mega-Team (32)
🔬 Deep Specialist (17)
🧬 Topic Evolution
🏆 Keyword Champion
❓ The Questioner (2)
📈 Trend Setter
🗃️ Keyword Collector (219)
🔥 Unstoppable (8)
⚡ Prolific Year (17)
💎 Century Club (51)
🚀 Conference Pioneer
Conferences
CVPR (12)
ICCV (9)
ICML (6)
NIPS (5)
AAAI (4)
INTERSPEECH (4)
ACL (3)
EMNLP (3)
IJCAI (2)
AISTATS (1)
COLING (1)
ECCV (1)
ICLR (1)
MICCAI (1)
Keywords
multimodal large language model (8)
multimodal learning (5)
convolutional neural network (4)
large language model (3)
diffusion model (3)
vision-language model (3)
multiple instance learning (2)
active learning (2)
adversarial attack (2)
image generation (2)
domain generalization (2)
reinforcement learning (2)
zero-shot learning (2)
image super-resolution (2)
scene graph (2)
deep learning (2)
uncertainty quantification (2)
text-to-image generation (2)
few-shot learning (2)
self-supervised learning (2)
Papers
MoA: Heterogeneous Mixture of Adapters for Parameter-Efficient Fine-Tuning of Large Language Models
ACL 2026
Evolving Generalist Virtual Agents with Generative and Associative Memory
AAAI 2026
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
CVPR 2025
SILMM: Self-Improving Large Multimodal Models for Compositional Text-to-Image Generation
CVPR 2025
Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene
CVPR 2025
The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation
ICCV 2025
Mastering Collaborative Multi-modal Data Selection: A Focus on Informativeness, Uniqueness, and Representativeness
ICCV 2025
What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensional Benchmark for Essential Virtual Agent Capabilities
ICML 2025
TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition
ACL 2025
Align2LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation
ACL 2025
Choice is what matters after Attention
AISTATS 2025
ITERATE: Image-Text Enhancement, Retrieval, and Alignment for Transmodal Evolution with LLMs
COLING 2025
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program
ICCV 2025
IDEATOR: Jailbreaking and Benchmarking Large Vision-Language Models Using Themselves
ICCV 2025
Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining
ICCV 2025
Boosting Virtual Agent Learning and Reasoning: A Step-Wise, Multi-Dimensional, and Generalist Reward Model with Benchmark
ICML 2025
STEP: Enhancing Video-LLMs' Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training
CVPR 2025
On Path to Multimodal Generalist: General-Level and General-Bench
ICML 2025
Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens
CVPR 2025
Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning
ICML 2024
Unified Generative and Discriminative Training for Multi-modal Large Language Models
NIPS 2024
Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration
NIPS 2024
DIEM: Decomposition-Integration Enhancing Multimodal Insights
CVPR 2024
Revisiting the Domain Shift and Sample Uncertainty in Multi-source Active Domain Transfer
CVPR 2024
Learning Coupled Dictionaries from Unpaired Data for Image Super-Resolution
CVPR 2024
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data
CVPR 2024
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions
ICLR 2024
Auto-Encoding Morph-Tokens for Multimodal LLM
ICML 2024
Vulnerabilities of Single-Round Incentive Compatibility in Auto-bidding: Theory and Evidence from ROI-Constrained Online Advertising Markets
IJCAI 2024
Topological GCN for Improving Detection of Hip Landmarks from B-Mode Ultrasound Images
MICCAI 2024
Global Structure Knowledge-Guided Relation Extraction Method for Visually-Rich Document
EMNLP 2023
Self-supervised Meta-Prompt Learning with Meta-Gradient Regularization for Few-shot Generalization
EMNLP 2023
Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models
ICCV 2023
Reasoning Makes Good Annotators: An Automatic Task-specific Rules Distilling Framework for Low-resource Relation Extraction
EMNLP 2023
Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-Based Active Learning
CVPR 2023
Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World
ICCV 2023
MAGIC: Multimodal relAtional Graph adversarIal inferenCe for Diverse and Unpaired Text-Based Image Captioning
AAAI 2022
Compositional Temporal Grounding With Structured Variational Cross-Graph Correspondence Learning
CVPR 2022
AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification
INTERSPEECH 2022
Fine-Grained Semantically Aligned Vision-Language Pre-Training
NIPS 2022
Feature Distillation Interaction Weighting Network for Lightweight Image Super-resolution
AAAI 2022
Lightweight Bimodal Network for Single-Image Super-Resolution via Symmetric CNN and Recursive Transformer
IJCAI 2022
Masked Autoencoders that Listen
NIPS 2022
Hierarchical Phone Recognition with Compositional Phonetics
INTERSPEECH 2021
Structure-Preserving Deraining With Residue Channel Prior Guidance
ICCV 2021
Adaptive Hierarchical Graph Reasoning With Semantic Coherence for Video-and-Language Inference
ICCV 2021
Towards Zero-Shot Learning for Automatic Phonemic Transcription
AAAI 2020
Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation
CVPR 2020
Adversarial Music: Real world Audio Adversary against Wake-word Detection System
NIPS 2019
Adversarial camera stickers: A physical camera-based attack on deep learning systems
ICML 2019
Comparing the Max and Noisy-Or Pooling Functions in Multiple Instance Learning for Weakly Supervised Sequence Learning Tasks
INTERSPEECH 2018
Multi-scale Residual Network for Image Super-Resolution
ECCV 2018
Multiple Instance Deep Learning for Weakly Supervised Small-Footprint Audio Event Detection
INTERSPEECH 2018