Ji Zhang
87 papers · 2014–2026 · 16 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (16)
Keyword Pioneer
Interdisciplinary Bridge
Taxonomy Completionist (14)
Academic Marathon (11)
Cross-Pollinator (14)
Hot Topic Early Bird
Dynamic Duo (34)
Grand Slam
Mega-Team (21)
Deep Specialist (26)
Topic Evolution
Keyword Champion (2)
Prolific Year (17)
Trend Setter
Unstoppable (9)
Keyword Collector (375)
Century Club (82)
Conference Pioneer
Conferences
ACL (17)
EMNLP (13)
CVPR (12)
AAAI (8)
IJCAI (7)
COLING (6)
ICML (5)
RSS (4)
ICCV (3)
ICLR (2)
IJCNLP (2)
INTERSPEECH (2)
NAACL (2)
NIPS (2)
MICCAI (1)
WACV (1)
Keywords
large language model (18)
contrastive learning (10)
multimodal large language model (10)
multimodal learning (8)
knowledge distillation (6)
dialogue system (5)
language model (5)
representation learning (4)
in-context learning (4)
transfer learning (4)
self-supervised learning (4)
few-shot learning (4)
text generation (3)
visual relationship (3)
reinforcement learning (3)
document understanding (3)
mathematical reasoning (3)
visual question answering (3)
data augmentation (3)
vision-language model (3)
Papers
Talking Trails: LLM-Enhanced Spatiotemporal Trajectory Modeling for E-Bike Delivery Route Planning
AAAI 2026
ProFuser: Progressive Fusion of Large Language Models
AAAI 2026
MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding
ACL 2026
LeCoDe: A Benchmark Dataset for Interactive Legal Consultation Dialogue Evaluation
ACL 2026
Hierarchical Attention Network with Correction for Cross-Domain User Association
AAAI 2026
MOSAIC: Generating Consistent, Privacy-Preserving Scenes from Multiple Depth Views in Multi-Room Environments
ICCV 2025
Speculative Decoding for Multi-Sample Inference
EMNLP 2025
OccLoff: Learning Optimized Feature Fusion for 3D Occupancy Prediction
WACV 2025
Learning to Solve Domain-Specific Calculation Problems with Knowledge-Intensive Programs Generator
NAACL 2025
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization
CVPR 2025
Skip Tuning: Pre-trained Vision-Language Models are Effective and Efficient Adapters Themselves
CVPR 2025
A Simple yet Effective Layout Token in Large Language Models for Document Understanding
CVPR 2025
AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization
CVPR 2025
MTCNet: Motion and Topology Consistency Guided Learning for Mitral Valve Segmentation in 4D Ultrasound
MICCAI 2025
mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding
ACL 2025
DISC: Plug-and-Play Decoding Intervention with Similarity of Characters for Chinese Spelling Check
ACL 2025
Filling the Missings: Spatiotemporal Data Imputation by Conditional Diffusion
IJCAI 2025
Revisiting Self-Consistency from Dynamic Distributional Alignment Perspective on Answer Aggregation
ACL 2025
Score as Action: Fine Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning
ICML 2025
EGPlace: An Efficient Macro Placement Method via Evolutionary Search with Greedy Repositioning Guided Mutation
ICML 2025
Exploiting Presentative Feature Distributions for Parameter-Efficient Continual Learning of Large Language Models
ICML 2025
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models
ICLR 2025
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
NIPS 2024
MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model
NIPS 2024
TiMix: Text-Aware Image Mixing for Effective Vision-Language Pre-training
AAAI 2024
Browse and Concentrate: Comprehending Multimodal Content via Prior-LLM Context Fusion
ACL 2024
Model Composition for Multimodal Large Language Models
ACL 2024
SocialBench: Sociality Evaluation of Role-Playing Conversational Agents
ACL 2024
Towards Better Utilization of Multi-Reference Training Data for Chinese Grammatical Error Correction
ACL 2024
Budget-Constrained Tool Learning with Planning
ACL 2024
PANDA: Preference Adaptation for Enhancing Domain-Specific Abilities of LLMs
ACL 2024
CycleAlign: Iterative Distillation from Black-box LLM to White-box Models for Better Human Alignment
ACL 2024
Evaluating ChatNetZero, an LLM-Chatbot to Demystify Climate Pledges
ACL 2024
IAD: In-Context Learning Ability Decoupler of Large Language Models in Meta-Training
COLING 2024
Semantics-enhanced Cross-modal Masked Image Modeling for Vision-Language Pre-training
COLING 2024
Unifying Latent and Lexicon Representations for Effective Video-Text Retrieval
COLING 2024
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
CVPR 2024
Hallucination Augmented Contrastive Learning for Multimodal Large Language Model
CVPR 2024
DePT: Decoupled Prompt Tuning
CVPR 2024
SubT-MRS Dataset: Pushing SLAM Towards All-weather Environments
CVPR 2024
TinyChart: Efficient Chart Understanding with Program-of-Thoughts Learning and Visual Token Merging
EMNLP 2024
Small LLMs Are Weak Tool Learners: A Multi-LLM Agent
EMNLP 2024
A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction Based on Large Language Models
EMNLP 2024
MIBench: Evaluating Multimodal Large Language Models over Multiple Images
EMNLP 2024
mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding
EMNLP 2024
Breaking Barriers of System Heterogeneity: Straggler-Tolerant Multimodal Federated Learning via Knowledge Distillation
IJCAI 2024
From Skepticism to Acceptance: Simulating the Attitude Dynamics Toward Fake News
IJCAI 2024
DiveSound: LLM-Assisted Automatic Taxonomy Construction for Diverse Audio Generation
INTERSPEECH 2024
Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models
INTERSPEECH 2024
MCC-KD: Multi-CoT Consistent Knowledge Distillation
EMNLP 2023
Self-Supervised Category-Level Articulated Object Pose Estimation with Part-Level SE(3) Equivariance
ICLR 2023
Active Velocity Estimation using Light Curtains via Self-Supervised Multi-Armed Bandits
RSS 2023
UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model
EMNLP 2023
ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models
EMNLP 2023
Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering
ACL 2023
DialoGPS: Dialogue Path Sampling in Continuous Semantic Space for Data Augmentation in Multi-Turn Conversations
ACL 2023
A Closer Look at Few-shot Classification Again
ICML 2023
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video
ICML 2023
Improving Seq2Seq Grammatical Error Correction via Decoding Interventions
EMNLP 2023
DETA: Denoised Task Adaptation for Few-Shot Learning
ICCV 2023
HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training
ICCV 2023
ContrastMotion: Self-supervised Scene Motion Learning for Large-Scale LiDAR Point Clouds
IJCAI 2023
MGIMN: Multi-Grained Interactive Matching Network for Few-shot Text Classification
NAACL 2022
Continual Few-shot Intent Detection
COLING 2022
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections
EMNLP 2022
Shifting More Attention to Visual Backbone: Query-Modulated Refinement Networks for End-to-End Visual Grounding
CVPR 2022
Logit Perturbation
AAAI 2022
DictBERT: Dictionary Description Knowledge Enhanced Language Model Pre-training via Contrastive Learning
IJCAI 2022
Incorporating Causal Analysis into Diversified and Logical Response Generation
COLING 2022
i3dLoc: Image-to-range Cross-domain Localization Robust to Inconsistent Environmental Conditions
RSS 2021
Turn-Level User Satisfaction Estimation in E-commerce Customer Service
ACL 2021
KACE: Generating Knowledge Aware Contrastive Explanations for Natural Language Inference
ACL 2021
Accurate Few-Shot Object Detection With Support-Query Mutual Guidance and Hybrid Loss
CVPR 2021
AdaVQA: Overcoming Language Priors with Adapted Margin Cosine Loss
IJCAI 2021
MDNN: A Multimodal Deep Neural Network for Predicting Drug-Drug Interaction Events
IJCAI 2021
Testing Independence Between Linear Combinations for Causal Discovery
AAAI 2021
KACE: Generating Knowledge Aware Contrastive Explanations for Natural Language Inference
IJCNLP 2021
Turn-Level User Satisfaction Estimation in E-commerce Customer Service
IJCNLP 2021
Segment, Mask, and Predict: Augmenting Chinese Word Segmentation with Self-Supervision
EMNLP 2021
TARE: A Hierarchical Framework for Efficiently Exploring Complex 3D Environments
RSS 2021
Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication
COLING 2020
A Deep Cascade Model for Multi-Document Reading Comprehension
AAAI 2019
Large-Scale Visual Relationship Understanding
AAAI 2019
Graphical Contrastive Losses for Scene Graph Parsing
CVPR 2019
Semi-Autoregressive Neural Machine Translation
EMNLP 2018
Relationship Proposal Networks
CVPR 2017
LOAM: Lidar Odometry and Mapping in Real-time
RSS 2014