Wei Ji

63 papers · 2018–2026 · 12 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (12) 🏃 Academic Marathon (7) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (13)

🌈 Renaissance Researcher (9) 🗺️ Taxonomy Completionist (83) 🏆 Grand Slam 🔬 Deep Specialist (10) 🧬 Topic Evolution 👥 Mega-Team (20) 👑 Triple Crown 🤝 Dynamic Duo (20) 🗃️ Keyword Collector (239) ⚡ Prolific Year (10) 🚀 Conference Pioneer 🔥 Unstoppable (8) 💎 Century Club (60) ❓ The Questioner (2)

Conferences

AAAI (14) CVPR (10) ICCV (8) ICML (6) NIPS (6) ICLR (5) ACL (4) ECCV (3) EMNLP (3) MICCAI (2) IJCAI (1) INTERSPEECH (1)

Top co-authors

Tat-Seng Chua (20) Qi Bi (13) Jingjing Li (13) Li Cheng (10) Huchuan Lu (8) Yefeng Zheng (8) Jingjun Yi (8) Yongri Piao (7) Miao Zhang (7) Haolan Zhan (7)

Keywords

semantic segmentation (10) multimodal learning (9) video understanding (6) domain generalization (6) depth estimation (4) state space model (4) medical image segmentation (3) scene graph (3) domain adaptation (3) multi-modal learning (3) salient object detection (3) video question answering (3) transfer learning (3) representation learning (2) scene graph generation (2) action recognition (2) few-shot learning (2) contrastive learning (2) temporal dynamics (2) causal inference (2)

Papers

Towards Unified Vision-Language Models with Incomplete Multi-Modal Inputs AAAI 2026
Evolving Generalist Virtual Agents with Generative and Associative Memory AAAI 2026
SAM3-I: Segment Anything with Instructions ACL 2026
Discretized Gaussian Representation for Tomographic Reconstruction ICCV 2025
Generalized Video Moment Retrieval ICLR 2025
Learning Fine-grained Domain Generalization via Hyperbolic State Space Hallucination AAAI 2025
DGFamba: Learning Flow Factorized State Space for Visual Domain Generalization AAAI 2025
Few-Shot Incremental Learning via Foreground Aggregation and Knowledge Transfer for Audio-Visual Semantic Segmentation AAAI 2025
D-CAM: Learning Generalizable Weakly-Supervised Medical Image Segmentation from Domain-invariant CAM MICCAI 2025
DefMamba: Deformable Visual State Space Model CVPR 2025
SpikeVideoFormer: An Efficient Spike-Driven Video Transformer with Hamming Attention and $\mathcal{O}(T)$ Complexity ICML 2025
What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensional Benchmark for Essential Virtual Agent Capabilities ICML 2025
A Simple yet Mighty Hartley Diffusion Versatilist for Generalizable Dense Vision Tasks ICCV 2025
SEGA: A Stepwise Evolution Paradigm for Content-Aware Layout Generation with Design Prior ICCV 2025
Panoptic Scene Graph Generation with Semantics-Prototype Learning AAAI 2024
Unleashing Multispectral Video's Potential in Semantic Segmentation: A Semi-supervised Viewpoint and New UAV-View Benchmark NIPS 2024
Samba: Severity-aware Recurrent Modeling for Cross-domain Medical Image Grading NIPS 2024
Learning Frequency-Adapted Vision Foundation Model for Domain Generalized Semantic Segmentation NIPS 2024
Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs CVPR 2024
Hallucinated Style Distillation for Single Domain Generalization in Medical Image Segmentation MICCAI 2024
Towards Robust Multi-Modal Reasoning via Model Selection ICLR 2024
Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching ECCV 2024
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions ICLR 2024
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition ICML 2024
NExT-GPT: Any-to-Any Multimodal LLM ICML 2024
NExT-Chat: An LMM for Chat, Detection and Segmentation ICML 2024
Spider: A Unified Framework for Context-dependent Concept Segmentation ICML 2024
Learning Generalized Medical Image Segmentation from Decoupled Feature Queries AAAI 2024
MedSegDiff-V2: Diffusion-Based Medical Image Segmentation with Transformer AAAI 2024
Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization ICLR 2024
DVSOD: RGB-D Video Salient Object Detection NIPS 2023
ART: rule bAsed futuRe-inference deducTion EMNLP 2023
Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models ICCV 2023
Cross2StrA: Unpaired Cross-lingual Image Captioning with Cross-lingual Cross-modal Structure-pivoted Alignment ACL 2023
Generating Visual Spatial Description via Holistic 3D Scene Understanding ACL 2023
Two Heads Are Better Than One: Improving Fake News Video Detection by Correlating with Neighbors ACL 2023
FakeSV: A Multimodal Benchmark with Rich Social Context for Fake News Detection on Short Video Platforms AAAI 2023
Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World ICCV 2023
Animal3D: A Comprehensive Dataset of 3D Animal Pose and Shape ICCV 2023
Video-Audio Domain Generalization via Confounder Disentanglement AAAI 2023
VPGTrans: Transfer Visual Prompt Generator across LLMs NIPS 2023
WINNER: Weakly-Supervised hIerarchical decompositioN and aligNment for Spatio-tEmporal Video gRounding CVPR 2023
Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-Based Active Learning CVPR 2023
Multispectral Video Semantic Segmentation: A Benchmark Dataset and Baseline CVPR 2023
Rethinking the Two-Stage Framework for Grounded Situation Recognition AAAI 2022
Video as Conditional Graph Hierarchy for Multi-Granular Question Answering AAAI 2022
Content-Variant Reference Image Quality Assessment via Knowledge Distillation AAAI 2022
Invariant Grounding for Video Question Answering CVPR 2022
Generating Diverse and Natural 3D Human Motions From Text CVPR 2022
Exploring Denoised Cross-Video Contrast for Weakly-Supervised Temporal Action Localization CVPR 2022
Fine-Grained Scene Graph Generation with Data Transfer ECCV 2022
Video Question Answering: Datasets, Algorithms and Challenges EMNLP 2022
PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models EMNLP 2022
Promoting Saliency From Depth: Deep Unsupervised RGB-D Saliency Detection ICLR 2022
Dynamic Context-Sensitive Filtering Network for Video Salient Object Detection ICCV 2021
Calibrated RGB-D Salient Object Detection CVPR 2021
Learning Calibrated Medical Image Segmentation via Multi-Rater Agreement Modeling CVPR 2021
Joint Semantic Mining for Weakly Supervised RGB-D Salient Object Detection NIPS 2021
Boundary Proposal Network for Two-stage Natural Language Video Localization AAAI 2021
An Early Study on Intelligent Analysis of Speech Under COVID-19: Severity, Sleep Quality, Fatigue, and Anxiety INTERSPEECH 2020
Accurate RGB-D Salient Object Detection via Collaborative Learning ECCV 2020
Depth-Induced Multi-Scale Recurrent Attention Network for Saliency Detection ICCV 2019
Semantic Locality-Aware Deformable Network for Clothing Segmentation IJCAI 2018