Wei Ji
63 papers · 2018–2026 · 12 top CS/AI conferences
Achievements
Conference Polyglot (12)
Academic Marathon (7)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (13)
Renaissance Researcher (9)
Taxonomy Completionist (83)
Grand Slam
Deep Specialist (10)
Topic Evolution
Mega-Team (20)
Triple Crown
Dynamic Duo (20)
Keyword Collector (239)
Prolific Year (10)
Conference Pioneer
Unstoppable (8)
Century Club (60)
The Questioner (2)
Conferences
AAAI (14)
CVPR (10)
ICCV (8)
ICML (6)
NIPS (6)
ICLR (5)
ACL (4)
ECCV (3)
EMNLP (3)
MICCAI (2)
IJCAI (1)
INTERSPEECH (1)
Keywords
semantic segmentation (10)
multimodal learning (9)
video understanding (6)
domain generalization (6)
depth estimation (4)
state space model (4)
medical image segmentation (3)
scene graph (3)
domain adaptation (3)
multi-modal learning (3)
salient object detection (3)
video question answering (3)
transfer learning (3)
representation learning (2)
scene graph generation (2)
action recognition (2)
few-shot learning (2)
contrastive learning (2)
temporal dynamics (2)
causal inference (2)
Papers
Towards Unified Vision-Language Models with Incomplete Multi-Modal Inputs
AAAI 2026
Evolving Generalist Virtual Agents with Generative and Associative Memory
AAAI 2026
SAM3-I: Segment Anything with Instructions
ACL 2026
Discretized Gaussian Representation for Tomographic Reconstruction
ICCV 2025
Generalized Video Moment Retrieval
ICLR 2025
Learning Fine-grained Domain Generalization via Hyperbolic State Space Hallucination
AAAI 2025
DGFamba: Learning Flow Factorized State Space for Visual Domain Generalization
AAAI 2025
Few-Shot Incremental Learning via Foreground Aggregation and Knowledge Transfer for Audio-Visual Semantic Segmentation
AAAI 2025
D-CAM: Learning Generalizable Weakly-Supervised Medical Image Segmentation from Domain-invariant CAM
MICCAI 2025
DefMamba: Deformable Visual State Space Model
CVPR 2025
SpikeVideoFormer: An Efficient Spike-Driven Video Transformer with Hamming Attention and $\mathcal{O}(T)$ Complexity
ICML 2025
What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensional Benchmark for Essential Virtual Agent Capabilities
ICML 2025
A Simple yet Mighty Hartley Diffusion Versatilist for Generalizable Dense Vision Tasks
ICCV 2025
SEGA: A Stepwise Evolution Paradigm for Content-Aware Layout Generation with Design Prior
ICCV 2025
Panoptic Scene Graph Generation with Semantics-Prototype Learning
AAAI 2024
Unleashing Multispectral Video's Potential in Semantic Segmentation: A Semi-supervised Viewpoint and New UAV-View Benchmark
NIPS 2024
Samba: Severity-aware Recurrent Modeling for Cross-domain Medical Image Grading
NIPS 2024
Learning Frequency-Adapted Vision Foundation Model for Domain Generalized Semantic Segmentation
NIPS 2024
Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs
CVPR 2024
Hallucinated Style Distillation for Single Domain Generalization in Medical Image Segmentation
MICCAI 2024
Towards Robust Multi-Modal Reasoning via Model Selection
ICLR 2024
Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching
ECCV 2024
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions
ICLR 2024
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition
ICML 2024
NExT-GPT: Any-to-Any Multimodal LLM
ICML 2024
NExT-Chat: An LMM for Chat, Detection and Segmentation
ICML 2024
Spider: A Unified Framework for Context-dependent Concept Segmentation
ICML 2024
Learning Generalized Medical Image Segmentation from Decoupled Feature Queries
AAAI 2024
MedSegDiff-V2: Diffusion-Based Medical Image Segmentation with Transformer
AAAI 2024
Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization
ICLR 2024
DVSOD: RGB-D Video Salient Object Detection
NIPS 2023
ART: rule bAsed futuRe-inference deducTion
EMNLP 2023
Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models
ICCV 2023
Cross2StrA: Unpaired Cross-lingual Image Captioning with Cross-lingual Cross-modal Structure-pivoted Alignment
ACL 2023
Generating Visual Spatial Description via Holistic 3D Scene Understanding
ACL 2023
Two Heads Are Better Than One: Improving Fake News Video Detection by Correlating with Neighbors
ACL 2023
FakeSV: A Multimodal Benchmark with Rich Social Context for Fake News Detection on Short Video Platforms
AAAI 2023
Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World
ICCV 2023
Animal3D: A Comprehensive Dataset of 3D Animal Pose and Shape
ICCV 2023
Video-Audio Domain Generalization via Confounder Disentanglement
AAAI 2023
VPGTrans: Transfer Visual Prompt Generator across LLMs
NIPS 2023
WINNER: Weakly-Supervised hIerarchical decompositioN and aligNment for Spatio-tEmporal Video gRounding
CVPR 2023
Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-Based Active Learning
CVPR 2023
Multispectral Video Semantic Segmentation: A Benchmark Dataset and Baseline
CVPR 2023
Rethinking the Two-Stage Framework for Grounded Situation Recognition
AAAI 2022
Video as Conditional Graph Hierarchy for Multi-Granular Question Answering
AAAI 2022
Content-Variant Reference Image Quality Assessment via Knowledge Distillation
AAAI 2022
Invariant Grounding for Video Question Answering
CVPR 2022
Generating Diverse and Natural 3D Human Motions From Text
CVPR 2022
Exploring Denoised Cross-Video Contrast for Weakly-Supervised Temporal Action Localization
CVPR 2022
Fine-Grained Scene Graph Generation with Data Transfer
ECCV 2022
Video Question Answering: Datasets, Algorithms and Challenges
EMNLP 2022
PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models
EMNLP 2022
Promoting Saliency From Depth: Deep Unsupervised RGB-D Saliency Detection
ICLR 2022
Dynamic Context-Sensitive Filtering Network for Video Salient Object Detection
ICCV 2021
Calibrated RGB-D Salient Object Detection
CVPR 2021
Learning Calibrated Medical Image Segmentation via Multi-Rater Agreement Modeling
CVPR 2021
Joint Semantic Mining for Weakly Supervised RGB-D Salient Object Detection
NIPS 2021
Boundary Proposal Network for Two-stage Natural Language Video Localization
AAAI 2021
An Early Study on Intelligent Analysis of Speech Under COVID-19: Severity, Sleep Quality, Fatigue, and Anxiety
INTERSPEECH 2020
Accurate RGB-D Salient Object Detection via Collaborative Learning
ECCV 2020
Depth-Induced Multi-Scale Recurrent Attention Network for Saliency Detection
ICCV 2019
Semantic Locality-Aware Deformable Network for Clothing Segmentation
IJCAI 2018