Zhen Li
108 papers · 2011–2026 · 19 top CS/AI conferences
Achievements
Taxonomy Completionist (11)
Keyword Pioneer
Interdisciplinary Bridge
Renaissance Researcher (5)
Conference Polyglot (19)
Academic Marathon (15)
Conference Loyalist (26)
Grand Slam
Triple Crown
Dynamic Duo (31)
Mega-Team (20)
Deep Specialist (23)
Topic Evolution
Keyword Champion (2)
Conference Pioneer
Prolific Year (17)
Century Club (104)
Trend Setter
Unstoppable (12)
Keyword Collector (456)
Conferences
CVPR (26)
AAAI (18)
ICCV (13)
NIPS (10)
ACL (6)
IJCAI (6)
ECCV (5)
EMNLP (5)
MICCAI (4)
COLING (3)
NAACL (2)
ICML (2)
ICLR (2)
JMLR (1)
MIDL (1)
ACML (1)
NSDI (1)
SEMEVAL (1)
WACV (1)
Keywords
point cloud (12)
multimodal learning (9)
semantic segmentation (9)
contrastive learning (8)
knowledge distillation (7)
diffusion model (6)
self-supervised learning (6)
3d object detection (6)
neural network (5)
3d vision (5)
autonomous driving (5)
compositional generalization (5)
attention mechanism (5)
large language model (5)
multi-modal learning (5)
visual question answering (5)
vision-language model (4)
text generation (4)
scene understanding (4)
metric learning (3)
Papers
Composition-Incremental Learning for Compositional Generalization
AAAI 2026
PiSA: A Self-Augmented Data Engine and Training Strategy for 3D Understanding with Large Models
WACV 2026
Cancer Survival Prediction by Cyclic Generation and Multi-grained Alignment
AAAI 2026
MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models
AAAI 2026
DriveFlow: Rectified Flow Adaptation for Robust 3D Object Detection in Autonomous Driving
AAAI 2026
A General Framework to Enhance Fine-tuning-based LLM Unlearning
ACL 2025
Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models
ACL 2025
Multi-Sourced Compositional Generalization in Visual Question Answering
IJCAI 2025
GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding
ICLR 2025
AR-1-to-3: Single Image to Consistent 3D Object via Next-View Prediction
ICCV 2025
Lumina-Image 2.0: A Unified and Efficient Image Generative Framework
ICCV 2025
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning
ICCV 2025
DCP: Dual-Cue Pruning for Efficient Large Vision-Language Models
EMNLP 2025
VisionPAD: A Vision-Centric Pre-training Paradigm for Autonomous Driving
CVPR 2025
DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering
CVPR 2025
Cervical-RG: Automated Cervical Cancer Report Generation from 3D Multi-sequence MRI via CoT-guided Hierarchical Experts
MICCAI 2025
VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering
AAAI 2025
Topo2Seq: Enhanced Topology Reasoning via Topology Sequence Learning
AAAI 2025
Consistency of Compositional Generalization Across Multiple Levels
AAAI 2025
DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation
CVPR 2025
Empowering Large Language Models with 3D Situation Awareness
CVPR 2025
K-LoRA: Unlocking Training-Free Fusion of Any Subject and Style LoRAs
CVPR 2025
Sign2Vis: Automated Data Visualization from Sign Language
ACL 2025
Leveraging Large Language Models for NLG Evaluation: Advances and Challenges
EMNLP 2024
SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models by Searching Up-to-Date Internet Knowledge
NIPS 2024
Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection
NIPS 2024
CrossBind: Collaborative Cross-Modal Identification of Protein Nucleic-Acid-Binding Residues
AAAI 2024
X4D-SceneFormer: Enhanced Scene Understanding on 4D Point Cloud Videos through Cross-Modal Knowledge Transfer
AAAI 2024
WeakPCSOD: Overcoming the Bias of Box Annotations for Weakly Supervised Point Cloud Salient Object Detection
AAAI 2024
RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation
AAAI 2024
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
CVPR 2024
Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding
CVPR 2024
MonoTTA: Fully Test-Time Adaptation for Monocular 3D Object Detection
ECCV 2024
Compositional Substitutivity of Visual Reasoning for Visual Question Answering
ECCV 2024
In-Context Compositional Generalization for Large Vision-Language Models
EMNLP 2024
DV-3DLane: End-to-end Multi-modal 3D Lane Detection with Dual-view Representation
ICLR 2024
Unified Generation, Reconstruction, and Representation: Generalized Diffusion with Adaptive Latent Encoding-Decoding
ICML 2024
EndoUIC: Promptable Diffusion Transformer for Unified Illumination Correction in Capsule Endoscopy
MICCAI 2024
Multilevel Causality Learning for Multi-label Gastric Atrophy Diagnosis
MICCAI 2024
Towards a Benchmark for Colorectal Cancer Segmentation in Endorectal Ultrasound Videos: Dataset and Model Development
MICCAI 2024
RankMatch: Fostering Confidence and Consistency in Learning with Noisy Labels
ICCV 2023
MMTN: Multi-Modal Memory Transformer Network for Image-Report Consistent Medical Report Generation
AAAI 2023
AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation
CVPR 2023
Multi-View Inverse Rendering for Large-Scale Real-World Indoor Scenes
CVPR 2023
DNF: Decouple and Feedback Network for Seeing in the Dark
CVPR 2023
Learning Transformation-Predictive Representations for Detection and Description of Local Features
CVPR 2023
Toward Unpaired Multi-modal Medical Image Segmentation via Learning Structured Semantic Consistency
MIDL 2023
PersLEARN: Research Training through the Lens of Perspective Cultivation
ACL 2023
FAA: Fine-grained Attention Alignment for Cascade Document Ranking
ACL 2023
CowClip: Reducing CTR Prediction Model Training Time from 12 Hours to 10 Minutes on 1 GPU
AAAI 2023
Fair-CDA: Continuous and Directional Augmentation for Group Fairness
AAAI 2023
Amazon-M2: A Multilingual Multi-locale Shopping Session Dataset for Recommendation and Text Generation
NIPS 2023
Composable Text Controls in Latent Space with ODEs
EMNLP 2023
Geometry-Aware Network for Domain Adaptive Semantic Segmentation
AAAI 2023
Small Total-Cost Constraints in Contextual Bandits with Knapsacks, with Application to Fairness
NIPS 2023
SkeletonMAE: Graph-based Masked Autoencoder for Skeleton Sequence Pre-training
ICCV 2023
SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection
ICCV 2023
LATR: 3D Lane Detection from Monocular Images with Transformer
ICCV 2023
SRFormer: Permuted Self-Attention for Single Image Super-Resolution
ICCV 2023
Semantic Human Parsing via Scalable Semantic Transfer Over Multiple Label Domains
CVPR 2023
Exploring the Effect of Primitives for Compositional Generalization in Vision-and-Language
CVPR 2023
BEV@DC: Bird's-Eye View Assisted Training for Depth Completion
CVPR 2023
Contact-Distil: Boosting Low Homologous Protein Contact Map Prediction by Self-Supervised Distillation
AAAI 2022
Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds
CVPR 2022
AMOS: A Large-Scale Abdominal Multi-Organ Benchmark for Versatile Medical Image Segmentation
NIPS 2022
An Error Analysis of Generative Adversarial Networks for Learning Distributions
JMLR 2022
Reciprocal Learning of Knowledge Retriever and Response Ranker for Knowledge-Grounded Conversations
COLING 2022
Contextual Bandits with Knapsacks for a Conversion Model
NIPS 2022
2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds
ECCV 2022
Weakly Supervised Object Localization through Inter-class Feature Similarity and Intra-Class Appearance Consistency
ECCV 2022
CLMLF: A Contrastive Learning and Multi-Layer Fusion Method for Multimodal Sentiment Detection
NAACL 2022
Towards an End-to-End Framework for Flow-Guided Video Inpainting
CVPR 2022
FCGCL: Fine- and Coarse-Granularity Contrastive Learning for Speech Translation
EMNLP 2022
X-Trans2Cap: Cross-Modal Knowledge Transfer Using Transformer for 3D Dense Captioning
CVPR 2022
Don't Take It Literally: An Edit-Invariant Sequence Loss for Text Generation
NAACL 2022
Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning
NIPS 2022
Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis
NIPS 2022
Graph Enhanced Contrastive Learning for Radiology Findings Summarization
ACL 2022
PhyIR: Physics-Based Inverse Rendering for Panoramic Indoor Images
CVPR 2022
Sparse Single Sweep LiDAR Point Cloud Segmentation via Learning Contextual Shape Priors from Scene Completion
AAAI 2021
Shallow Feature Matters for Weakly Supervised Object Localization
CVPR 2021
Box-Aware Feature Enhancement for Single Object Tracking on Point Clouds
ICCV 2021
Free-Form Description Guided 3D Visual Graph Network for Object Grounding in Point Cloud
ICCV 2021
InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds Through Instance Multi-Level Contextual Referring
ICCV 2021
Temporal Modulation Network for Controllable Space-Time Video Super-Resolution
CVPR 2021
PSSM-Distil: Protein Secondary Structure Prediction (PSSP) on Low-Quality PSSM by Knowledge Distillation with Contrastive Learning
AAAI 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
ICML 2021
Local Representation is Not Enough: Soft Point-Wise Transformer for Descriptor and Detector of Local Features
IJCAI 2021
PointLIE: Locally Invertible Embedding for Point Cloud Sampling and Recovery
IJCAI 2021
Adaptive Residue-wise Profile Fusion for Low Homologous Protein Secondary Structure Prediction Using External Knowledge
IJCAI 2021
PointASNL: Robust Point Clouds Processing Using Nonlocal Neural Networks With Adaptive Sampling
CVPR 2020
Exemplar Normalization for Learning Deep Representation
CVPR 2020
CN-HIT-MI.T at SemEval-2020 Task 8: Memotion Analysis Based on BERT
SEMEVAL 2020
Hierarchical Chinese Legal event extraction via Pedal Attention Mechanism
COLING 2020
CN-HIT-MI.T at SemEval-2020 Task 8: Memotion Analysis Based on BERT
COLING 2020
Towards Content-Independent Multi-Reference Super-Resolution: Adaptive Pattern Matching and Feature Aggregation
ECCV 2020
BARNet: Bilinear Attention Network with Adaptive Receptive Fields for Surgical Instrument Segmentation
IJCAI 2020
Semi-Supervised Video Salient Object Detection Using Pseudo-Labels
ICCV 2019
Feedback Network for Image Super-Resolution
CVPR 2019
Deep Neural Nets with Interpolating Function as Output Activation
NIPS 2018
Tux²: Distributed Graph Computation for Machine Learning
NSDI 2017
Learning Deep Semantic Embeddings for Cross-Modal Retrieval
ACML 2017
High-Resolution Shape Completion Using Deep Neural Networks for Global Structure and Local Geometry Inference
ICCV 2017
Protein Secondary Structure Prediction Using Cascaded Convolutional and Recurrent Neural Networks
IJCAI 2016
Blockout: Dynamic Model Selection for Hierarchical Deep Networks
CVPR 2016
Learning Semantic Relationships for Better Action Retrieval in Images
CVPR 2015
Learning Locally-Adaptive Decision Functions for Person Verification
CVPR 2013
Learning to Search Efficiently in High Dimensions
NIPS 2011