Wenwei Zhang
44 papers · 2019–2025 · 11 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Conference Polyglot (11)
Taxonomy Completionist (10)
Interdisciplinary Bridge
Academic Marathon (6)
Hot Topic Early Bird
Dynamic Duo (28)
Triple Crown
Grand Slam
Mega-Team (24)
Topic Evolution
Century Club (44)
Prolific Year (7)
The Questioner (5)
Unstoppable (7)
Keyword Collector (178)
Conferences
CVPR (9)
ACL (7)
NIPS (7)
ICCV (6)
ECCV (4)
ICLR (4)
EMNLP (2)
ICML (2)
AAAI (1)
NAACL (1)
WACV (1)
Keywords
large language model (13)
instance segmentation (5)
vision-language model (5)
benchmark evaluation (4)
semantic segmentation (4)
object detection (4)
multimodal learning (4)
instruction following (3)
reinforcement learning (3)
image segmentation (3)
3d scene understanding (3)
evaluation benchmark (3)
vision language model (3)
reward model (3)
point cloud (3)
kernel learning (2)
video segmentation (2)
instruction tuning (2)
multi-modal learning (2)
visual grounding (2)
Papers
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
ACL 2025
Are Your LLMs Capable of Stable Reasoning?
ACL 2025
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
ICLR 2025
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
ICCV 2025
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data and Metric Perspectives
ICCV 2025
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D Capabilities
ICCV 2025
Training Language Models to Critique With Multi-agent Feedback
EMNLP 2025
CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward
EMNLP 2025
Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding
WACV 2025
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
ICLR 2025
F-LMM: Grounding Frozen Large Multimodal Models
CVPR 2025
OMG-Seg: Is One Model Good Enough For All Segmentation?
CVPR 2024
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
CVPR 2024
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities
ECCV 2024
AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data
NIPS 2024
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD
NIPS 2024
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models
NIPS 2024
CriticEval: Evaluating Large-scale Language Model as Critic
NIPS 2024
CLIM: Contrastive Language-Image Mosaic for Region Representation
AAAI 2024
ANAH: Analytical Annotation of Hallucinations in Large Language Models
ACL 2024
T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step
ACL 2024
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark
ACL 2024
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models
ACL 2024
Code Needs Comments: Enhancing Code LLMs with Comment Augmentation
ACL 2024
4D Contrastive Superflows are Dense 3D Representation Learners
ECCV 2024
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
ICLR 2024
Unified Human-Scene Interaction via Prompted Chain-of-Contacts
ICLR 2024
Can AI Assistants Know What They Don't Know?
ICML 2024
Fake Alignment: Are LLMs Really Aligned Well?
NAACL 2024
MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training
CVPR 2023
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
ICCV 2023
Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation
ICCV 2023
Aligning Bag of Regions for Open-Vocabulary Object Detection
CVPR 2023
Segment Any Point Cloud Sequences by Distilling Vision Foundation Models
NIPS 2023
OV-PARTS: Towards Open-Vocabulary Part Segmentation
NIPS 2023
Dense Distinct Query for End-to-End Object Detection
CVPR 2023
Dense Siamese Network for Dense Unsupervised Learning
ECCV 2022
Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation
CVPR 2022
Seesaw Loss for Long-Tailed Instance Segmentation
CVPR 2021
K-Net: Towards Unified Image Segmentation
NIPS 2021
Side-Aware Boundary Localization for More Precise Object Detection
ECCV 2020
EcoNAS: Finding Proxies for Economical Neural Architecture Search
CVPR 2020
More Information Supervised Probabilistic Deep Face Embedding Learning
ICML 2020
Robust Multi-Modality Multi-Object Tracking
ICCV 2019