Lewei Lu

33 papers · 2020–2025 · 7 conferences · across top CS/AI conferences

Achievements


🏃 Academic Marathon (5) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (7) 🧭 Keyword Pioneer 🐝 Cross-Pollinator (12)

🌈 Renaissance Researcher (6) 🗺️ Taxonomy Completionist (49) 👥 Mega-Team (38) 🏆 Grand Slam 🤝 Dynamic Duo (24) ⚡ Prolific Year (15) 💎 Century Club (33) 🗃️ Keyword Collector (135) 📈 Trend Setter

Conferences

CVPR (15) ICLR (6) NIPS (5) ICCV (3) ECCV (2) AAAI (1) ICML (1)

Top co-authors

Jifeng Dai (24) Xizhou Zhu (22) Yu Qiao (18) Wenhai Wang (14) Zhe Chen (10) Hongsheng Li (9) Tong Lu (7) Xiaogang Wang (7) Jie Zhou (6) Weijie Su (6)

Keywords

vision-language model (5) multimodal large language model (5) large language model (3) autonomous driving (3) object detection (3) image generation (3) semantic segmentation (3) multimodal document (2) contrastive learning (2) video processing (2) multi-task learning (2) visual representation (2) object localization (2) vision foundation model (2) convolutional neural network (2) knowledge distillation (2) multimodal learning (2) visual question answering (2) multi-modal learning (2) deformable convolution (2)

Papers

Streamline Without Sacrifice - Squeeze out Computation Redundancy in LMM · ICML 2025
Docopilot: Improving Multimodal Models for Document-Level Understanding · CVPR 2025
PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models · CVPR 2025
MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction · CVPR 2025
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding · CVPR 2025
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding · CVPR 2025
Spatial Preference Rewarding for MLLMs Spatial Understanding · ICCV 2025
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text · ICLR 2025
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures · ICLR 2025
Weakly Supervised Monocular 3D Detection with a Single-View Image · CVPR 2024
Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning · NIPS 2024
Learning 1D Causal Visual Representation with De-focus Attention Networks · NIPS 2024
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks · CVPR 2024
Masked AutoDecoder is Effective Multi-Task Vision Generalist · CVPR 2024
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications · CVPR 2024
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft · CVPR 2024
Needle In A Multimodal Haystack · NIPS 2024
Modeling Continuous Motion for 3D Point Cloud Object Tracking · AAAI 2024
Parameter-Inverted Image Pyramid Networks · NIPS 2024
LLMs Meet VLMs: Boost Open Vocabulary Object Detection with Fine-grained Descriptors · ICLR 2024
ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process · ICLR 2024
ControlLLM: Augment Language Models with Tools by Searching on Graphs · ECCV 2024
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World · ECCV 2024
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks · NIPS 2024
Distilling Focal Knowledge From Imperfect Expert for 3D Object Detection · CVPR 2023
Planning-Oriented Autonomous Driving · CVPR 2023
Scene as Occupancy · ICCV 2023
Towards All-in-One Pre-Training via Maximizing Multi-Modal Mutual Information · CVPR 2023
BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision · CVPR 2023
InternImage: Exploring Large-Scale Vision Foundation Models With Deformable Convolutions · CVPR 2023
FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting · ICCV 2021
Deformable DETR: Deformable Transformers for End-to-End Object Detection · ICLR 2021
VL-BERT: Pre-training of Generic Visual-Linguistic Representations · ICLR 2020