Weijia Li

25 papers · 2021–2026 · 9 top CS/AI conferences

Achievements


🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🏃 Academic Marathon (5) 🌍 Conference Polyglot (8) 🗺️ Taxonomy Completionist (43)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🤝 Dynamic Duo (18) 🏆 Keyword Champion (3) ❓ The Questioner (2) ⚡ Prolific Year (9) 💎 Century Club (23) 🗃️ Keyword Collector (109)

Conferences

CVPR (6) ICCV (6) AAAI (5) ACL (2) ECCV (2) EACL (1) EMNLP (1) ICLR (1) ICML (1)

Top co-authors

Conghui He (19) Junyan Ye (8) Dahua Lin (7) Huaping Zhong (6) Jinhua Yu (5) Linfeng Zhang (4) Gui-Song Xia (4) Haote Yang (4) Bin Wang (4) Baichuan Zhou (3)

Keywords

semantic segmentation (5) remote sensing (4) diffusion model (3) multimodal learning (3) building segmentation (3) multimodal large language model (2) monocular image (2) building extraction (2) satellite imagery (2) token pruning (2) height estimation (2) large multimodal model (2) building reconstruction (2) visual question answering (2) multi-view learning (2) parameter-efficient fine-tuning (2) large language model (2) scene understanding (1) change detection (1) image segmentation (1)

Papers

RoZO: Geometry-Aware Zeroth-Order Fine-Tuning on Low-Rank Adapters for Black-Box Large Language Models · EACL 2026
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing · ACL 2026
Token Pruning in Multimodal Large Language Models: Are We Solving the Right Problem? · ACL 2025
Stop Looking for “Important Tokens” in Multimodal Language Models: Duplication Matters More · EMNLP 2025
Leveraging BEV Paradigm for Ground-to-Aerial Image Synthesis · ICCV 2025
Where am I? Cross-View Geo-localization with Natural Language Descriptions · ICCV 2025
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models · ICLR 2025
VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis · AAAI 2025
Scene4U: Hierarchical Layered 3D Scene Reconstruction from Single Panoramic Image for Your Immerse Exploration · CVPR 2025
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios · AAAI 2025
LEGION: Learning to Ground and Explain for Synthetic Image Detection · ICCV 2025
Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network · ECCV 2024
VIGC: Visual Instruction Generation and Correction · AAAI 2024
3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions · CVPR 2024
SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation · CVPR 2024
Building Bridges across Spatial and Temporal Resolutions: Reference-Based Super-Resolution via Change Priors and Conditional Diffusion Model · CVPR 2024
Parrot Captions Teach CLIP to Spot Text · ECCV 2024
AutoOS: Make Your OS More Powerful by Exploiting Large Language Models · ICML 2024
An In-Depth Exploration of Person Re-Identification and Gait Recognition in Cloth-Changing Conditions · CVPR 2023
SEPT: Towards Scalable and Efficient Visual Pre-training · AAAI 2023
OmniCity: Omnipotent City Understanding With Multi-Level and Multi-View Images · CVPR 2023
Large-Scale Land Cover Mapping with Fine-Grained Classes via Class-Aware Semi-Supervised Semantic Segmentation · ICCV 2023
Influence Selection for Active Learning · ICCV 2021
Joint Semantic-geometric Learning for Polygonal Building Segmentation · AAAI 2021
3D Building Reconstruction From Monocular Remote Sensing Images · ICCV 2021