Tong Lu

43 papers · 2017–2026 · 7 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (12) 🌈 Renaissance Researcher (7) 🏃 Academic Marathon (8) 🌍 Conference Polyglot (7)

🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (69) 🧭 Keyword Pioneer 🤝 Dynamic Duo (20) 🧬 Topic Evolution 👥 Mega-Team (38) ⚡ Prolific Year (8) 🚀 Conference Pioneer 💎 Century Club (41) 📈 Trend Setter 🔥 Unstoppable (9) 🗃️ Keyword Collector (190) ❓ The Questioner

Conferences

AAAI (11) CVPR (9) ICCV (8) ICLR (5) IJCAI (4) ECCV (3) NIPS (3)

Top co-authors

Wenhai Wang (20) Zhe Chen (13) Jifeng Dai (11) Yu Qiao (10) Ping Luo (8) Xizhou Zhu (8) Enze Xie (8) Lewei Lu (7) Guo Chen (6) Zhiqi Li (6)

Keywords

semantic segmentation (9) convolutional neural network (7) transformer architecture (4) image generation (3) object detection (3) video understanding (3) efficient computing (2) action recognition (2) vision-language model (2) 3d vision (2) instance segmentation (2) temporal modeling (2) deformable convolution (2) representation learning (2) video recognition (2) scene text detection (2) transfer learning (2) image segmentation (2) image restoration (2) multimodal large language model (2)

Papers

Task-Aware Meta-Learning on Heterogeneous Knowledge Graph for POI Recommendation AAAI 2026
SciMKG: A Multimodal Knowledge Graph for Science Education with Text, Image, Video and Audio AAAI 2026
Deconfound Semantic Shift and Incompleteness in Incremental Few-shot Semantic Segmentation AAAI 2025
Egocentric Object-Interaction Anticipation with Retentive and Predictive Learning IJCAI 2025
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text ICLR 2025
MOERL: When Mixture-of-Experts Meet Reinforcement Learning for Adverse Weather Image Restoration ICCV 2025
CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding ICLR 2025
Docopilot: Improving Multimodal Models for Document-Level Understanding CVPR 2025
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures ICLR 2025
RepKPU: Point Cloud Upsampling with Kernel Point Representation and Deformation CVPR 2024
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks CVPR 2024
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications CVPR 2024
The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World ICLR 2024
CRA-PCN: Point Cloud Completion with Intra- and Inter-level Cross-Resolution Transformers AAAI 2024
AVSegFormer: Audio-Visual Segmentation with Transformer AAAI 2024
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks NIPS 2024
Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving? CVPR 2024
FB-BEV: BEV Representation from Forward-Backward View Transformations ICCV 2023
Vision Transformer Adapter for Dense Predictions ICLR 2023
VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks NIPS 2023
Ultra-High-Definition Low-Light Image Enhancement: A Benchmark and Transformer-Based Method AAAI 2023
Graph Propagation Transformer for Graph Representation Learning IJCAI 2023
InternImage: Exploring Large-Scale Vision Foundation Models With Deformable Convolutions CVPR 2023
Memory-and-Anticipation Transformer for Online Action Understanding ICCV 2023
DDP: Diffusion Model for Dense Visual Prediction ICCV 2023
Panoptic SegFormer: Delving Deeper Into Panoptic Segmentation With Transformers CVPR 2022
Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization AAAI 2022
DCAN: Improving Temporal Action Detection via Dual Context Aggregation AAAI 2022
BEVFormer: Learning Bird’s-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers ECCV 2022
SeedFormer: Patch Seeds Based Point Cloud Completion with Upsample Transformer ECCV 2022
TAM: Temporal Adaptive Module for Video Recognition ICCV 2021
Frequency Consistent Adaptation for Real World Super Resolution AAAI 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction Without Convolutions ICCV 2021
Spectrum-to-Kernel Translation for Accurate Blind Image Super-Resolution NIPS 2021
Adaptive Graph Convolution for Point Cloud Analysis ICCV 2021
TEINet: Towards an Efficient Architecture for Video Recognition AAAI 2020
AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting ECCV 2020
On Reinforcement Learning for Full-Length Game of StarCraft AAAI 2019
Shape Robust Text Detection With Progressive Scale Expansion Network CVPR 2019
Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network ICCV 2019
Mixed Link Networks IJCAI 2018
Temporal Action Localization by Structured Maximal Sums CVPR 2017
Deep-dense Conditional Random Fields for Object Co-segmentation IJCAI 2017