Jie Lei

41 papers · 2018–2025 · 12 top CS/AI conferences

Achievements

🐣 Hot Topic Early Bird 🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (10) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (12)

🐝 Cross-Pollinator (11) 🌈 Renaissance Researcher (10) 🧭 Keyword Pioneer 👥 Mega-Team (34) 🏆 Keyword Champion (2) 🤝 Dynamic Duo (18) 🗃️ Keyword Collector (185) ❓ The Questioner ⚡ Prolific Year (9) 🚀 Conference Pioneer 📈 Trend Setter 💎 Century Club (41) 🔥 Unstoppable (6)

Conferences

AAAI (6) CVPR (6) NIPS (6) ACL (4) ECCV (4) EMNLP (3) ICCV (3) NAACL (3) ICML (2) IJCAI (2) IJCNLP (1) WACV (1)

Top co-authors

Mohit Bansal (18) Zunlei Feng (12) Mingli Song (11) Yunsong Li (7) Weiying Xie (7) Tamara Berg (7) Lechao Cheng (4) Kai Jiang (4) Xinchao Wang (4) Licheng Yu (4)

Keywords

multimodal learning (12) video understanding (7) video question answering (6) video captioning (4) temporal modeling (3) adversarial learning (3) text-to-video retrieval (3) object detection (3) neural network (3) self-supervised learning (3) deepfake detection (2) unsupervised learning (2) few-shot learning (2) vision transformer (2) video retrieval (2) deep learning (2) model compression (2) image captioning (2) transformer architecture (2) zero-shot learning (2)

Papers

STD-FD: Spatio-Temporal Distribution Fitting Deviation for AIGC Forgery Identification ICML 2025
Spatial-Temporal Forgery Trace based Forgery Image Identification ICCV 2025
Association Pattern-enhanced Molecular Representation Learning AAAI 2025
CorrDetail: Visual Detail Enhanced Self-Correction for Face Forgery Detection IJCAI 2025
Domain Adaptation for Large-Vocabulary Object Detectors NIPS 2024
UNICORN: A Unified Causal Video-Oriented Language-Modeling Framework for Temporal Video-Language Tasks EMNLP 2024
ViT-Calibrator: Decision Stream Calibration for Vision Transformer AAAI 2024
Angle Robustness Unmanned Aerial Vehicle Navigation in GNSS-Denied Scenarios AAAI 2024
JointSQ: Joint Sparsification-Quantization for Distributed Learning CVPR 2024
DA-BEV: Unsupervised Domain Adaptation for Bird's Eye View Perception ECCV 2024
SumCSE: Summary as a transformation for Contrastive Learning NAACL 2024
E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection NIPS 2024
Dual-Perspective Activation: Efficient Channel Denoising via Joint Forward-Backward Criterion for Artificial Neural Networks NIPS 2024
Perceiver-VL: Efficient Vision-and-Language Modeling With Iterative Latent Attention WACV 2023
Revealing Single Frame Bias for Video-and-Language Learning ACL 2023
A Loopback Network for Explainable Microvascular Invasion Classification CVPR 2023
Vision Transformers Are Parameter-Efficient Audio-Visual Learners CVPR 2023
Toward Stable, Interpretable, and Lightweight Hyperspectral Super-Resolution CVPR 2023
VindLU: A Recipe for Effective Video-and-Language Pretraining CVPR 2023
ECLIPSE: Efficient Long-Range Video Retrieval Using Sight and Sound ECCV 2022
RESIN-11: Schema-guided Event Prediction for 11 Newsworthy Scenarios NAACL 2022
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners NIPS 2022
Transcoded Video Restoration by Temporal Spatial Auxiliary Network AAAI 2022
Mutual-Complementing Framework for Nuclei Detection and Segmentation in Pathology Image ICCV 2021
Unifying Vision-and-Language Tasks via Text Generation ICML 2021
Edge-competing Pathological Liver Vessel Segmentation with Limited Labels AAAI 2021
Boundary Knowledge Translation based Reference Semantic Segmentation IJCAI 2021
mTVR: Multilingual Moment Retrieval in Videos ACL 2021
mTVR: Multilingual Moment Retrieval in Videos IJCNLP 2021
Less Is More: ClipBERT for Video-and-Language Learning via Sparse Sampling CVPR 2021
DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization NAACL 2021
Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models ICCV 2021
Detecting Moments and Highlights in Videos via Natural Language Queries NIPS 2021
LREN: Low-Rank Embedded Network for Sample-Free Hyperspectral Anomaly Detection AAAI 2021
What is More Likely to Happen Next? Video-and-Language Future Event Prediction EMNLP 2020
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning ACL 2020
TVQA+: Spatio-Temporal Grounding for Video Question Answering ACL 2020
One-sample Guided Object Representation Disassembling NIPS 2020
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval ECCV 2020
TVQA: Localized, Compositional Video Question Answering EMNLP 2018
Selective Zero-Shot Classification with Augmented Attributes ECCV 2018