Jie Lei
41 papers · 2018–2025 · 12 top CS/AI conferences
Achievements
Hot Topic Early Bird
Keyword Pioneer
Taxonomy Completionist (10)
Interdisciplinary Bridge
Conference Polyglot (12)
Cross-Pollinator (11)
Renaissance Researcher (10)
Mega-Team (34)
Keyword Champion (2)
Dynamic Duo (18)
Keyword Collector (185)
The Questioner
Prolific Year (9)
Conference Pioneer
Trend Setter
Century Club (41)
Unstoppable (6)
Conferences
AAAI (6)
CVPR (6)
NIPS (6)
ACL (4)
ECCV (4)
EMNLP (3)
ICCV (3)
NAACL (3)
ICML (2)
IJCAI (2)
IJCNLP (1)
WACV (1)
Keywords
multimodal learning (12)
video understanding (7)
video question answering (6)
video captioning (4)
temporal modeling (3)
adversarial learning (3)
text-to-video retrieval (3)
object detection (3)
neural network (3)
self-supervised learning (3)
deepfake detection (2)
unsupervised learning (2)
few-shot learning (2)
vision transformer (2)
video retrieval (2)
deep learning (2)
model compression (2)
image captioning (2)
transformer architecture (2)
zero-shot learning (2)
Papers
STD-FD: Spatio-Temporal Distribution Fitting Deviation for AIGC Forgery Identification
ICML 2025
Spatial-Temporal Forgery Trace based Forgery Image Identification
ICCV 2025
Association Pattern-enhanced Molecular Representation Learning
AAAI 2025
CorrDetail: Visual Detail Enhanced Self-Correction for Face Forgery Detection
IJCAI 2025
Domain Adaptation for Large-Vocabulary Object Detectors
NIPS 2024
UNICORN: A Unified Causal Video-Oriented Language-Modeling Framework for Temporal Video-Language Tasks
EMNLP 2024
ViT-Calibrator: Decision Stream Calibration for Vision Transformer
AAAI 2024
Angle Robustness Unmanned Aerial Vehicle Navigation in GNSS-Denied Scenarios
AAAI 2024
JointSQ: Joint Sparsification-Quantization for Distributed Learning
CVPR 2024
DA-BEV: Unsupervised Domain Adaptation for Bird's Eye View Perception
ECCV 2024
SumCSE: Summary as a transformation for Contrastive Learning
NAACL 2024
E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection
NIPS 2024
Dual-Perspective Activation: Efficient Channel Denoising via Joint Forward-Backward Criterion for Artificial Neural Networks
NIPS 2024
Perceiver-VL: Efficient Vision-and-Language Modeling With Iterative Latent Attention
WACV 2023
Revealing Single Frame Bias for Video-and-Language Learning
ACL 2023
A Loopback Network for Explainable Microvascular Invasion Classification
CVPR 2023
Vision Transformers Are Parameter-Efficient Audio-Visual Learners
CVPR 2023
Toward Stable, Interpretable, and Lightweight Hyperspectral Super-Resolution
CVPR 2023
VindLU: A Recipe for Effective Video-and-Language Pretraining
CVPR 2023
ECLIPSE: Efficient Long-Range Video Retrieval Using Sight and Sound
ECCV 2022
RESIN-11: Schema-guided Event Prediction for 11 Newsworthy Scenarios
NAACL 2022
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
NIPS 2022
Transcoded Video Restoration by Temporal Spatial Auxiliary Network
AAAI 2022
Mutual-Complementing Framework for Nuclei Detection and Segmentation in Pathology Image
ICCV 2021
Unifying Vision-and-Language Tasks via Text Generation
ICML 2021
Edge-competing Pathological Liver Vessel Segmentation with Limited Labels
AAAI 2021
Boundary Knowledge Translation based Reference Semantic Segmentation
IJCAI 2021
mTVR: Multilingual Moment Retrieval in Videos
ACL 2021
mTVR: Multilingual Moment Retrieval in Videos
IJCNLP 2021
Less Is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
CVPR 2021
DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization
NAACL 2021
Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models
ICCV 2021
Detecting Moments and Highlights in Videos via Natural Language Queries
NIPS 2021
LREN: Low-Rank Embedded Network for Sample-Free Hyperspectral Anomaly Detection
AAAI 2021
What is More Likely to Happen Next? Video-and-Language Future Event Prediction
EMNLP 2020
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
ACL 2020
TVQA+: Spatio-Temporal Grounding for Video Question Answering
ACL 2020
One-sample Guided Object Representation Disassembling
NIPS 2020
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
ECCV 2020
TVQA: Localized, Compositional Video Question Answering
EMNLP 2018
Selective Zero-Shot Classification with Augmented Attributes
ECCV 2018