conftrace_

Gedas Bertasius

45 papers · 2015–2026 · 10 conferences · across top CS/AI conferences

Achievements

🏃 Academic Marathon (11)
🌍 Conference Polyglot (10)
🧭 Keyword Pioneer
🌉 Interdisciplinary Bridge
🐝 Cross-Pollinator (7)
🌈 Renaissance Researcher (6)
🗺️ Taxonomy Completionist (66)
🔬 Deep Specialist (13)
🤝 Dynamic Duo (19)
👥 Mega-Team (100)
⚡ Prolific Year (5)
🚀 Conference Pioneer
📈 Trend Setter
🗃️ Keyword Collector (184)
💎 Century Club (45)
🔥 Unstoppable (12)
❓ The Questioner (2)

Conferences

CVPR (18) · ECCV (8) · WACV (6) · ICCV (5) · EMNLP (2) · NIPS (2) · ACL (1) · AISTATS (1) · ICML (1) · RSS (1)

Papers

Zero-Shot Audio-Visual Editing via Cross-Modal Delta Denoising · WACV 2026
Enhancing Visual Planning with Auxiliary Tasks and Multi-token Prediction · WACV 2026
TimeRefine: Temporal Grounding with Time Refining Video LLM · WACV 2026
VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos · CVPR 2025
VMAs: Video-to-Music Generation via Semantic Alignment in Web Music Videos · WACV 2025
Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning · EMNLP 2025
ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos · CVPR 2025
DAM: Dynamic Adapter Merging for Continual Video QA Learning · WACV 2025
BASKET: A Large-Scale Video Dataset for Fine-Grained Skill Estimation · CVPR 2025
BIMBA: Selective-Scan Compression for Long-Range Video Question Answering · CVPR 2025
4Diff: 3D-Aware Diffusion Model for Third-to-First Viewpoint Translation · ECCV 2024
A Simple LLM Framework for Long-Range Video Question-Answering · EMNLP 2024
Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences · ACL 2024
LoCoNet: Long-Short Context Network for Active Speaker Detection · CVPR 2024
Video ReCap: Recursive Captioning of Hour-Long Videos · CVPR 2024
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives · CVPR 2024
Siamese Vision Transformers are Scalable Audio-visual Learners · ECCV 2024
Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos · ECCV 2024
RGNet: A Unified Clip Retrieval and Grounding Network for Long Videos · ECCV 2024
Vision Transformers Are Parameter-Efficient Audio-Visual Learners · CVPR 2023
Unified Coarse-to-Fine Alignment for Video-Text Retrieval · ICCV 2023
SimpleClick: Interactive Image Segmentation with Simple Vision Transformers · ICCV 2023
Efficient Movie Scene Detection Using State-Space Transformers · CVPR 2023
VindLU: A Recipe for Effective Video-and-Language Pretraining · CVPR 2023
Learning To Recognize Procedural Activities With Distant Supervision · CVPR 2022
Long-Short Temporal Contrastive Learning of Video Transformers · CVPR 2022
ECLIPSE: Efficient Long-Range Video Retrieval Using Sight and Sound · ECCV 2022
TALLFormer: Temporal Action Localization with a Long-Memory Transformer · ECCV 2022
Long Movie Clip Classification with State-Space Video Models · ECCV 2022
Supervoxel Attention Graphs for Long-Range Video Modeling · WACV 2021
Vx2Text: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs · CVPR 2021
Is Space-Time Attention All You Need for Video Understanding? · ICML 2021
COBE: Contextualized Object Embeddings from Narrated Instructional Video · NIPS 2020
Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation · CVPR 2020
Learning Temporal Pose Estimation from Sparsely-Labeled Videos · NIPS 2019
Object Detection in Video with Spatiotemporal Sampling Networks · ECCV 2018
Egocentric Basketball Motion Planning From a Single First-Person Image · CVPR 2018
First-Person Action-Object Detection with EgoNet · RSS 2017
Unsupervised Learning of Important Objects From First-Person Videos · ICCV 2017
Am I a Baller? Basketball Performance Assessment From First-Person Videos · ICCV 2017
Convolutional Random Walk Networks for Semantic Image Segmentation · CVPR 2017
Local Perturb-and-MAP for Structured Prediction · AISTATS 2017
Semantic Segmentation With Boundary Neural Fields · CVPR 2016
DeepEdge: A Multi-Scale Bifurcated Deep Network for Top-Down Contour Detection · CVPR 2015
High-for-Low and Low-for-High: Efficient Boundary Detection From Deep Object Features and its Applications to High-Level Vision · ICCV 2015