conftrace_

Yali Wang

55 papers · 2012–2026 · 9 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (9) 🏃 Academic Marathon (13) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird
🧭 Keyword Pioneer 🐝 Cross-Pollinator (15) 🗺️ Taxonomy Completionist (74) 🤝 Dynamic Duo (40) 👑 Triple Crown 🏆 Grand Slam 👥 Mega-Team (38) 🧬 Topic Evolution 🏆 Keyword Champion 🚀 Conference Pioneer 🗃️ Keyword Collector (190) ⚡ Prolific Year (10) 🔥 Unstoppable (10) 💎 Century Club (51)

Conferences

CVPR (17) AAAI (10) ICLR (9) ICCV (7) ECCV (5) ICML (3) NIPS (2) AISTATS (1) IJCAI (1)

Papers

VideoChat-A1: Thinking with Long Videos by Chain-of-Shot Reasoning · AAAI 2026
VRAgent-R1: Boosting Video Recommendation with MLLM-based Agents via Reinforcement Learning · AAAI 2026
When Top-ranked Recommendations Fail: Modeling Multi-Granular Negative Feedback for Explainable and Robust Video Recommendation · AAAI 2026
G-UBS: Towards Robust Understanding of Implicit Feedback via Group-Aware User Behavior Simulation · AAAI 2026
VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos · ICCV 2025
LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents · ICCV 2025
H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving · AAAI 2025
Muses: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration · AAAI 2025
TimeStep Master: Asymmetrical Mixture of Timestep LoRA Experts for Versatile and Efficient Diffusion Models in Vision · ICML 2025
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning · ICLR 2025
Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel · ICLR 2025
CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding · ICLR 2025
Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning · ICLR 2025
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text · ICLR 2025
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment · CVPR 2025
WeGen: A Unified Model for Interactive Multimodal Generation as We Chat · CVPR 2025
V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents · CVPR 2025
TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration · NIPS 2024
M-BEV: Masked BEV Perception for Robust Autonomous Driving · AAAI 2024
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding · ECCV 2024
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI · ICML 2024
VideoMamba: State Space Model for Efficient Video Understanding · ECCV 2024
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark · CVPR 2024
Vlogger: Make Your Dream A Vlog · CVPR 2024
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World · CVPR 2024
SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction · ICLR 2024
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation · ICLR 2024
UniFormerV2: Unlocking the Potential of Image ViTs for Video Understanding · ICCV 2023
VideoMAE V2: Scaling Video Masked Autoencoders With Dual Masking · CVPR 2023
Starting From Non-Parametric Networks for 3D Point Cloud Analysis · CVPR 2023
MM-3DScene: 3D Scene Understanding by Customizing Masked Modeling With Informative-Preserved Reconstruction and Self-Distilled Consistency · CVPR 2023
Unmasked Teacher: Towards Training-Efficient Video Foundation Models · ICCV 2023
HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation · ICCV 2023
Cross Domain Object Detection by Target-Perceived Dual Branch Distillation · CVPR 2022
Target-Relevant Knowledge Preservation for Multi-Source Domain Adaptive Object Detection · CVPR 2022
UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning · ICLR 2022
MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning · ECCV 2022
Dual-AI: Dual-Path Actor Interaction Learning for Group Activity Recognition · CVPR 2022
Self-Slimmed Vision Transformer · ECCV 2022
Digging Into Uncertainty in Self-Supervised Multi-View Stereo · ICCV 2021
PC-HMR: Pose Calibration for 3D Human Mesh Recovery from 2D Images/Videos · AAAI 2021
CT-Net: Channel Tensorization Network for Video Classification · ICLR 2021
SmallBigNet: Integrating Core and Contextual Views for Video Classification · CVPR 2020
Learning Attentive Pairwise Interaction for Fine-Grained Classification · AAAI 2020
Context-Transformer: Tackling Object Confusion for Few-Shot Detection · AAAI 2020
Mining Inter-Video Proposal Relations for Video Object Detection · ECCV 2020
Adaptive Pyramid Context Network for Semantic Segmentation · CVPR 2019
PA3D: Pose-Action 3D Machine for Video Recognition · CVPR 2019
MetaCleaner: Learning to Hallucinate Clean Representations for Noisy-Labeled Visual Recognition · CVPR 2019
Temporal Hallucinating for Action Recognition With Few Still Images · CVPR 2018
RPAN: An End-To-End Recurrent Pose-Attention Network for Action Recognition in Videos · ICCV 2017
Sequential Inference for Deep Gaussian Process · AISTATS 2016
Gaussian Processes for Bayesian Estimation in Ordinary Differential Equations · ICML 2014
A KNN Based Kalman Filter Gaussian Process Regression · IJCAI 2013
A Marginalized Particle Gaussian Process Regression · NIPS 2012