Papers
AeDet: Azimuth-Invariant Multi-View 3D Object Detection
Chengjian Feng, Zequn Jie, Yujie Zhong et al.
Affection: Learning Affective Explanations for Real-World Visual Data
Panos Achlioptas, Maks Ovsjanikov, Leonidas Guibas et al.
Affordance Diffusion: Synthesizing Hand-Object Interactions
Yufei Ye, Xueting Li, Abhinav Gupta et al.
Affordance Grounding From Demonstration Video To Target Image
Joya Chen, Difei Gao, Kevin Qinghong Lin et al.
Affordances From Human Videos as a Versatile Representation for Robotics
Shikhar Bahl, Russell Mendonca, Lili Chen et al.
AGAIN: Adversarial Training With Attribution Span Enlargement and Hybrid Feature Fusion
Shenglin Yin, Kelu Yao, Sheng Shi et al.
A Generalized Framework for Video Instance Segmentation
Miran Heo, Sukjun Hwang, Jeongseok Hyun et al.
A General Regret Bound of Preconditioned Gradient Method for DNN Training
Hongwei Yong, Ying Sun, Lei Zhang
A Hierarchical Representation Network for Accurate and Detailed Face Reconstruction From In-the-Wild Images
Biwen Lei, Jianqiang Ren, Mengyang Feng et al.
A-La-Carte Prompt Tuning (APT): Combining Distinct Data via Composable Prompting
Benjamin Bowman, Alessandro Achille, Luca Zancato et al.
A Large-Scale Homography Benchmark
Daniel Barath, Dmytro Mishkin, Michal Polic et al.
A Large-Scale Robustness Analysis of Video Action Recognition Models
Madeline Chantry Schiappa, Naman Biyani, Prudvi Kamtam et al.
Alias-Free Convnets: Fractional Shift Invariance via Polynomial Activations
Hagay Michaeli, Tomer Michaeli, Daniel Soudry
A Light Touch Approach to Teaching Transformers Multi-View Geometry
Yash Bhalgat, João F. Henriques, Andrew Zisserman
A Light Weight Model for Active Speaker Detection
Junhua Liao, Haihan Duan, Kanghui Feng et al.
Align and Attend: Multimodal Summarization With Dual Contrastive Losses
Bo He, Jun Wang, Jielin Qiu et al.
AligNeRF: High-Fidelity Neural Radiance Fields via Alignment-Aware Training
Yifan Jiang, Peter Hedman, Ben Mildenhall et al.
Aligning Bag of Regions for Open-Vocabulary Object Detection
Size Wu, Wenwei Zhang, Sheng Jin et al.
Aligning Step-by-Step Instructional Diagrams to Video Demonstrations
Jiahao Zhang, Anoop Cherian, Yanbin Liu et al.
Align Your Latents: High-Resolution Video Synthesis With Latent Diffusion Models
Andreas Blattmann, Robin Rombach, Huan Ling et al.
All Are Worth Words: A ViT Backbone for Diffusion Models
Fan Bao, Shen Nie, Kaiwen Xue et al.
All-in-Focus Imaging From Event Focal Stack
Hanyue Lou, Minggui Teng, Yixin Yang et al.
All in One: Exploring Unified Video-Language Pre-Training
Jinpeng Wang, Yixiao Ge, Rui Yan et al.
All-in-One Image Restoration for Unknown Degradations Using Adaptive Discriminative Filters for Specific Degradations
Dongwon Park, Byung Hyun Lee, Se Young Chun
ALOFT: A Lightweight MLP-Like Architecture With Dynamic Low-Frequency Transform for Domain Generalization
Jintao Guo, Na Wang, Lei Qi et al.