Papers
Large-scale Pre-training for Grounded Video Caption Generation
Evangelos Kazakos, Cordelia Schmid, Josef Sivic
Large Scene Generation with Cube-Absorb Discrete Diffusion
Qianjiang Hu, Wei Hu
Lark: Low-Rank Updates After Knowledge Localization for Few-shot Class-Incremental Learning
Jinxin Shi, Jiabao Zhao, Yifan Yang et al.
Latent Diffusion Models with Masked AutoEncoders
Junho Lee, Jeongwoo Shin, Hyungwook Choi et al.
Latent Expression Generation for Referring Image Segmentation and Grounding
Seonghoon Yu, Joonbeom Hong, Joonseok Lee et al.
Latent-Reframe: Enabling Camera Control for Video Diffusion Models without Training
Zhenghong Zhou, Jie An, Jiebo Luo
Latent Swap Joint Diffusion for 2D Long-Form Latent Generation
Yusheng Dai, Chenxi Wang, Chang Li et al.
LATINO-PRO: LAtent consisTency INverse sOlver with PRompt Optimization
Alessio Spagnoletti, Jean Prost, Andrés Almansa et al.
Latte: Collaborative Test-Time Adaptation of Vision-Language Models in Federated Learning
Wenxuan Bao, Ruxi Deng, Ruizhong Qiu et al.
LawDIS: Language-Window-based Controllable Dichotomous Image Segmentation
Xinyu Yan, Meijun Sun, Ge-Peng Ji et al.
Lay2Story: Extending Diffusion Transformers for Layout-Togglable Story Generation
Ao Ma, Jiasong Feng, Ke Cao et al.
LayerAnimate: Layer-level Control for Animation
Yuxue Yang, Lue Fan, Zuzeng Lin et al.
LayerD: Decomposing Raster Graphic Designs into Layers
Tomoyuki Suzuki, Kang-Jun Liu, Naoto Inoue et al.
LayerLock: Non-collapsing Representation Learning with Progressive Freezing
Goker Erdogan, Nikhil Parthasarathy, Catalin Ionescu et al.
LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer
Yiren Song, Danze Chen, Mike Zheng Shou
Layer-wise Vision Injection with Disentangled Attention for Efficient LVLMs
Xuange Zhang, Dengjie Li, Bo Liu et al.
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
Divyansh Srivastava, Xiang Zhang, He Wen et al.
LazyMAR: Accelerating Masked Autoregressive Models via Feature Caching
Feihong Yan, Qingyan Wei, Jiayi Tang et al.
LBM: Latent Bridge Matching for Fast Image-to-Image Translation
Clément Chadebec, Onur Tasar, Sanjeev Sreetharan et al.
LDIP: Long Distance Information Propagation for Video Super-Resolution
Michael Bernasconi, Abdelaziz Djelouah, Yang Zhang et al.
LDPose: Towards Inclusive Human Pose Estimation for Limb-Deficient Individuals in the Wild
Jiaying Ying, Heming Du, Kaihao Zhang et al.
LD-RPS: Zero-Shot Unified Image Restoration via Latent Diffusion Recurrent Posterior Sampling
Huaqiu Li, Yong Wang, Tongwen Huang et al.
LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models
Yu Cheng, Fajie Yuan
Leaps and Bounds: An Improved Point Cloud Winding Number Formulation for Fast Normal Estimation and Surface Reconstruction
Chamin Hewa Koneputugodage, Dylan Campbell, Stephen Gould
Learn2Synth: Learning Optimal Data Synthesis Using Hypergradients for Brain Image Segmentation
Xiaoling Hu, Xiangrui Zeng, Oula Puonti et al.