Papers
PS3: A Multimodal Transformer Integrating Pathology Reports with Histology Images and Biological Pathways for Cancer Survival Prediction
Manahil Raza, Ayesha Azam, Talha Qaiser et al.
PseudoMapTrainer: Learning Online Mapping without HD Maps
Christian Löwens, Thorben Funke, Jingchao Xie et al.
Pseudo-SD: Pseudo Controlled Stable Diffusion for Semi-Supervised and Cross-Domain Semantic Segmentation
Dong Zhao, Qi Zang, Shuang Wang et al.
PS-Mamba: Spatial-Temporal Graph Mamba for Pose Sequence Refinement
Haoye Dong, Gim Hee Lee
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation
Rongyao Fang, Chengqi Duan, Kun Wang et al.
PUMPS: Skeleton-Agnostic Point-based Universal Motion Pre-Training for Synthesis in Human Motion Tasks
Clinton Ansun Mo, Kun Hu, Chengjiang Long et al.
Punching Bag vs. Punching Person: Motion Transferability in Videos
Raiyaan Abdullah, Jared Claypoole, Michael Cogswell et al.
Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics
Ruining Li, Chuanxia Zheng, Christian Rupprecht et al.
Purge-Gate: Backpropagation-Free Test-Time Adaptation for Point Clouds Classification via Token Purging
Moslem Yazdanpanah, Ali Bahri, Mehrdad Noori et al.
Puzzle Similarity: A Perceptually-guided Cross-Reference Metric for Artifact Detection in 3D Scene Reconstructions
Nicolai Hermann, Jorge Condor, Piotr Didyk
PVChat: Personalized Video Chat with One-Shot Learning
Yufei Shi, Weilong Yan, Gang Xu et al.
PVMamba: Parallelizing Vision Mamba via Dynamic State Aggregation
Fei Xie, Zhongdao Wang, Weijia Zhang et al.
Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs
Shaojie Zhang, Jiahui Yang, Jianqin Yin et al.
QK-Edit: Revisiting Attention-based Injection in MM-DiT for Image and Video Editing
Tiancheng Shen, Zilong Huang, Xiangtai Li et al.
Q-Norm: Robust Representation Learning via Quality-Adaptive Normalization
Lanning Zhang, Ying Zhou, Fei Gao et al.
QR-LoRA: Efficient and Disentangled Fine-tuning via QR Decomposition for Customized Generation
Jiahui Yang, Yongjia Ma, Donglin Di et al.
Quadratic Gaussian Splatting: High Quality Surface Reconstruction with Second-order Geometric Primitives
Ziyu Zhang, Binbin Huang, Hanqing Jiang et al.
Quanta Neural Networks: From Photons to Perception
Varun Sundar, Tianyi Zhang, Sacha Jungerman et al.
QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation
Junyi Wu, Zhiteng Li, Zheng Hui et al.
Quantifying and Narrowing the Unknown: Interactive Text-to-Video Retrieval via Uncertainty Minimization
Bingqing Zhang, Zhuo Cao, Heming Du et al.
QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning
Haoxuan Wang, Yuzhang Shang, Zhihang Yuan et al.
QuickSplat: Fast 3D Surface Reconstruction via Learned Gaussian Initialization
Yueh-Cheng Liu, Lukas Höllein, Matthias Nießner et al.
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization
Yi Yang, Xiaoxuan He, Hongkun Pan et al.
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Jingyi Zhang, Jiaxing Huang, Huanjin Yao et al.
RA-BUSSeg: Relation-aware Semi-supervised Breast Ultrasound Image Segmentation via Adjacent Propagation and Cross-layer Alignment
Wanting Zhang, Zhenhui Ding, Guilian Chen et al.