Papers
Is CLIP ideal? No. Can we fix it? Yes!
Raphi Kang, Yue Song, Georgia Gkioxari et al.
Is Less More? Exploring Token Condensation as Training-free Test-time Adaptation
Zixin Wang, Dong Gong, Sen Wang et al.
Is Meta-Learning Out? Rethinking Unsupervised Few-Shot Classification with Limited Entropy
Yunchuan Guan, Yu Liu, Ke Zhou et al.
ISP2HRNet: Learning to Reconstruct High Resolution Image from Irregularly Sampled Pixels via Hierarchical Gradient Learning
Yuanlin Wang, Ruiqin Xiong, Rui Zhao et al.
Is Tracking Really More Challenging in First Person Egocentric Vision?
Matteo Dunnhofer, Zaira Manigrasso, Christian Micheloni
Is Visual in-Context Learning for Compositional Medical Tasks within Reach?
Simon Reiß, Zdravko Marinov, Alexander Jaus et al.
JailbreakDiffBench: A Comprehensive Benchmark for Jailbreaking Diffusion Models
Xiaolong Jin, Zixuan Weng, Hanxi Guo et al.
Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency
Shiji Zhao, Ranjie Duan, Fengxiang Wang et al.
Jigsaw++: Imagining Complete Shape Priors for Object Reassembly
Jiaxin Lu, Gang Hua, Qixing Huang
Joint Asymmetric Loss for Learning with Noisy Labels
Jialiang Wang, Xianming Liu, Xiong Zhou et al.
Joint Diffusion Models in Continual Learning
Paweł Skierś, Kamil Deja
JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers
Kwon Byung-Ki, Qi Dai, Lee Hyoseok et al.
Joint Self-Supervised Video Alignment and Action Segmentation
Ali Shah Ali, Syed Ahmed Mahmood, Mubin Saeed et al.
Joint Semantic and Rendering Enhancements in 3D Gaussian Modeling with Anisotropic Local Encoding
Jingming He, Chongyi Li, Shiqi Wang et al.
JPEG Processing Neural Operator for Backward-Compatible Coding
Woo Kyoung Han, Yongjun Lee, Byeonghun Lee et al.
Kaleidoscopic Background Attack: Disrupting Pose Estimation with Multi-Fold Radial Symmetry Textures
Xinlong Ding, Hongwei Yu, Jiawei Li et al.
Kaputt: A Large-Scale Dataset for Visual Defect Detection
Sebastian Höfer, Dorian F. Henning, Artemij Amiranashvili et al.
KDA: Knowledge Diffusion Alignment with Enhanced Context for Video Temporal Grounding
Ran Ran, Jiwei Wei, Shiyuan He et al.
Keep Your Friends Close, and Your Enemies Farther: Distance-aware Voxel-wise Contrastive Learning for Semi-supervised Multi-organ Segmentation
Haochen Zhao, Jianwei Niu, Xuefeng Liu et al.
Kestrel: 3D Multimodal LLM for Part-Aware Grounded Description
Mahmoud Ahmed, Junjie Fei, Jian Ding et al.
Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing
Yudong Liu, Jingwei Sun, Yueqian Lin et al.
χ: Symmetry Understanding of 3D Shapes via Chirality Disentanglement
Weikang Wang, Tobias Weißberg, Nafie El Amrani et al.
KinMo: Kinematic-aware Human Motion Understanding and Generation
Pengfei Zhang, Pinxin Liu, Pablo Garrido et al.
Knowledge Distillation for Learned Image Compression
Yunuo Chen, Zezheng Lyu, Bing He et al.