Papers
Diffusion Noise Optimization for Synthetic VLM Training
Ren Ohkubo, Rintaro Yanagi, Hirokatsu Kataoka et al.
Digital Forensic AI You Can Explain: A Case Study on Video Source Camera Identification
Maryna Veksler, Kemal Akkaya, Selcuk Uluagac
DirectDrag: High-Fidelity, Mask-Free, Prompt-Free Drag-based Image Editing via Readout-Guided Feature Alignment
Sheng-Hao Liao, Shang-Fu Chen, Tai-Ming Huang et al.
Direct Visual Grounding by Directing Attention of Visual Tokens
Parsa Esmaeilkhani, Longin Jan Latecki
DiRe: Diversity-promoting Regularization for Dataset Condensation
Saumyaranjan Mohanty, Aravind Reddy, Konda Reddy Mopuri
Discrete Facial Encoding: A Framework for Data-driven Facial Display Discovery
Minh Tran, Maksim Siniukov, Zhangyu Jin et al.
Disentangle and Regularize: Sign Language Production with Articulator-Based Disentanglement and Channel-Aware Regularization
Sümeyye Meryem Taşyürek, Tuğçe Kızıltepe, Hacer Yalim Keles
Distilling Diversity and Control in Diffusion Models
Rohit Gandikota, David Bau
Distilling Offline Action Detection Models into Real-Time Streaming Models
Deep Patel, Yasunori Babazaki, Yasuto Nagase et al.
Distilling What and Why: Enhancing Driver Intention Prediction with MLLMs
Sainithin Artham, Avijit Dasgupta, Shankar Gangisetty et al.
Distribution Highlighted Reference-based Label Distribution Learning for Facial Age Estimation
Satoshi Suzuki, Shin'ya Yamaguchi, Shoichiro Takeda et al.
DiT-VTON: Diffusion Transformer Framework for Unified Multi-Category Virtual Try-On and Virtual Try-All with Integrated Image Editing
Qi Li, Shuwen Qiu, Kee Kiat Koo et al.
Diverse Sketch Colorization with Content-Enhanced Style Representation and Recolorization Distillation
Shuangming Mao, Haixiang Zhu
Diversity Preserving Coresets for Image Quality Assessment
Arpita Nema, Hanwei Zhu, Xi Zhang et al.
Divide and Refine: Enhancing Multimodal Representation and Explainability for Emotion Recognition in Conversation
Anh-Tuan Mai, Cam-Van Thi Nguyen, Duc-Trong Le
DM3Net: Dual-Camera Super-Resolution via Domain Modulation and Multi-scale Matching
Cong Guan, Jiacheng Ying, Yuya Ieiri et al.
DMAT: An End-to-End Framework for Joint Atmospheric Turbulence Mitigation and Object Detection
Paul Hill, Zhiming Liu, Alin Achim et al.
DMS2F-HAD: A Dual-branch Mamba-based Spatial-Spectral Fusion Network for Hyperspectral Anomaly Detection
Aayushma Pant, Lakpa Tamang, Tsz-Kwan Lee et al.
DNA: Dual-branch Network with Adaptation for Open-Set Online Handwriting Generation
Tsai-Ling Huang, Nhat-Tuong Do-Tran, Ngoc-Hoang-Lam Le et al.
DocWaveDiff: A Predict-and-Refine approach for Document Image Enhancement with Wavelet U-Nets and Diffusion models
Matteo Marulli, Marco Bertini
DODA: Adapting Object Detectors to Dynamic Agricultural Environments in Real-Time with Diffusion
Shuai Xiang, Pieter M. Blok, James Burridge et al.
Do Generative Video Models Understand Physical Principles?
Saman Motamed, Laura Culp, Kevin Swersky et al.
Domain Generalizing DINO for Visual Regression via Latent Distractor Subspace Consistency
Nikhil Reddy, Chetan Arora, Mahsa Baktashmotlagh
DOODLE: Diffusion-based Out-of-Distribution Learning for Open-set LiDAR Semantic Segmentation
Changgyoon Oh, Hyeonseong Kim, Daehyun We et al.
DoTA: Latent Distribution Conditioned Data Attribution for Diffusion Models
Ninad Joshi, Vivek Srivastava, Shirish Karande