Papers
DASH: Warm-Starting Neural Network Training in Stationary Settings without Loss of Plasticity
Baekrok Shin, Junsoo Oh, Hanseul Cho et al.
Data Acquisition via Experimental Design for Data Markets
Charles Lu, Baihe Huang, Sai Praneeth Karimireddy et al.
Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
Sheng-Yu Wang, Aaron Hertzmann, Alexei A. Efros et al.
Data Augmentation with Diffusion for Open-Set Semi-Supervised Learning
Seonghyun Ban, Heesan Kong, Kee-Eung Kim
DataComp-LM: In search of the next generation of training sets for language models
Jeffrey Li, Alex Fang, Georgios Smyrnis et al.
Data curation via joint example selection further accelerates multimodal learning
Talfan Evans, Nikhil Parthasarathy, Hamza Merzić et al.
Data Distribution Valuation
Xinyi Xu, Shuaiqi Wang, Chuan-Sheng Foo et al.
Data-Driven Discovery of Dynamical Systems in Pharmacology using Large Language Models
Samuel Holt, Zhaozhi Qian, Tennison Liu et al.
Data-Efficient Learning with Neural Programs
Alaia Solko-Breslin, Seewon Choi, Ziyang Li et al.
Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning
Wuyang Chen, Jialin Song, Pu Ren et al.
Data-faithful Feature Attribution: Mitigating Unobservable Confounders via Instrumental Variables
Qiheng Sun, Haocheng Xia, Jinfei Liu
Data Free Backdoor Attacks
Bochuan Cao, Jinyuan Jia, Chuxuan Hu et al.
Data Mixture Inference Attack: BPE Tokenizers Reveal Training Data Compositions
Jonathan Hayase, Alisa Liu, Yejin Choi et al.
Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition
Edoardo Debenedetti, Javier Rando, Daniel Paleka et al.
Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
Hadi Pouransari, Chun-Liang Li, Jen-Hao Rick Chang et al.
DataStealing: Steal Data from Diffusion Models in Federated Learning with Multiple Trojans
Yuan Gan, Jiaxu Miao, Yi Yang
Data subsampling for Poisson regression with pth-root-link
Han Cheng Lie, Alexander Munteanu
DAT: Improving Adversarial Robustness via Generative Amplitude Mix-up in Frequency Domain
Fengpeng Li, Kemou Li, Haiwei Wu et al.
DCDepth: Progressive Monocular Depth Estimation in Discrete Cosine Domain
Kun Wang, Zhiqiang Yan, Junkai Fan et al.
DC-Gaussian: Improving 3D Gaussian Splatting for Reflective Dash Cam Videos
Linhan Wang, Kai Cheng, Shuo Lei et al.
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
Haoran Que, Jiaheng Liu, Ge Zhang et al.
DDGS-CT: Direction-Disentangled Gaussian Splatting for Realistic Volume Rendering
Zhongpai Gao, Benjamin Planche, Meng Zheng et al.
DDK: Distilling Domain Knowledge for Efficient Large Language Models
Jiaheng Liu, Chenchen Zhang, Jinyang Guo et al.
DDN: Dual-domain Dynamic Normalization for Non-stationary Time Series Forecasting
Tao Dai, Beiliang Wu, Peiyuan Liu et al.
DDR: Exploiting Deep Degradation Response as Flexible Image Descriptor
Juncheng Wu, Zhangkai Ni, Hanli Wang et al.