Papers
What Can Be Learnt With Wide Convolutional Neural Networks?
Francesco Cagnetta, Alessandro Favero, Matthieu Wyart
What can online reinforcement learning with function approximation benefit from general coverage conditions?
Fanghui Liu, Luca Viano, Volkan Cevher
What do CNNs Learn in the First Layer and Why? A Linear Systems Perspective
Rhea Chowers, Yair Weiss
What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL?
Rui Yang, Lin Yong, Xiaoteng Ma et al.
What Makes Entities Similar? A Similarity Flooding Perspective for Multi-sourced Knowledge Graph Embeddings
Zequn Sun, Jiacheng Huang, Xiaozhou Xu et al.
When and How Does Known Class Help Discover Unknown Ones? Provable Understanding Through Spectral Analysis
Yiyou Sun, Zhenmei Shi, Yingyu Liang et al.
When does Privileged information Explain Away Label Noise?
Guillermo Ortiz-Jimenez, Mark Collier, Anant Nawalgaria et al.
When do Minimax-fair Learning and Empirical Risk Minimization Coincide?
Harvineet Singh, Matthäus Kleindessner, Volkan Cevher et al.
When Personalization Harms Performance: Reconsidering the Use of Group Attributes in Prediction
Vinith Menon Suriyakumar, Marzyeh Ghassemi, Berk Ustun
When Sparsity Meets Contrastive Models: Less Graph Data Can Bring Better Class-Balanced Representations
Chunhui Zhang, Chao Huang, Yijun Tian et al.
Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression
Yihao Xue, Siddharth Joshi, Eric Gan et al.
Which Invariance Should We Transfer? A Causal Minimax Learning Approach
Mingzhou Liu, Xiangyu Zheng, Xinwei Sun et al.
Which is Better for Learning with Noisy Labels: The Semi-supervised Method or Modeling Label Noise?
Yu Yao, Mingming Gong, Yuxuan Du et al.
Which Tricks are Important for Learning to Rank?
Ivan Lyzhin, Aleksei Ustimenko, Andrey Gulin et al.
Who Needs to Know? Minimal Knowledge for Optimal Coordination
Niklas Lauffer, Ameesh Shah, Micah Carroll et al.
Whose Opinions Do Language Models Reflect?
Shibani Santurkar, Esin Durmus, Faisal Ladhak et al.
"Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts
Haoran Zhang, Harvineet Singh, Marzyeh Ghassemi et al.
Why does Throwing Away Data Improve Worst-Group Error?
Kamalika Chaudhuri, Kartik Ahuja, Martin Arjovsky et al.
Why do Nearest Neighbor Language Models Work?
Frank F. Xu, Uri Alon, Graham Neubig
Why Is Public Pretraining Necessary for Private Model Training?
Arun Ganesh, Mahdi Haghifam, Milad Nasr et al.
Why Random Pruning Is All We Need to Start Sparse
Advait Harshal Gadhikar, Sohom Mukherjee, Rebekka Burkholz
Why Target Networks Stabilise Temporal Difference Methods
Mattie Fellows, Matthew J. A. Smith, Shimon Whiteson
Width and Depth Limits Commute in Residual Networks
Soufiane Hayou, Greg Yang
WL meet VC
Christopher Morris, Floris Geerts, Jan Tönshoff et al.