Dimitris Papailiopoulos

43 papers · 2013–2025 · 7 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (12) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (7) 🌈 Renaissance Researcher (7) 🤝 Dynamic Duo (15) 👑 Triple Crown 🏆 Keyword Champion 💎 Century Club (43) 🚀 Conference Pioneer 🔥 Unstoppable (8) ⚡ Prolific Year (5) ❓ The Questioner (4) 📈 Trend Setter 🗃️ Keyword Collector (154)

Conferences

ICML (16) NIPS (14) AISTATS (5) ICLR (5) ACL (1) COLT (1) EMNLP (1)

Top co-authors

Kangwook Lee (15) Shashank Rajput (12) Jy-yong Sohn (7) Hongyi Wang (7) Zachary Charles (6) Kartik Sreenivasan (6) Angeliki Giannou (5) Samet Oymak (5) Kannan Ramchandran (4) Megasthenis Asteris (4)

Keywords

stochastic gradient descent (3) in-context learning (3) relu network (3) neural network pruning (3) approximation algorithm (3) combinatorial optimization (3) distributed training (3) generalization bound (2) eigenvalue decay (2) adversarial perturbation (2) sparse pca (2) model compression (2) graph clustering (2) lottery ticket hypothesis (2) distributed learning (2) attention mechanism (2) data augmentation (2) gradient aggregation (2) sparse principal component analysis (2) correlation clustering (2)

Papers

Lexico: Extreme KV Cache Compression via Sparse Coding over Universal Dictionaries · ICML 2025
How Well Can Transformers Emulate In-Context Newton’s Method? · AISTATS 2025
Self-Improving Transformers Overcome Easy-to-Hard and Length Generalization Challenges · ICML 2025
From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data · ICLR 2025
VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data · ICML 2025
Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition · ICML 2025
Can Mamba Learn How To Learn? A Comparative Study on In-Context Learning Tasks · ICML 2024
Looped Transformers are Better at Learning Learning Algorithms · ICLR 2024
Teaching Arithmetic to Small Transformers · ICLR 2024
CHAI: Clustered Head Attention for Efficient LLM Inference · ICML 2024
Prompted LLMs as Chatbot Modules for Long Open-domain Conversation · ACL 2023
The Expressive Power of Tuning Only the Normalization Layers · COLT 2023
Transformers as Algorithms: Generalization and Stability in In-context Learning · ICML 2023
Looped Transformers as Programmable Computers · ICML 2023
Dissecting Chain-of-Thought: Compositionality through In-Context Filtering and Learning · NIPS 2023
Utilizing Language-Image Pretraining for Efficient and Robust Bilingual Word Alignment · EMNLP 2022
LIFT: Language-Interfaced Fine-Tuning for Non-language Machine Learning Tasks · NIPS 2022
Rare Gems: Finding Lottery Tickets at Initialization · NIPS 2022
Finding Nearly Everything within Random Binary Networks · AISTATS 2022
Permutation-Based SGD: Is Random Optimal? · ICLR 2022
GenLabel: Mixup Relabeling using Generative Models · ICML 2022
An Exponential Improvement on the Memorization Capacity of Deep Threshold Networks · NIPS 2021
Closing the convergence gap of SGD without replacement · ICML 2020
Optimal Lottery Tickets via Subset Sum: Logarithmic Over-Parameterization is Sufficient · NIPS 2020
Bad Global Minima Exist and SGD Can Reach Them · NIPS 2020
Federated Learning with Matched Averaging · ICLR 2020
Attack of the Tails: Yes, You Really Can Backdoor Federated Learning · NIPS 2020
Does Data Augmentation Lead to Positive Margin? · ICML 2019
DETOX: A Redundancy-based Framework for Faster and More Robust Gradient Aggregation · NIPS 2019
A Geometric Perspective on the Transferability of Adversarial Directions · AISTATS 2019
Gradient Diversity: a Key Ingredient for Scalable Distributed Learning · AISTATS 2018
Stability and Generalization of Learning Algorithms that Converge to Global Optima · ICML 2018
DRACO: Byzantine-resilient Distributed Training via Redundant Gradients · ICML 2018
The Effect of Network Width on the Performance of Large-batch Training · NIPS 2018
ATOMO: Communication-efficient Learning via Atomic Sparsification · NIPS 2018
Cyclades: Conflict-free Asynchronous Machine Learning · NIPS 2016
Bipartite Correlation Clustering: Maximizing Agreements · AISTATS 2016
Sparse PCA via Bipartite Matchings · NIPS 2015
Parallel Correlation Clustering on Big Graphs · NIPS 2015
Orthogonal NMF through Subspace Exploration · NIPS 2015
Finding Dense Subgraphs via Low-Rank Bilinear Optimization · ICML 2014
Nonnegative Sparse PCA with Provable Guarantees · ICML 2014
Sparse PCA through Low-rank Approximations · ICML 2013