Shiwei Liu

45 papers · 2020–2025 · 9 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (13) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🌍 Conference Polyglot (9)

🏃 Academic Marathon (5) 🔬 Deep Specialist (10) 🏆 Keyword Champion (2) 🤝 Dynamic Duo (18) 👑 Triple Crown 🏆 Grand Slam 🗃️ Keyword Collector (111) ❓ The Questioner (4) ⚡ Prolific Year (14) 📈 Trend Setter 💎 Century Club (45) 🔥 Unstoppable (6)

Conferences

ICML (13) ICLR (12) NIPS (10) AAAI (3) EMNLP (2) INTERSPEECH (2) ACL (1) ICCV (1) IJCAI (1)

Top co-authors

Lu Yin (18) Zhangyang Wang (17) Tianlong Chen (14) Mykola Pechenizkiy (14) Ajay Kumar Jaiswal (10) Decebal Constantin Mocanu (9) Li Shen (8) Tianjin Huang (8) Zhenyu Zhang (8) Qiao Xiao (5)

Keywords

model compression (11) neural network pruning (7) sparse neural network (4) network pruning (4) dynamic sparse training (3) lottery ticket hypothesis (3) neural network optimization (3) model pruning (3) large language model (3) dynamic sparsity (2) pruning at initialization (2) iterative magnitude pruning (2) inference acceleration (2) convolutional neural network (2) neural network sparsification (2) sparse training (2) sparse network (2) data augmentation (1) knowledge distillation (1) stochastic gradient descent (1)

Papers

SIDE: Socially Informed Drought Estimation Toward Understanding Societal Impact Dynamics of Environmental Crisis AAAI 2025
Outlier-weighed Layerwise Sampling for LLM Fine-tuning ACL 2025
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN ICLR 2025
Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More ICML 2025
Composable Interventions for Language Models ICLR 2025
From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications ICML 2025
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training ICLR 2025
LIFT the Veil for the Truth: Principal Weights Emerge after Rank Reduction for Reasoning-Focused Supervised Fine-Tuning ICML 2025
Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective AAAI 2025
AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models NIPS 2024
MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization INTERSPEECH 2024
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding NIPS 2024
E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D Medical Image Segmentation NIPS 2024
Dynamic Data Pruning for Automatic Speech Recognition INTERSPEECH 2024
AdaMerging: Adaptive Model Merging for Multi-Task Learning ICLR 2024
NeurRev: Train Better Sparse Neural Network Practically via Neuron Revitalization ICLR 2024
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs ICLR 2024
CaM: Cache Merging for Memory-efficient LLMs Inference ICML 2024
Advancing Dynamic Sparse Training by Exploring Optimization Opportunities ICML 2024
Sparse Cocktail: Every Sparse Pattern Every Sparse Ratio All At Once ICML 2024
Junk DNA Hypothesis: Pruning Small Pre-Trained Weights $\textit{Irreversibly}$ and $\textit{Monotonically}$ Impairs “Difficult” Downstream Tasks in LLMs ICML 2024
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity ICML 2024
FFN-SkipLLM: A Hidden Gem for Autoregressive Decoding with Adaptive Feed Forward Skipping EMNLP 2024
Is C4 Dataset Optimal for Pruning? An Investigation of Calibration Data for LLM Pruning EMNLP 2024
Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers ICLR 2023
The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter NIPS 2023
Predicting mutational effects on protein-protein binding via a side-chain diffusion probabilistic model NIPS 2023
Don’t just prune by magnitude! Your mask topology is a secret weapon NIPS 2023
Dynamic Sparsity Is Channel-Level Sparsity Learner NIPS 2023
Towards Data-Agnostic Pruning At Initialization: What Makes a Good Sparse Mask? NIPS 2023
Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference Cost AAAI 2023
Data Augmented Flatness-aware Gradient Projection for Continual Learning ICCV 2023
More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity ICLR 2023
Revisiting Pruning at Initialization Through the Lens of Ramanujan Graph ICLR 2023
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together! ICLR 2023
Are Large Kernels Better Teachers than Transformers for ConvNets? ICML 2023
Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication ICML 2023
Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models ICML 2023
Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity ICLR 2022
The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training ICLR 2022
Dynamic Sparse Network for Time Series Classification: Learning What to “See” NIPS 2022
Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training ICML 2021
Selfish Sparse RNN Training ICML 2021
Sparse Training via Boosting Pruning Plasticity with Neuroregeneration NIPS 2021
Learning Sparse Neural Networks for Better Generalization IJCAI 2020