Shiwei Liu
45 papers · 2020–2025 · 9 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (13)
Keyword Pioneer
Interdisciplinary Bridge
Renaissance Researcher (5)
Conference Polyglot (9)
Academic Marathon (5)
Deep Specialist (10)
Keyword Champion (2)
Dynamic Duo (18)
Triple Crown
Grand Slam
Keyword Collector (111)
The Questioner (4)
Prolific Year (14)
Trend Setter
Century Club (45)
Unstoppable (6)
Conferences
ICML (13)
ICLR (12)
NIPS (10)
AAAI (3)
EMNLP (2)
INTERSPEECH (2)
ACL (1)
ICCV (1)
IJCAI (1)
Keywords
model compression (11)
neural network pruning (7)
sparse neural network (4)
network pruning (4)
dynamic sparse training (3)
lottery ticket hypothesis (3)
neural network optimization (3)
model pruning (3)
large language model (3)
dynamic sparsity (2)
pruning at initialization (2)
iterative magnitude pruning (2)
inference acceleration (2)
convolutional neural network (2)
neural network sparsification (2)
sparse training (2)
sparse network (2)
data augmentation (1)
knowledge distillation (1)
stochastic gradient descent (1)
Papers
SIDE: Socially Informed Drought Estimation Toward Understanding Societal Impact Dynamics of Environmental Crisis
AAAI 2025
Outlier-weighed Layerwise Sampling for LLM Fine-tuning
ACL 2025
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
ICLR 2025
Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More
ICML 2025
Composable Interventions for Language Models
ICLR 2025
From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications
ICML 2025
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
ICLR 2025
LIFT the Veil for the Truth: Principal Weights Emerge after Rank Reduction for Reasoning-Focused Supervised Fine-Tuning
ICML 2025
Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective
AAAI 2025
AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models
NIPS 2024
MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization
INTERSPEECH 2024
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding
NIPS 2024
E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D Medical Image Segmentation
NIPS 2024
Dynamic Data Pruning for Automatic Speech Recognition
INTERSPEECH 2024
AdaMerging: Adaptive Model Merging for Multi-Task Learning
ICLR 2024
NeurRev: Train Better Sparse Neural Network Practically via Neuron Revitalization
ICLR 2024
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
ICLR 2024
CaM: Cache Merging for Memory-efficient LLMs Inference
ICML 2024
Advancing Dynamic Sparse Training by Exploring Optimization Opportunities
ICML 2024
Sparse Cocktail: Every Sparse Pattern Every Sparse Ratio All At Once
ICML 2024
Junk DNA Hypothesis: Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs
ICML 2024
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
ICML 2024
FFN-SkipLLM: A Hidden Gem for Autoregressive Decoding with Adaptive Feed Forward Skipping
EMNLP 2024
Is C4 Dataset Optimal for Pruning? An Investigation of Calibration Data for LLM Pruning
EMNLP 2024
Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers
ICLR 2023
The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter
NIPS 2023
Predicting mutational effects on protein-protein binding via a side-chain diffusion probabilistic model
NIPS 2023
Don't just prune by magnitude! Your mask topology is a secret weapon
NIPS 2023
Dynamic Sparsity Is Channel-Level Sparsity Learner
NIPS 2023
Towards Data-Agnostic Pruning At Initialization: What Makes a Good Sparse Mask?
NIPS 2023
Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference Cost
AAAI 2023
Data Augmented Flatness-aware Gradient Projection for Continual Learning
ICCV 2023
More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity
ICLR 2023
Revisiting Pruning at Initialization through the Lens of Ramanujan Graph
ICLR 2023
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
ICLR 2023
Are Large Kernels Better Teachers than Transformers for ConvNets?
ICML 2023
Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication
ICML 2023
Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models
ICML 2023
Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity
ICLR 2022
The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training
ICLR 2022
Dynamic Sparse Network for Time Series Classification: Learning What to "See"
NIPS 2022
Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training
ICML 2021
Selfish Sparse RNN Training
ICML 2021
Sparse Training via Boosting Pruning Plasticity with Neuroregeneration
NIPS 2021
Learning Sparse Neural Networks for Better Generalization
IJCAI 2020