Weizhu Chen
100 papers · 2014–2026 · 11 top CS/AI conferences
Achievements
Interdisciplinary Bridge
Taxonomy Completionist (11)
Keyword Pioneer
Conference Polyglot (11)
Academic Marathon (11)
Conference Loyalist (20)
Dynamic Duo (35)
Triple Crown
Grand Slam
Topic Pioneer
Deep Specialist (24)
Topic Evolution
Keyword Champion (2)
Trend Setter
The Questioner (2)
Keyword Collector (297)
Century Club (98)
Unstoppable (8)
Conference Pioneer
Prolific Year (25)
Conferences
ACL (24)
ICLR (20)
EMNLP (17)
ICML (9)
NIPS (9)
IJCNLP (8)
NAACL (7)
AAAI (3)
ECCV (1)
IJCAI (1)
JMLR (1)
Keywords
large language model (13)
model compression (10)
pre-trained language model (10)
language model (10)
few-shot learning (9)
text generation (7)
question answering (6)
natural language understanding (5)
open-domain question answering (4)
zero-shot learning (4)
knowledge distillation (4)
code generation (4)
retrieval-augmented generation (4)
in-context learning (3)
contrastive learning (3)
multi-task learning (3)
lottery ticket hypothesis (3)
low-rank adaptation (3)
neural network pruning (3)
adversarial training (3)
Papers
Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability
ACL 2026
Gold-Medal-Level Olympiad Geometry Solving with Efficient Heuristic Auxiliary Constructions
ACL 2026
MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning
AAAI 2025
LongRoPE2: Near-Lossless LLM Context Window Scaling
ICML 2025
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
ICLR 2025
Key-Point-Driven Data Synthesis with Its Enhancement on Mathematical Reasoning
AAAI 2025
Make Your LLM Fully Utilize the Context
NIPS 2024
WizardArena: Post-training Large Language Models via Simulated Offline Chatbot Arena
NIPS 2024
Language Models can be Deductive Solvers
NAACL 2024
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models
NAACL 2024
AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators
NAACL 2024
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving
ICLR 2024
CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing
ICLR 2024
Supervised Knowledge Makes Large Language Models Better In-context Learners
ICLR 2024
LoftQ: LoRA-Fine-Tuning-aware Quantization for Large Language Models
ICLR 2024
Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective
ICLR 2024
Can LLMs Learn From Mistakes? An Empirical Study on Reasoning Tasks
EMNLP 2024
Automatic Instruction Evolving for Large Language Models
EMNLP 2024
Competition-Level Problems are Effective LLM Evaluators
ACL 2024
Not All Tokens Are What You Need for Pretraining
NIPS 2024
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
ICML 2023
Meet in the Middle: A New Pre-training Paradigm
NIPS 2023
In-Context Learning Unlocked for Diffusion Models
NIPS 2023
AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation
NIPS 2023
Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models
NIPS 2023
Making Language Models Better Reasoners with Step-Aware Verifier
ACL 2023
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models
ACL 2023
Code Execution with Pre-trained Language Models
ACL 2023
Joint Generator-Ranker Learning for Natural Language Generation
ACL 2023
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation
EMNLP 2023
Skill-Based Few-Shot Selection for In-Context Learning
EMNLP 2023
Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy
EMNLP 2023
Diffusion-GAN: Training GANs with Diffusion
ICLR 2023
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
ICLR 2023
Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning
ICLR 2023
CodeT: Code Generation with Generated Tests
ICLR 2023
Truncated Diffusion Probabilistic Models and Diffusion-based Adversarial Auto-Encoders
ICLR 2023
Less is More: Task-aware Layer-wise Distillation for Language Model Compression
ICML 2023
Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise
ICML 2023
HyperTuning: Toward Adapting Large Language Models without Back-propagation
ICML 2023
Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models
ICML 2023
OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering
NAACL 2022
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation
NAACL 2022
ALLSH: Active Learning Guided by Local Sensitivity and Hardness
NAACL 2022
Finding the Dominant Winning Ticket in Pre-Trained Language Models
ACL 2022
Scalable Learning to Optimize: A Learned Optimizer Can Train Big Models
ECCV 2022
DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation
ACL 2022
A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models
ACL 2022
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
ACL 2022
What Makes Good In-Context Examples for GPT-3?
ACL 2022
Soft-Labeled Contrastive Pre-Training for Function-Level Code Representation
EMNLP 2022
Reasoning Like Program Executors
EMNLP 2022
CodeRetriever: A Large Scale Contrastive Pre-Training Method for Code Search
EMNLP 2022
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance
ICML 2022
CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing
ACL 2022
CERT: Continual Pre-training on Sketches for Library-oriented Code Generation
IJCAI 2022
XLM-K: Improving Cross-Lingual Language Model Pre-training with Multilingual Knowledge
AAAI 2022
Adversarial Retriever-Ranker for Dense Text Retrieval
ICLR 2022
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models
ICLR 2022
LoRA: Low-Rank Adaptation of Large Language Models
ICLR 2022
TAPEX: Table Pre-training via Learning a Neural SQL Executor
ICLR 2022
Controllable Natural Language Generation with Contrastive Prefixes
ACL 2022
HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalizability
ACL 2021
Token-wise Curriculum Learning for Neural Machine Translation
EMNLP 2021
ARCH: Efficient Adversarial Regularized Training with Caching
EMNLP 2021
GLGE: A New General Language Generation Evaluation Benchmark
IJCNLP 2021
Memory-Efficient Differentiable Transformer Architecture Search
IJCNLP 2021
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
ICLR 2021
MixKD: Towards Efficient Distillation of Large-scale Language Models
ICLR 2021
CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding
ICLR 2021
Memory-Efficient Differentiable Transformer Architecture Search
ACL 2021
GLGE: A New General Language Generation Evaluation Benchmark
ACL 2021
Reader-Guided Passage Reranking for Open-Domain Question Answering
ACL 2021
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization
ACL 2021
Finetuning Pretrained Transformers into RNNs
EMNLP 2021
Generation-Augmented Retrieval for Open-Domain Question Answering
ACL 2021
BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining
ICML 2021
Poolingformer: Long Document Modeling with Pooling Attention
ICML 2021
Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer
NIPS 2021
UnitedQA: A Hybrid Approach for Open Domain Question Answering
ACL 2021
UnitedQA: A Hybrid Approach for Open Domain Question Answering
IJCNLP 2021
Generation-Augmented Retrieval for Open-Domain Question Answering
IJCNLP 2021
HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalizability
IJCNLP 2021
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization
IJCNLP 2021
Reader-Guided Passage Reranking for Open-Domain Question Answering
IJCNLP 2021
Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach
EMNLP 2021
Few-Shot Named Entity Recognition: An Empirical Baseline Study
EMNLP 2021
On the Variance of the Adaptive Learning Rate and Beyond
ICLR 2020
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
ACL 2020
Understanding the Difficulty of Training Transformers
EMNLP 2020
Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning
EMNLP 2020
The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
ACL 2020
DSCOVR: Randomized Primal-Dual Block Coordinate Algorithms for Asynchronous Distributed Optimization
JMLR 2019
Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Open-domain Question Answering
NAACL 2019
Parameter-free Sentence Embedding via Orthogonal Basis
EMNLP 2019
A Hybrid Neural Network Model for Commonsense Reasoning
EMNLP 2019
Multi-Task Deep Neural Networks for Natural Language Understanding
ACL 2019
Parameter-free Sentence Embedding via Orthogonal Basis
IJCNLP 2019
FusionNet: Fusing via Fully-aware Attention with Application to Machine Comprehension
ICLR 2018
Large-scale L-BFGS using MapReduce
NIPS 2014