Shuming Ma
57 papers · 2017–2025 · 11 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Conference Polyglot (11)
Taxonomy Completionist (11)
Interdisciplinary Bridge
Academic Marathon (8)
Conference Loyalist (20)
Dynamic Duo (29)
Triple Crown
Grand Slam
Deep Specialist (19)
Topic Evolution
Unstoppable (9)
Conference Pioneer
Prolific Year (12)
The Questioner (5)
Keyword Collector (229)
Century Club (57)
Trend Setter
Conferences
ACL (20)
EMNLP (10)
COLING (5)
NIPS (5)
IJCAI (4)
AAAI (3)
NAACL (3)
ICLR (2)
ICML (2)
IJCNLP (2)
JMLR (1)
Research topics
Keywords
neural machine translation (11)
cross-lingual transfer (7)
language model (6)
multilingual neural machine translation (6)
sequence-to-sequence model (5)
machine translation (5)
abstractive summarization (5)
zero-shot learning (5)
multilingual model (4)
large language model (4)
knowledge distillation (4)
text summarization (4)
multimodal learning (3)
document-level translation (3)
cross-lingual language model (3)
mixture of expert (3)
multilingual translation (3)
model compression (3)
multi-label classification (3)
transformer architecture (3)
Papers
BitNet: 1-bit Pre-training for Large Language Models
JMLR 2025
Bitnet.cpp: Efficient Edge Inference for Ternary LLMs
ACL 2025
You Only Cache Once: Decoder-Decoder Architectures for Language Models
NIPS 2024
Multi-Head Mixture-of-Experts
NIPS 2024
Grounding Multimodal Large Language Models to the World
ICLR 2024
On the Off-Target Problem of Zero-Shot Multilingual Neural Machine Translation
ACL 2023
Are More Layers Beneficial to Graph Transformers?
ICLR 2023
TRIP: Accelerating Document-level Multilingual Pre-training via Triangular Document-level Pre-training on Parallel Data Triplets
EMNLP 2023
Magneto: A Foundation Transformer
ICML 2023
On the Pareto Front of Multilingual Neural Machine Translation
NIPS 2023
Language Is Not All You Need: Aligning Perception with Language Models
NIPS 2023
Discourse-Centric Evaluation of Document-level Machine Translation with a New Densely Annotated Parallel Corpus of Novels
ACL 2023
GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator
ACL 2023
A Length-Extrapolatable Transformer
ACL 2023
Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers
ACL 2023
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
ACL 2022
StableMoE: Stable Routing Strategy for Mixture of Experts
ACL 2022
CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation
EMNLP 2022
On the Representation Collapse of Sparse Mixture of Experts
NIPS 2022
Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt
EMNLP 2022
PAEG: Phrase-level Adversarial Example Generation for Neural Machine Translation
COLING 2022
BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation
NAACL 2022
High-resource Language-specific Training for Multilingual Neural Machine Translation
IJCAI 2022
UM4: Unified Multilingual Multiple Teacher-Student Model for Zero-Resource Neural Machine Translation
IJCAI 2022
A Unified Strategy for Multilingual Grammatical Error Correction with Pre-trained Cross-Lingual Language Model
IJCAI 2022
Towards Making the Most of Cross-Lingual Transfer for Zero-Shot Neural Machine Translation
ACL 2022
Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task
EMNLP 2021
Zero-Shot Cross-Lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders
EMNLP 2021
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs
EMNLP 2021
Smart-Start Decoding for Neural Machine Translation
NAACL 2021
How Does Distilled Data Complexity Impact the Quality and Confidence of Non-Autoregressive Machine Translation?
IJCNLP 2021
Multilingual Agreement for Multilingual Neural Machine Translation
IJCNLP 2021
Multilingual Agreement for Multilingual Neural Machine Translation
ACL 2021
How Does Distilled Data Complexity Impact the Quality and Confidence of Non-Autoregressive Machine Translation?
ACL 2021
Improving Multilingual Neural Machine Translation with Auxiliary Source Languages
EMNLP 2021
Alternating Language Modeling for Cross-Lingual Pre-Training
AAAI 2020
A Simple and Effective Unified Encoder for Document-Level Machine Translation
ACL 2020
Improving Neural Machine Translation with Soft Template Prediction
ACL 2020
Group, Extract and Aggregate: Summarizing a Large Amount of Finance News for Forex Movement Prediction
EMNLP 2019
A Deep Reinforced Sequence-to-Set Model for Multi-Label Classification
ACL 2019
Key Fact as Pivot: A Two-Stage Model for Low Resource Table-to-Text Generation
ACL 2019
LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts
AAAI 2019
Hierarchical Encoder with Auxiliary Supervision for Neural Table-to-Text Generation: Learning Better Representation for Tables
AAAI 2019
Deconvolution-Based Global Decoding for Neural Machine Translation
COLING 2018
Does Higher Order LSTM Have Better Accuracy for Segmenting and Labeling Sequence Data?
COLING 2018
SGM: Sequence Generation Model for Multi-label Classification
COLING 2018
A Neural Question Answering Model Based on Semi-Structured Tables
COLING 2018
A Hierarchical End-to-End Model for Jointly Improving Text Summarization and Sentiment Classification
IJCAI 2018
Query and Output: Generating Words by Querying Distributed Word Representations for Paraphrase Generation
NAACL 2018
Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization
ACL 2018
Automatic Academic Paper Rating Based on Modularized Hierarchical Convolutional Neural Network
ACL 2018
Bag-of-Words as Target for Neural Machine Translation
ACL 2018
Global Encoding for Abstractive Summarization
ACL 2018
Semantic-Unit-Based Dilated Convolution for Multi-Label Text Classification
EMNLP 2018
Phrase-level Self-Attention Networks for Universal Sentence Encoding
EMNLP 2018
meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting
ICML 2017
Improving Semantic Relevance for Sequence-to-Sequence Learning of Chinese Social Media Text Summarization
ACL 2017