conftrace_

Shuming Ma

57 papers · 2017–2025 · 11 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌍 Conference Polyglot (11) 🗺️ Taxonomy Completionist (11) 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (8) 🏠 Conference Loyalist (20) 🤝 Dynamic Duo (29) 👑 Triple Crown 🏆 Grand Slam 🔬 Deep Specialist (19) 🧬 Topic Evolution 🔥 Unstoppable (9) 🚀 Conference Pioneer ⚡ Prolific Year (12) ❓ The Questioner (5) 🗃️ Keyword Collector (229) 💎 Century Club (57) 📈 Trend Setter

Conferences

ACL (20) EMNLP (10) COLING (5) NIPS (5) IJCAI (4) AAAI (3) NAACL (3) ICLR (2) ICML (2) IJCNLP (2) JMLR (1)

Top co-authors

Furu Wei (29) Dongdong Zhang (24) Xu Sun (17) Li Dong (16) Shaohan Huang (13) Jian Yang (12) Zhoujun Li (10) Houfeng Wang (8) Xia Song (7) Haoyang Huang (7)

Research topics

Applications (1)

Keywords

neural machine translation (11) cross-lingual transfer (7) language model (6) multilingual neural machine translation (6) sequence-to-sequence model (5) machine translation (5) abstractive summarization (5) zero-shot learning (5) multilingual model (4) large language model (4) knowledge distillation (4) text summarization (4) multimodal learning (3) document-level translation (3) cross-lingual language model (3) mixture of experts (3) multilingual translation (3) model compression (3) multi-label classification (3) transformer architecture (3)

Papers

BitNet: 1-bit Pre-training for Large Language Models · JMLR 2025
Bitnet.cpp: Efficient Edge Inference for Ternary LLMs · ACL 2025
You Only Cache Once: Decoder-Decoder Architectures for Language Models · NIPS 2024
Multi-Head Mixture-of-Experts · NIPS 2024
Grounding Multimodal Large Language Models to the World · ICLR 2024
On the Off-Target Problem of Zero-Shot Multilingual Neural Machine Translation · ACL 2023
Are More Layers Beneficial to Graph Transformers? · ICLR 2023
TRIP: Accelerating Document-level Multilingual Pre-training via Triangular Document-level Pre-training on Parallel Data Triplets · EMNLP 2023
Magneto: A Foundation Transformer · ICML 2023
On the Pareto Front of Multilingual Neural Machine Translation · NIPS 2023
Language Is Not All You Need: Aligning Perception with Language Models · NIPS 2023
Discourse-Centric Evaluation of Document-level Machine Translation with a New Densely Annotated Parallel Corpus of Novels · ACL 2023
GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator · ACL 2023
A Length-Extrapolatable Transformer · ACL 2023
Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers · ACL 2023
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA · ACL 2022
StableMoE: Stable Routing Strategy for Mixture of Experts · ACL 2022
CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation · EMNLP 2022
On the Representation Collapse of Sparse Mixture of Experts · NIPS 2022
Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt · EMNLP 2022
PAEG: Phrase-level Adversarial Example Generation for Neural Machine Translation · COLING 2022
BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation · NAACL 2022
High-resource Language-specific Training for Multilingual Neural Machine Translation · IJCAI 2022
UM4: Unified Multilingual Multiple Teacher-Student Model for Zero-Resource Neural Machine Translation · IJCAI 2022
A Unified Strategy for Multilingual Grammatical Error Correction with Pre-trained Cross-Lingual Language Model · IJCAI 2022
Towards Making the Most of Cross-Lingual Transfer for Zero-Shot Neural Machine Translation · ACL 2022
Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task · EMNLP 2021
Zero-Shot Cross-Lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders · EMNLP 2021
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs · EMNLP 2021
Smart-Start Decoding for Neural Machine Translation · NAACL 2021
How Does Distilled Data Complexity Impact the Quality and Confidence of Non-Autoregressive Machine Translation? · IJCNLP 2021
Multilingual Agreement for Multilingual Neural Machine Translation · IJCNLP 2021
Multilingual Agreement for Multilingual Neural Machine Translation · ACL 2021
How Does Distilled Data Complexity Impact the Quality and Confidence of Non-Autoregressive Machine Translation? · ACL 2021
Improving Multilingual Neural Machine Translation with Auxiliary Source Languages · EMNLP 2021
Alternating Language Modeling for Cross-Lingual Pre-Training · AAAI 2020
A Simple and Effective Unified Encoder for Document-Level Machine Translation · ACL 2020
Improving Neural Machine Translation with Soft Template Prediction · ACL 2020
Group, Extract and Aggregate: Summarizing a Large Amount of Finance News for Forex Movement Prediction · EMNLP 2019
A Deep Reinforced Sequence-to-Set Model for Multi-Label Classification · ACL 2019
Key Fact as Pivot: A Two-Stage Model for Low Resource Table-to-Text Generation · ACL 2019
LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts · AAAI 2019
Hierarchical Encoder with Auxiliary Supervision for Neural Table-to-Text Generation: Learning Better Representation for Tables · AAAI 2019
Deconvolution-Based Global Decoding for Neural Machine Translation · COLING 2018
Does Higher Order LSTM Have Better Accuracy for Segmenting and Labeling Sequence Data? · COLING 2018
SGM: Sequence Generation Model for Multi-label Classification · COLING 2018
A Neural Question Answering Model Based on Semi-Structured Tables · COLING 2018
A Hierarchical End-to-End Model for Jointly Improving Text Summarization and Sentiment Classification · IJCAI 2018
Query and Output: Generating Words by Querying Distributed Word Representations for Paraphrase Generation · NAACL 2018
Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization · ACL 2018
Automatic Academic Paper Rating Based on Modularized Hierarchical Convolutional Neural Network · ACL 2018
Bag-of-Words as Target for Neural Machine Translation · ACL 2018
Global Encoding for Abstractive Summarization · ACL 2018
Semantic-Unit-Based Dilated Convolution for Multi-Label Text Classification · EMNLP 2018
Phrase-level Self-Attention Networks for Universal Sentence Encoding · EMNLP 2018
meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting · ICML 2017
Improving Semantic Relevance for Sequence-to-Sequence Learning of Chinese Social Media Text Summarization · ACL 2017