Mohammad Shoeybi

34 papers · 2017–2026 · 9 conferences · across top CS/AI conferences

Achievements


🏃 Academic Marathon (8) 🐝 Cross-Pollinator (11) 🌍 Conference Polyglot (9) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (7)

🗺️ Taxonomy Completionist (58) 🧭 Keyword Pioneer 🤝 Dynamic Duo (29) 🔬 Deep Specialist (12) 🏆 Keyword Champion (3) 🧬 Topic Evolution 👑 Triple Crown 🔥 Unstoppable (7) 🗃️ Keyword Collector (118) ❓ The Questioner ⚡ Prolific Year (9) 📈 Trend Setter 💎 Century Club (33)

Conferences

EMNLP (9) NIPS (6) ACL (5) ICLR (5) EACL (3) ICML (3) CVPR (1) ICCV (1) IJCNLP (1)

Top co-authors

Bryan Catanzaro (30) Wei Ping (18) Mostofa Patwary (18) Peng Xu (10) Shrimai Prabhumoye (8) Zihan Liu (8) Anima Anandkumar (6) Raul Puri (4) Bo Li (4) Boxin Wang (4)

Keywords

large language model (9) language model (8) instruction tuning (5) text generation (5) retrieval-augmented generation (4) question answering (4) domain adaptation (3) toxicity reduction (3) pretraining dataset (2) benchmark evaluation (2) open-domain question answering (2) knowledge distillation (2) factual accuracy (2) reward modeling (2) unsupervised pretraining (2) end-to-end training (2) language modeling (2) dialogue generation (2) in-context learning (2) synthetic data (2)

Papers

Nemotron-CrossThink: Scaling Self-Learning beyond Math Reasoning · EACL 2026
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs · ICLR 2025
MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs · ICLR 2025
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models · ICLR 2025
Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset · ACL 2025
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling · ACL 2025
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities · ICLR 2025
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining · ICML 2024
ChatQA: Surpassing GPT-4 on Conversational QA and RAG · NIPS 2024
LLM-Evolve: Evaluation for LLM’s Evolving Capability on Benchmarks · EMNLP 2024
ODIN: Disentangled Reward Mitigates Hacking in RLHF · ICML 2024
VILA: On Pre-training for Visual Language Models · CVPR 2024
Data, Data Everywhere: A Guide for Pretraining Dataset Construction · EMNLP 2024
Retrieval meets Long Context Large Language Models · ICLR 2024
RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs · NIPS 2024
Compact Language Models via Pruning and Knowledge Distillation · NIPS 2024
Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study · EMNLP 2023
Adding Instructions during Pretraining: Effective way of Controlling Toxicity in Language Models · EACL 2023
Context Generation Improves Open Domain Question Answering · EACL 2023
Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning · EMNLP 2023
Evaluating Parameter Efficient Learning for Generation · EMNLP 2022
Factuality Enhanced Language Models for Open-Ended Text Generation · NIPS 2022
Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models · NIPS 2022
Multi-Stage Prompting for Knowledgeable Dialogue Generation · ACL 2022
Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models · EMNLP 2022
Long-Short Transformer: Efficient Transformers for Language and Vision · NIPS 2021
End-to-End Training of Neural Retrievers for Open-Domain Question Answering · IJCNLP 2021
End-to-End Training of Neural Retrievers for Open-Domain Question Answering · ACL 2021
MEGATRON-CNTRL: Controllable Story Generation with External Knowledge Using Large-Scale Language Models · EMNLP 2020
BioMegatron: Larger Biomedical Domain Language Model · EMNLP 2020
Training Question Answering Models From Synthetic Data · EMNLP 2020
Large Scale Multi-Actor Generative Dialog Modeling · ACL 2020
Unsupervised Video Interpolation Using Cycle Consistency · ICCV 2019
Deep Voice: Real-time Neural Text-to-Speech · ICML 2017