Weijia Shi

43 papers · 2019–2026 · 10 top CS/AI conferences

Achievements

🌍 Conference Polyglot (10) 🏃 Academic Marathon (6) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird

🤝 Dynamic Duo (22) 👑 Triple Crown 🏆 Grand Slam 👥 Mega-Team (82) 🧬 Topic Evolution 💎 Century Club (42) 🚀 Conference Pioneer ⚡ Prolific Year (6) 📈 Trend Setter 🗃️ Keyword Collector (131) 🔥 Unstoppable (7) ❓ The Questioner

Conferences

ICLR (14) EMNLP (9) ACL (7) NIPS (4) IJCNLP (3) NAACL (2) AAAI (1) EACL (1) ICCV (1) ICML (1)

Top co-authors

Luke Zettlemoyer (22) Noah A. Smith (10) Wen-tau Yih (9) Muhao Chen (7) Yulia Tsvetkov (6) Mike Lewis (6) Sewon Min (6) Hannaneh Hajishirzi (5) Kai-Wei Chang (5) Shangbin Feng (4)

Research topics

Applications (1)

Keywords

large language model (10) zero-shot learning (5) language model (4) retrieval-augmented generation (3) knowledge base (3) question answering (3) information retrieval (3) sentence classification (2) distant supervision (2) abstractive summarization (2) entity linking (2) contrastive learning (2) hallucination mitigation (2) text generation (2) orthogonal transformation (2) instruction tuning (2) language modeling (2) knowledge graph embedding (2) bias mitigation (2) word embedding (2)

Papers

When One LLM Drools, Multi-LLM Collaboration Rules · ACL 2026
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval · ICLR 2025
OLMoE: Open Mixture-of-Experts Language Models · ICLR 2025
MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models · ICLR 2025
s1: Simple test-time scaling · EMNLP 2025
Fantastic Copyrighted Beasts and How (Not) to Generate Them · ICLR 2025
MMTEB: Massive Multilingual Text Embedding Benchmark · ICLR 2025
MUSE: Machine Unlearning Six-Way Evaluation for Language Models · ICLR 2025
Instruction-tuned Language Models are Better Knowledge Learners · ACL 2024
Don’t Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration · ACL 2024
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models · NIPS 2024
Trusting Your Evidence: Hallucinate Less with Context-aware Decoding · NAACL 2024
Teaching LLMs to Abstain across Languages via Multilingual Feedback · EMNLP 2024
REPLUG: Retrieval-Augmented Black-Box Language Models · NAACL 2024
RECOMP: Improving Retrieval-Augmented LMs with Context Compression and Selective Augmentation · ICLR 2024
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore · NIPS 2024
In-Context Pretraining: Language Modeling Beyond Document Boundaries · ICLR 2024
Detecting Pretraining Data from Large Language Models · ICLR 2024
RA-DIT: Retrieval-Augmented Dual Instruction Tuning · ICLR 2024
Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models · ICLR 2024
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore · ICLR 2024
Lemur: Harmonizing Natural Language and Code for Language Agents · ICLR 2024
Evaluating Copyright Takedown Methods for Language Models · NIPS 2024
PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3 · ICCV 2023
One Embedder, Any Task: Instruction-Finetuned Text Embeddings · ACL 2023
Nonparametric Masked Language Modeling · ACL 2023
RoMQA: A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering · EMNLP 2023
Getting MoRE out of Mixture of Language Model Reasoning Experts · EMNLP 2023
Toward Human Readable Prompt Tuning: Kubrick’s The Shining is a good movie, and a good prompt too? · EMNLP 2023
Fine-Grained Human Feedback Gives Better Rewards for Language Model Training · NIPS 2023
Selective Annotation Makes Language Models Better Few-Shot Learners · ICLR 2023
Retrieval-Augmented Multimodal Language Modeling · ICML 2023
Nearest Neighbor Zero-Shot Inference · EMNLP 2022
DESCGEN: A Distantly Supervised Dataset for Generating Entity Descriptions · IJCNLP 2021
DESCGEN: A Distantly Supervised Dataset for Generating Entity Descriptions · ACL 2021
Cross-lingual Entity Alignment with Incidental Supervision · EACL 2021
Design Challenges in Low-resource Cross-lingual Entity Linking · EMNLP 2020
Retrofitting Contextualized Word Embeddings with Paraphrases · IJCNLP 2019
Examining Gender Bias in Languages with Grammatical Gender · IJCNLP 2019
Embedding Uncertain Knowledge Graphs · AAAI 2019
Retrofitting Contextualized Word Embeddings with Paraphrases · EMNLP 2019
Learning Bilingual Word Embeddings Using Lexical Definitions · ACL 2019
Examining Gender Bias in Languages with Grammatical Gender · EMNLP 2019