Chenyan Xiong

55 papers · 2013–2026 · 10 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🏃 Academic Marathon (12) 🌈 Renaissance Researcher (8) 🌍 Conference Polyglot (10) 🗺️ Taxonomy Completionist (73)

🏃 Academic Marathon (12) 🧭 Keyword Pioneer 🌈 Renaissance Researcher (8) 🔬 Deep Specialist (12) 🏆 Grand Slam 🤝 Dynamic Duo (18) 📈 Trend Setter 🔥 Unstoppable (8) 🗃️ Keyword Collector (198) 💎 Century Club (52) ⚡ Prolific Year (5)

Conferences

EMNLP (16) ACL (15) ICLR (7) NAACL (5) EACL (3) NIPS (3) AAAI (2) IJCNLP (2) COLING (1) ICML (1)

Top co-authors

Zhiyuan Liu (19) Zhenghao Liu (16) Arnold Overwijk (7) Paul Bennett (7) Ge Yu (6) Shi Yu (6) Si Sun (5) Maosong Sun (5) Xia Song (4) Jiawei Han (4)

Keywords

dense retrieval (11) information retrieval (9) contrastive learning (7) language model (7) zero-shot learning (4) domain adaptation (4) knowledge graph (4) neural information retrieval (3) neural network (3) language model pretraining (3) representation learning (3) weak supervision (3) large language model (3) retrieval-augmented generation (2) visual feature (2) data selection (2) zero-shot generalization (2) retrieval augmentation (2) document ranking (2) language modeling (2)

Papers

ThinkNote: Enhancing Knowledge Integration and Utilization of Large Language Models via Constructivist Cognition Modeling · EACL 2026
Efficient Multi-Agent System Training with Data Influence-Oriented Tree Search · ACL 2026
Linking Knowledge to Care: Knowledge Graph-Augmented Medical Follow-Up Question Generation · EACL 2026
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning · ICLR 2025
ResearchArena: Benchmarking Large Language Models’ Ability to Collect and Organize Information as Research Agents · EMNLP 2025
On the Feasibility of In-Context Probing for Data Attribution · NAACL 2025
Interpret and Control Dense Retrieval with Sparse Latent Features · NAACL 2025
Craw4LLM: Efficient Web Crawling for LLM Pretraining · ACL 2025
Understand User Opinions of Large Language Models via LLM-Powered In-the-Moment User Experience Interviews · ACL 2025
Fact-Aware Multimodal Retrieval Augmentation for Accurate Medical Radiology Report Generation · NAACL 2025
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards · ICLR 2025
Harnessing Webpage UIs for Text-Rich Visual Understanding · ICLR 2025
MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module Plugin · ACL 2024
Fusion-in-T5: Unifying Variant Signals for Simple and Effective Document Ranking with Attention Fusion · COLING 2024
MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models · NIPS 2024
Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval · ACL 2024
Cleaner Pretraining Corpus Curation with Neural Web Scraping · ACL 2024
ED-Copilot: Reduce Emergency Department Wait Time with Language Model Diagnostic Assistance · ICML 2024
RAGViz: Diagnose and Visualize Retrieval-Augmented Generation · EMNLP 2024
Toolink: Linking Toolkit Creation and Using through Chain-of-Solving on Open-Source Model · NAACL 2024
Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In · ACL 2023
Structure-Aware Language Model Pretraining Improves Dense Retrieval on Structured Data · ACL 2023
Augmenting Zero-Shot Dense Retrievers with Plug-in Mixture-of-Memories · EMNLP 2023
Universal Vision-Language Dense Retrieval: Learning A Unified Representation Space for Multi-Modal Retrieval · ICLR 2023
CompleQA: Benchmarking the Impacts of Knowledge Graph Completion Methods on Question Answering · EMNLP 2023
Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers · ACL 2023
COCO-DR: Combating the Distribution Shift in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning · EMNLP 2022
Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Negatives · EMNLP 2022
Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators · ICLR 2022
Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representations · ACL 2022
Dimension Reduction for Efficient Dense Retrieval via Conditional Autoencoder · EMNLP 2022
Less is More: Pretrain a Strong Siamese Encoder for Dense Text Retrieval Using a Weak Decoder · EMNLP 2021
COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining · NIPS 2021
Data Augmentation for Abstractive Query-Focused Multi-Document Summarization · AAAI 2021
Few-Shot Text Ranking with Meta Adapted Synthetic Weak Supervision · ACL 2021
Contrastive Multi-document Question Generation · EACL 2021
Distantly-Supervised Dense Retrieval Enables Open-Domain Question Answering without Evidence Annotation · EMNLP 2021
TIAGE: A Benchmark for Topic-Shift Aware Dialog Modeling · EMNLP 2021
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval · ICLR 2021
Few-Shot Text Ranking with Meta Adapted Synthetic Weak Supervision · IJCNLP 2021
Multi-Step Reasoning Over Unstructured Text with Beam Dense Retrieval · NAACL 2021
Long Document Ranking with Query-Directed Sparse Transformer · EMNLP 2020
Adapting Open Domain Fact Extraction and Verification to COVID-FACT through In-Domain Language Modeling · EMNLP 2020
Text Classification Using Label Names Only: A Language Model Self-Training Approach · EMNLP 2020
Latent Relation Language Models · AAAI 2020
Fine-grained Fact Verification with Kernel Graph Attention Network · ACL 2020
Grounded Conversation Generation as Guided Traverses in Commonsense Knowledge Graphs · ACL 2020
Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention · ICLR 2020
Towards Interpretable Natural Language Understanding with Explanations as Latent Variables · NIPS 2020
Open Domain Web Keyphrase Extraction Beyond Language Modeling · EMNLP 2019
Open Domain Web Keyphrase Extraction Beyond Language Modeling · IJCNLP 2019
Target-Guided Open-Domain Conversation · ACL 2019
Entity-Duet Neural Ranking: Understanding the Role of Knowledge Graph Semantics in Neural Information Retrieval · ACL 2018
Automatic Event Salience Identification · EMNLP 2018
Automatic Domain Partitioning for Multi-Domain Learning · EMNLP 2013