Papers
RelEdit: Evaluating Conceptual Knowledge Editing in Language Models via Relational Reasoning
Yifan Niu, Miao Peng, Nuo Chen et al.
Relevance Scores Calibration for Ranked List Truncation via TMP Adapter
Pavel Posokhov, Sergei Masliukhin, Skrylnikov Stepan et al.
Relevant or Random: Can LLMs Truly Perform Analogical Reasoning?
Chengwei Qin, Wenhan Xia, Tan Wang et al.
Reliably Bounding False Positives: A Zero-Shot Machine-Generated Text Detection Framework via Multiscaled Conformal Prediction
Xiaowei Zhu, Yubing Ren, Yanan Cao et al.
RemoteRAG: A Privacy-Preserving LLM Cloud RAG Service
Yihang Cheng, Lan Zhang, Junyang Wang et al.
Removal of Hallucination on Hallucination: Debate-Augmented RAG
Wentao Hu, Wengyu Zhang, Yiyang Jiang et al.
Removing Prompt-template Bias in Reinforcement Learning from Human Feedback
Chaojie Wang, Haonan Shi, Long Tian et al.
RePanda: Pandas-powered Tabular Verification and Reasoning
Atoosa Chegini, Keivan Rezaei, Hamid Eghbalzadeh et al.
REPA: Russian Error Types Annotation for Evaluating Text Generation and Judgment Capabilities
Alexander Pugachev, Alena Fenogenova, Vladislav Mikhailov et al.
Representation Bending for Large Language Model Safety
Ashkan Yousefpour, Taeheon Kim, Ryan Sungmo Kwon et al.
Representations of Fact, Fiction and Forecast in Large Language Models: Epistemics and Attitudes
Meng Li, Michael Vrazitulis, David Schlangen
REPRO-Bench: Can Agentic AI Systems Assess the Reproducibility of Social Science Research?
Chuxuan Hu, Liyun Zhang, Yeji Lim et al.
Reproducing the Argument Quality Prediction of Project Debater
Ines Zelch, Matthias Hagen, Benno Stein et al.
ReproHum #0033-05: Human Evaluation of Factuality from A Multidisciplinary Perspective
Andra-Maria Florescu, Marius Micluța-Câmpeanu, Stefana Arina Tabusca et al.
ReproHum #0067-01: A Reproduction of the Evaluation of Cross-Lingual Summarization
Supryadi, Chuang Liu, Deyi Xiong
ReproHum #0669-08: Reproducing Sentiment Transfer Evaluation
Kristýna Onderková, Mateusz Lango, Patrícia Schmidtová et al.
ReproHum #0729-04: Partial reproduction of the human evaluation of the MemSum and NeuSum summarisation systems
Simon Mille, Michela Lorandi
ReproHum #0744-02: Investigating the Reproducibility of Semantic Preservation Human Evaluations
Mohammad Arvan, Natalie Parde
Reranking-based Generation for Unbiased Perspective Summarization
Narutatsu Ri, Nicholas Deas, Kathleen McKeown
Re-ranking Using Large Language Models for Mitigating Exposure to Harmful Content on Social Media Platforms
Rajvardhan Oak, Muhammad Haroon, Claire Wonjeong Jo et al.
ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision
Dosung Lee, Wonjun Oh, Boyoung Kim et al.
Research Borderlands: Analysing Writing Across Research Cultures
Shaily Bhatt, Tal August, Maria Antoniak