Papers
5,479 papers found
On scalable oversight with weak LLMs judging strong LLMs
Zachary Kenton, Noah Y. Siegel, János Kramár et al.
LLM Attributor: Interactive Visual Attribution for LLM Generation
Seongmin Lee, Zijie J. Wang, Aishwarya Chakravarthy et al.
Can LLMs Learn from Previous Mistakes? Investigating LLMs’ Errors to Boost for Reasoning
Yongqi Tong, Dawei Li, Sizhe Wang et al.
Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs
Siyuan Wang, Zhongyu Wei, Yejin Choi et al.
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
Yi Zeng, Hongpeng Lin, Jingwen Zhang et al.
Don’t Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration
Shangbin Feng, Weijia Shi, Yike Wang et al.
Can LLMs substitute SQL? Comparing Resource Utilization of Querying LLMs versus Traditional Relational Databases
Xiang Zhang, Khatoon Khedri, Reza Rawassizadeh
LLMs Beyond English: Scaling the Multilingual Capability of LLMs with Cross-Lingual Feedback
Wen Lai, Mohsen Mesgar, Alexander Fraser
When Do LLMs Need Retrieval Augmentation? Mitigating LLMs’ Overconfidence Helps Retrieval Augmentation
Shiyu Ni, Keping Bi, Jiafeng Guo et al.
Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements
Ming Li, Jiuhai Chen, Lichang Chen et al.
Can LLMs get help from other LLMs without revealing private information?
Florian Hartmann, Duc-Hieu Tran, Peter Kairouz et al.
Do LLMs Recognize me, When I is not me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts
Metehan Oğuz, Yusuf Ciftci, Yavuz Faruk Bakman
LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates
Ying Shen, Lifu Huang
LLMs + Persona-Plug = Personalized LLMs
Jiongnan Liu, Yutao Zhu, Shuting Wang et al.
Can LLMs Reason About Program Semantics? A Comprehensive Evaluation of LLMs on Formal Specification Inference
Thanh Le-Cong, Bach Le, Toby Murray
How LLMs Comprehend Temporal Meaning in Narratives: A Case Study in Cognitive Evaluation of LLMs
Karin De Langis, Jong Inn Park, Andreas Schramm et al.
Can LLMs Understand Unvoiced Speech? Exploring EMG-to-Text Conversion with LLMs
Payal Mohapatra, Akash Pandey, Xiaoyuan Zhang et al.
CiteLab: Developing and Diagnosing LLM Citation Generation Workflows via the Human-LLM Interaction
Jiajun Shen, Tong Zhou, Yubo Chen et al.
Are LLMs Rational Investors? A Study on the Financial Bias in LLMs
Yuhang Zhou, Yuchen Ni, Zhiheng Xi et al.
LLMs Protégés: Tutoring LLMs with Knowledge Gaps Improves Student Learning Outcome
Andrei Kucharavy, Cyril Vallez, Dimitri Percia David
Are LLMs (Really) Ideological? An IRT-based Analysis and Alignment Tool for Perceived Socio-Economic Bias in LLMs
Jasmin Wachter, Michael Radloff, Maja Smolej et al.
Can LLMs Rank the Harmfulness of Smaller LLMs? We are Not There Yet
Berk Atil, Vipul Gupta, Sarkar Snigdha Sarathi Das et al.
Is LLM a Reliable Reviewer? A Comprehensive Evaluation of LLM on Automatic Paper Reviewing Tasks
Ruiyang Zhou, Lu Chen, Kai Yu
ASOS at Arabic LLMs Hallucinations 2024: Can LLMs detect their Hallucinations :)
Serry Taiseer Sibaee, Abdullah I. Alharbi, Samar Ahmed et al.
Efficient Solutions For An Intriguing Failure of LLMs: Long Context Window Does Not Mean LLMs Can Analyze Long Sequences Flawlessly
Peyman Hosseini, Ignacio Castro, Iacopo Ghinassi et al.