Research Explorer

PUB: A Pragmatics Understanding Benchmark for Assessing LLMs’ Pragmatics Capabilities

Settaluri Sravanthi, Meet Doshi, Pavan Tankala et al.

2024 ACL

MM-LLMs: Recent Advances in MultiModal Large Language Models

Duzhen Zhang, Yahan Yu, Jiahua Dong et al.

2024 ACL

LLMs as Narcissistic Evaluators: When Ego Inflates Evaluation Scores

Yiqi Liu, Nafise Moosavi, Chenghua Lin

2024 ACL

CHAMP: A Competition-level Dataset for Fine-Grained Analyses of LLMs’ Mathematical Reasoning Capabilities

Yujun Mao, Yoon Kim, Yilun Zhou

2024 ACL

RaDA: Retrieval-augmented Web Agent Planning with LLMs

Minsoo Kim, Victor Bursztyn, Eunyee Koh et al.

2024 ACL

Code Needs Comments: Enhancing Code LLMs with Comment Augmentation

Demin Song, Honglin Guo, Yunhua Zhou et al.

2024 ACL

LLMs cannot find reasoning errors, but can correct them given the error location

Gladys Tyen, Hassan Mansoor, Victor Carbune et al.

2024 ACL

Looking Right is Sometimes Right: Investigating the Capabilities of Decoder-only LLMs for Sequence Labeling

David Dukić, Jan Šnajder

2024 ACL

Fantastic Semantics and Where to Find Them: Investigating Which Layers of Generative LLMs Reflect Lexical Semantics

Zhu Liu, Cunliang Kong, Ying Liu et al.

2024 ACL

Combining Hierachical VAEs with LLMs for clinically meaningful timeline summarisation in social media

Jiayu Song, Jenny Chim, Adam Tsakalidis et al.

2024 ACL

S3-DST: Structured Open-Domain Dialogue Segmentation and State Tracking in the Era of LLMs

Sarkar Snigdha Sarathi Das, Chirag Shah, Mengting Wan et al.

2024 ACL

Automatic Bug Detection in LLM-Powered Text-Based Games Using LLMs

Claire Jin, Sudha Rao, Xiangyu Peng et al.

2024 ACL

Hire a Linguist!: Learning Endangered Languages in LLMs with In-Context Linguistic Descriptions

Kexun Zhang, Yee Choi, Zhenqiao Song et al.

2024 ACL

From Tarzan to Tolkien: Controlling the Language Proficiency Level of LLMs for Content Generation

Ali Malik, Stephen Mayhew, Christopher Piech et al.

2024 ACL

A Critical Study of What Code-LLMs (Do Not) Learn

Abhinav Anand, Shweta Verma, Krishna Narasimhan et al.

2024 ACL

Defending LLMs against Jailbreaking Attacks via Backtranslation

Yihan Wang, Zhouxing Shi, Andrew Bai et al.

2024 ACL

Ask LLMs Directly, “What shapes your bias?”: Measuring Social Bias in Large Language Models

Jisu Shin, Hoyun Song, Huije Lee et al.

2024 ACL

Selective Prompting Tuning for Personalized Conversations with LLMs

Qiushi Huang, Xubo Liu, Tom Ko et al.

2024 ACL

mBLIP: Efficient Bootstrapping of Multilingual Vision-LLMs

Gregor Geigle, Abhay Jain, Radu Timofte et al.

2024 ACL

John vs. Ahmed: Debate-Induced Bias in Multilingual LLMs

Anastasiia Demidova, Hanin Atwany, Nour Rabih et al.

2024 ACL

Arabic Train at NADI 2024 shared task: LLMs’ Ability to Translate Arabic Dialects into Modern Standard Arabic

Anastasiia Demidova, Hanin Atwany, Nour Rabih et al.

2024 ACL

SMASH at StanceEval 2024: Prompt Engineering LLMs for Arabic Stance Detection

Youssef Al Hariri, Ibrahim Abu Farha

2024 ACL

Sövereign at The Perspective Argument Retrieval Shared Task 2024: Using LLMs with Argument Mining

Robert Günzler, Özge Sevgili, Steffen Remus et al.

2024 ACL

Open (Clinical) LLMs are Sensitive to Instruction Phrasings

Alberto Mario Ceballos-Arroyo, Monica Munnangi, Jiuding Sun et al.

2024 ACL

Can Rule-Based Insights Enhance LLMs for Radiology Report Classification? Introducing the RadPrompt Methodology.

Panagiotis Fytas, Anna Breger, Ian Selby et al.

2024 ACL
