Papers
Beyond English: Evaluating LLMs for Arabic Grammatical Error Correction
Sang Kwon, Gagan Bhatia, El Moatez Billah Nagoudi et al.
Analyzing Multilingual Competency of LLMs in Multi-Turn Instruction Following: A Case Study of Arabic
Sabri Boughorbel, Majd Hawasly
Raphael at ArAIEval Shared Task: Understanding Persuasive Language and Tone, an LLM Approach
Utsav Shukla, Manan Vyas, Shailendra Tiwari
Synthetic Dialogue Dataset Generation using LLM Agents
Yelaman Abdullin, Diego Molla, Bahadorreza Ofoghi et al.
Text Encoders Lack Knowledge: Leveraging Generative LLMs for Domain-Specific Semantic Textual Similarity
Joseph Gatto, Omar Sharif, Parker Seegmiller et al.
Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs
Xue-Yong Fu, Md Tahmid Rahman Laskar, Cheng Chen et al.
Post Turing: Mapping the landscape of LLM Evaluation
Alexey Tikhonov, Ivan P. Yamshchikov
Retrieval-based Evaluation for LLMs: A Case Study in Korean Legal QA
Cheol Ryu, Seolhwa Lee, Subeen Pang et al.
Findings of the 2023 Conference on Machine Translation (WMT23): LLMs Are Here but Not Quite There Yet
Tom Kocmi, Eleftherios Avramidis, Rachel Bawden et al.
Findings of the WMT 2023 Shared Task on Discourse-Level Literary Translation: A Fresh Orb in the Cosmos of LLMs
Longyue Wang, Zhaopeng Tu, Yan Gu et al.
Embed_Llama: Using LLM Embeddings for the Metrics Shared Task
Sören Dreano, Derek Molloy, Noel Murphy
When LLMs Meets Acoustic Landmarks: An Efficient Approach to Integrate Speech into Large Language Models for Depression Detection
Xiangyu Zhang, Hexin Liu, Kaishuai Xu et al.
NumeroLogic: Number Encoding for Enhanced LLMs’ Numerical Reasoning
Eli Schwartz, Leshem Choshen, Joseph Shtok et al.
Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs
Oded Ovadia, Menachem Brief, Moshik Mishaeli et al.
Systematic Biases in LLM Simulations of Debates
Amir Taubenfeld, Yaniv Dover, Roi Reichart et al.
On Fake News Detection with LLM Enhanced Semantics Mining
Xiaoxiao Ma, Yuchen Zhang, Kaize Ding et al.
HEART-felt Narratives: Tracing Empathy and Narrative Style in Personal Stories with LLMs
Jocelyn Shen, Joel Mire, Hae Won Park et al.
LLMs Are Zero-Shot Context-Aware Simultaneous Translators
Roman Koshkin, Katsuhito Sudoh, Satoshi Nakamura
AgentReview: Exploring Peer Review Dynamics with LLM Agents
Yiqiao Jin, Qinlin Zhao, Yiyang Wang et al.
Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement
Weimin Xiong, Yifan Song, Xiutian Zhao et al.
SHIELD: Evaluation and Defense Strategies for Copyright Compliance in LLM Text Generation
Xiaoze Liu, Ting Sun, Tianyang Xu et al.
I Need Help! Evaluating LLM’s Ability to Ask for Users’ Support: A Case Study on Text-to-SQL Generation
Cheng-Kuang Wu, Zhi Rui Tam, Chao-Chung Wu et al.
ASETF: A Novel Method for Jailbreak Attack on LLMs through Translate Suffix Embeddings
Hao Wang, Hao Li, Minlie Huang et al.
CUTE: Measuring LLMs’ Understanding of Their Tokens
Lukas Edman, Helmut Schmid, Alexander Fraser