Papers
Evaluating the Simplification of Brazilian Legal Rulings in LLMs Using Readability Scores as a Target
Antonio Flavio Paula, Celso Camilo-Junior
Findings of the WMT24 General Machine Translation Shared Task: The LLM Era Is Here but MT Is Not Solved Yet
Tom Kocmi, Eleftherios Avramidis, Rachel Bawden et al.
Are LLMs Breaking MT Metrics? Results of the WMT24 Metrics Shared Task
Markus Freitag, Nitika Mathur, Daniel Deutsch et al.
Findings of the Quality Estimation Shared Task at WMT 2024: Are LLMs Closing the Gap in QE?
Chrysoula Zerva, Frederic Blain, José G. C. De Souza et al.
Choose the Final Translation from NMT and LLM Hypotheses Using MBR Decoding: HW-TSC’s Submission to the WMT24 General MT Shared Task
Zhanglin Wu, Daimeng Wei, Zongyao Li et al.
Document-level Translation with LLM Reranking: Team-J at WMT 2024 General Translation Task
Keito Kudo, Hiroyuki Deguchi, Makoto Morishita et al.
CUNI at WMT24 General Translation Task: LLMs, (Q)LoRA, CPO and Model Merging
Miroslav Hrabal, Josef Jon, Martin Popel et al.
IKUN for WMT24 General MT Task: LLMs Are Here for Multilingual Machine Translation
Baohao Liao, Christian Herold, Shahram Khadivi et al.
CoST of breaking the LLMs
Ananya Mukherjee, Saumitra Yadav, Manish Shrivastava
Killing Two Flies with One Stone: An Attempt to Break LLMs Using English-Icelandic Idioms and Proper Names
Bjarki Ármannsson, Hinrik Hafsteinsson, Atli Jasonarson et al.
Machine Translation Metrics Are Better in Evaluating Linguistic Errors on LLMs than on Encoder-Decoder Systems
Eleftherios Avramidis, Shushen Manakhimova, Vivien Macketanz et al.
Chitranuvad: Adapting Multi-lingual LLMs for Multimodal Translation
Shaharukh Khan, Ayush Tarun, Ali Faraz et al.
Context-Aware LLM Translation System Using Conversation Summarization and Dialogue History
Mingi Sung, Seungmin Lee, Jiwon Kim et al.
Analysing Translation Artifacts: A Comparative Study of LLMs, NMTs, and Human Translations
Fedor Sizov, Cristina España-Bonet, Josef Van Genabith et al.
Shortcomings of LLMs for Low-Resource Translation: Retrieval and Understanding Are Both the Problem
Sara Court, Micha Elsner
Break the Checkbox: Challenging Closed-Style Evaluations of Cultural Alignment in LLMs
Mohsinul Kabir, Ajwad Abrar, Sophia Ananiadou
Revisiting LLM Value Probing Strategies: Are They Robust and Expressive?
Siqi Shen, Mehar Singh, Lajanugen Logeswaran et al.
MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors
Jakub Macina, Nico Daheim, Ido Hakimi et al.
Preemptive Detection and Correction of Misaligned Actions in LLM Agents
Haishuo Fang, Xiaodan Zhu, Iryna Gurevych
From Problem-Solving to Teaching Problem-Solving: Aligning LLMs with Pedagogy using Reinforcement Learning
David Dinucu-Jianu, Jakub Macina, Nico Daheim et al.
MAC-Tuning: LLM Multi-Compositional Problem Reasoning with Enhanced Knowledge Boundary Awareness
Junsheng Huang, Zhitao He, Yuchen Huang et al.
IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents
Hengyu An, Jinghuai Zhang, Tianyu Du et al.
Molecular String Representation Preferences in Pretrained LLMs: A Comparative Study in Zero- & Few-Shot Molecular Property Prediction
George Arthur Baker, Mario Sanz-Guerrero, Katharina von der Wense
DatawiseAgent: A Notebook-Centric LLM Agent Framework for Adaptive and Robust Data Science Automation
Ziming You, Yumiao Zhang, Dexuan Xu et al.