Papers
Broken Words, Broken Performance: Effect of Tokenization on Performance of LLMs
Sachin Pawar, Manoj Apte, Kshitij Jadhav et al.
LLMs Can Covertly Sandbag on Capability Evaluations Against Chain-of-Thought Monitoring
Chloe Li, Noah Y. Siegel
Improving LLM’s Attachment to External Knowledge In Dialogue Generation Tasks Through Entity Anonymization
Hadi Sheikhi, Chenyang Huang, Osmar Zaiane
Testing Simulation Theory in LLMs’ Theory of Mind
Koshiro Aoki, Daisuke Kawahara
Adaptive Coopetition: Leveraging Coarse Verifier Signals for Resilient Multi-Agent LLM Reasoning
Wendy Yaqiao Liu, Rui Jerry Huang, Anastasia Miin et al.
Two Step Automatic Post Editing of Patent Machine Translation based on Pre-trained Encoder Models and LLMs
Kosei Buma, Takehito Utsuro, Masaaki Nagata
Are LLMs Good for Semantic Role Labeling via Question Answering?: A Preliminary Analysis
Ritwik Raghav, Abhik Jana
Visualizing and Benchmarking LLM Factual Hallucination Tendencies via Internal State Analysis and Clustering
Nathan Mao, Varun Kaushik, Shreya Shivkumar et al.
VariantBench: A Framework for Evaluating LLMs on Justifications for Genetic Variant Interpretation
Humair Basharat, Simon Plotkin, Charlotte Le et al.
Tutorial on Trustworthy Legal Text Processing with LLMs: Retrieval, Rhetorical Roles, Summarization, and Trustworthy Generation
Anand Kumar M, Sangeetha S, Manikandan R et al.
Swallowing the Poison Pills: Insights from Vulnerability Disparity Among LLMs
Peng Yifeng, Zhizheng Wu, Chen Chen
LLMs as Architects and Critics for Multi-Source Opinion Summarization
Anuj Attri, Arnav Attri, Suman Banerjee et al.
Atomic Calibration of LLMs in Long-Form Generations
Caiqi Zhang, Ruihan Yang, Zhisong Zhang et al.
Estimating Causal Effects of Text Interventions Leveraging LLMs
Siyi Guo, Myrl G Marmarelis, Fred Morstatter et al.
HalluCounter: Reference-free LLM Hallucination Detection in the Wild!
Ashok Urlana, Gopichand Kanumolu, Charaka Vinayak Kumar et al.
Smruti: Grammatical Error Correction for Gujarati using LLMs with Non-Parametric Memory
Vrund Dobariya, Jatayu Baxi, Bhavika Gambhava et al.
LLM in the Loop: Creating the ParaDeHate Dataset for Hate Speech Detoxification
Shuzhou Yuan, Ercong Nie, Lukas Kouba et al.
Emotion-Aware Dysarthric Speech Reconstruction: LLMs and Multimodal Evaluation with MCDS
Kaushal Attaluri, Radhika Mamidi, Sireesha Chittepu et al.
Illusions of Relevance: Arbitrary Content Injection Attacks Deceive Retrievers, Rerankers, and LLM Judges
Manveer Singh Tamber, Jimmy Lin
Learning from Hallucinations: Mitigating Hallucinations in LLMs via Internal Representation Intervention
Sora Kadotani, Kosuke Nishida, Kyosuke Nishida
BioMistral-Clinical: A Scalable Approach to Clinical LLMs via Incremental Learning and RAG
Ziwei Chen, Bernhard Bermeitinger, Christina Niklaus
To Generate or Discriminate? Methodological Considerations for Measuring Cultural Alignment in LLMs
Saurabh Kumar Pandey, Sougata Saha, Monojit Choudhury
Evaluating Human-LLM Representation Alignment: A Case Study on Affective Sentence Generation for Augmentative and Alternative Communication
Shadab Hafiz Choudhury, Asha Kumar, Lara J. Martin
Can LLMs Learn from Their Mistakes? Self-Correcting Instruction Tuning for Named Entity Recognition
Takumi Takahashi, Tomoki Taniguchi, Chencheng Zhu et al.
OPTAGENT: Optimizing Multi-Agent LLM Interactions Through Verbal Reinforcement Learning for Enhanced Reasoning
Zhenyu Bi, Meng Lu, Yang Li et al.