Papers
BioMistral-Clinical: A Scalable Approach to Clinical LLMs via Incremental Learning and RAG
Ziwei Chen, Bernhard Bermeitinger, Christina Niklaus
To Generate or Discriminate? Methodological Considerations for Measuring Cultural Alignment in LLMs
Saurabh Kumar Pandey, Sougata Saha, Monojit Choudhury
Evaluating Human-LLM Representation Alignment: A Case Study on Affective Sentence Generation for Augmentative and Alternative Communication
Shadab Hafiz Choudhury, Asha Kumar, Lara J. Martin
Can LLMs Learn from Their Mistakes? Self-Correcting Instruction Tuning for Named Entity Recognition
Takumi Takahashi, Tomoki Taniguchi, Chencheng Zhu et al.
OPTAGENT: Optimizing Multi-Agent LLM Interactions Through Verbal Reinforcement Learning for Enhanced Reasoning
Zhenyu Bi, Meng Lu, Yang Li et al.
Quantifying and Mitigating Selection Bias in LLMs: A Transferable LoRA Fine-Tuning and Efficient Majority Voting Approach
Blessed Guda, Lawrence Francis, Gabrial Zencha Ashungafac et al.
Consistency Is the Key: Detecting Hallucinations in LLM Generated Text By Checking Inconsistencies About Key Facts
Raavi Gupta, Pranav Hari Panicker, Sumit Bhatia et al.
SOMAJGYAAN: A Dataset for Evaluating LLMs on Bangla Culture, Social Knowledge, and Low-Resource Language Adaptation
Fariha Anjum Shifa, Muhtasim Ibteda Shochcho, Abdullah Ibne Hanif Arean et al.
GeoSAFE - A Novel Geospatial Artificial Intelligence Safety Assurance Framework and Evaluation for LLM Moderation
Nihar Sanda, Rajat Shinde, Sumit Nawathe et al.
Evaluating LLMs’ Reasoning Over Ordered Procedural Steps
Adrita Anika, Md Messal Monem Miah
An Information-Theoretic Approach to Reducing Fertility in LLMs for Manipuri Machine Translation
Telem Joyson Singh, Ranbir Singh Sanasam, Priyankoo Sarmah
Agent-based Automated Claim Matching with Instruction-following LLMs
Dina Pisarevskaya, Arkaitz Zubiaga
Human–LLM Benchmarks for Bangla Dialect Translation: Sylheti and Chittagonian on the BanglaCHQ-Summ Corpus
Nowshin Mahjabin, Ahmed Shafin Ruhan, Mehreen Chowdhury et al.
A Comparative Analysis of Retrieval-Augmented Generation Techniques for Bengali Standard-to-Dialect Machine Translation Using LLMs
K. M. Jubair Sami, Dipto Sumit, Ariyan Hossain et al.
Robustness of LLMs to Transliteration Perturbations in Bangla
Fabiha Haider, Md Farhan Ishmam, Fariha Tanjim Shifat et al.
Computational Story Lab at BLP-2025 Task 1: HateSense: A Multi-Task Learning Framework for Comprehensive Hate Speech Identification using LLMs
Tabia Tanzin Prama, Christopher M. Danforth, Peter Dodds
Barrier Breakers at BLP-2025 Task 2: Enhancing LLM Code Generation Capabilities through Test-Driven Development and Code Interpreter
Sajed Jalil, Shuvo Saha, Hossain Mohammad Seym
CUET_Expelliarmus at BLP2025 Task 2: Leveraging Instruction Translation and Refinement for Bangla-to-Python Code Generation with Open-Source LLMs
Md Kaf Shahrier, Suhana Binta Rashid, Hasan Mesbaul Ali Taher et al.
TeamB2B at BLP-2025 Task 2: BanglaForge: LLM Collaboration with Self-Refinement for Bangla Code Generation
Mahir Labib Dihan, Sadif Ahmed, Md Nafiu Rahman
Benchmarking Hindi LLMs: A New Suite of Datasets and a Comparative Analysis
Anusha Kamath, Kanishk Singla, Rakesh Paul et al.
SmurfCat at SHROOM-CAP: Factual but Awkward? Fluent but Wrong? Tackling Both in LLM Scientific QA
Timur Ionov, Evgenii Nikolaev, Artem Vazhentsev et al.
Simulating Training Data Leakage in Multiple-Choice Benchmarks for LLM Evaluation
Naila Shafirni Hidayat, Muhammad Dehan Al Kautsar, Alfan Farizki Wicaksono et al.
Reliable Inline Code Documentation with LLMs: Fine-Grained Evaluation of Comment Quality and Coverage
Rohan Patil, Gaurav Tirodkar, Shubham Gatfane
Beyond the Rubric: Cultural Misalignment in LLM Benchmarks for Sexual and Reproductive Health
Sumon Kanti Dey, Manvi S, Zeel Mehta et al.