Papers
Quantifying and Mitigating Selection Bias in LLMs: A Transferable LoRA Fine-Tuning and Efficient Majority Voting Approach
Blessed Guda, Lawrence Francis, Gabrial Zencha Ashungafac et al.
Consistency Is the Key: Detecting Hallucinations in LLM Generated Text By Checking Inconsistencies About Key Facts
Raavi Gupta, Pranav Hari Panicker, Sumit Bhatia et al.
SOMAJGYAAN: A Dataset for Evaluating LLMs on Bangla Culture, Social Knowledge, and Low-Resource Language Adaptation
Fariha Anjum Shifa, Muhtasim Ibteda Shochcho, Abdullah Ibne Hanif Arean et al.
GeoSAFE - A Novel Geospatial Artificial Intelligence Safety Assurance Framework and Evaluation for LLM Moderation
Nihar Sanda, Rajat Shinde, Sumit Nawathe et al.
Evaluating LLMs’ Reasoning Over Ordered Procedural Steps
Adrita Anika, Md Messal Monem Miah
An Information-Theoretic Approach to Reducing Fertility in LLMs for Manipuri Machine Translation
Telem Joyson Singh, Ranbir Singh Sanasam, Priyankoo Sarmah
Agent-based Automated Claim Matching with Instruction-following LLMs
Dina Pisarevskaya, Arkaitz Zubiaga
Human–LLM Benchmarks for Bangla Dialect Translation: Sylheti and Chittagonian on the BanglaCHQ-Summ Corpus
Nowshin Mahjabin, Ahmed Shafin Ruhan, Mehreen Chowdhury et al.
A Comparative Analysis of Retrieval-Augmented Generation Techniques for Bengali Standard-to-Dialect Machine Translation Using LLMs
K. M. Jubair Sami, Dipto Sumit, Ariyan Hossain et al.
Robustness of LLMs to Transliteration Perturbations in Bangla
Fabiha Haider, Md Farhan Ishmam, Fariha Tanjim Shifat et al.
Computational Story Lab at BLP-2025 Task 1: HateSense: A Multi-Task Learning Framework for Comprehensive Hate Speech Identification using LLMs
Tabia Tanzin Prama, Christopher M. Danforth, Peter Dodds
Barrier Breakers at BLP-2025 Task 2: Enhancing LLM Code Generation Capabilities through Test-Driven Development and Code Interpreter
Sajed Jalil, Shuvo Saha, Hossain Mohammad Seym
CUET_Expelliarmus at BLP2025 Task 2: Leveraging Instruction Translation and Refinement for Bangla-to-Python Code Generation with Open-Source LLMs
Md Kaf Shahrier, Suhana Binta Rashid, Hasan Mesbaul Ali Taher et al.
TeamB2B at BLP-2025 Task 2: BanglaForge: LLM Collaboration with Self-Refinement for Bangla Code Generation
Mahir Labib Dihan, Sadif Ahmed, Md Nafiu Rahman
Benchmarking Hindi LLMs: A New Suite of Datasets and a Comparative Analysis
Anusha Kamath, Kanishk Singla, Rakesh Paul et al.
SmurfCat at SHROOM-CAP: Factual but Awkward? Fluent but Wrong? Tackling Both in LLM Scientific QA
Timur Ionov, Evgenii Nikolaev, Artem Vazhentsev et al.
Simulating Training Data Leakage in Multiple-Choice Benchmarks for LLM Evaluation
Naila Shafirni Hidayat, Muhammad Dehan Al Kautsar, Alfan Farizki Wicaksono et al.
Reliable Inline Code Documentation with LLMs: Fine-Grained Evaluation of Comment Quality and Coverage
Rohan Patil, Gaurav Tirodkar, Shubham Gatfane
Beyond the Rubric: Cultural Misalignment in LLM Benchmarks for Sexual and Reproductive Health
Sumon Kanti Dey, Manvi S, Zeel Mehta et al.
Non-Determinism of “Deterministic” LLM System Settings in Hosted Environments
Berk Atıl, Sarp Aykent, Alexa Chittams et al.
Test Set Quality in Multilingual LLM Evaluation
Chalamalasetti Kranti, Gabriel Bernier-Colborne, Yvan Gauthier et al.
LLM Driven Legal Text Analytics: A Case Study For Food Safety Violation Cases
Suyog Joshi, Soumyajit Basu, Lipika Dey et al.
MEDEQUALQA: Evaluating Biases in LLMs with Counterfactual Reasoning
Rajarshi Ghosh, Abhay Gupta, Hudson McBride et al.