Papers
Beyond Metrics: Evaluating LLMs Effectiveness in Culturally Nuanced, Low-Resource Real-World Scenarios
Millicent Ochieng, Varun Gumma, Sunayana Sitaram et al.
Simulating Emotional Intelligence in LLMs through Behavioral Conditioning and Analogical Retrieval
G. Sai Linisha Reddy, Mounil Hiren Kankhara, Mridul Maheshwari et al.
Can Stories Help LLMs Reason? Curating Information Space Through Narrative
Vahid Sadiri Javadi, Johanne Trippas, Yash Kumar Lal et al.
On Integrating LLMs Into an Argument Annotation Workflow
Robin Schaefer
CUET_SR34 at CQs-Gen 2025: Critical Question Generation via Few-Shot LLMs – Integrating NER and Argument Schemes
Sajib Bhattacharjee, Tabassum Basher Rashfi, Samia Rahman et al.
ARG2ST at CQs-Gen 2025: Critical Questions Generation through LLMs and Usefulness-based Selection
Alan Ramponi, Gaudenzia Genoni, Sara Tonelli
MateInfoUB: A Real-World Benchmark for Testing LLMs in Competitive, Multilingual, and Multimodal Educational Tasks
Marius Dumitran, Mihnea Buca, Theodor Moroianu
Alignment Drift in CEFR-prompted LLMs for Interactive Spanish Tutoring
Mina Almasi, Ross Deans Kristensen-McLachlan
Can LLMs Effectively Simulate Human Learners? Teachers’ Insights from Tutoring LLM Students
Daria Martynova, Jakub Macina, Nico Daheim et al.
Adapting LLMs for Minimal-edit Grammatical Error Correction
Ryszard Staruch, Filip Gralinski, Daniel Dzienisiewicz
Do LLMs Give Psychometrically Plausible Responses in Educational Assessments?
Andreas Säuberli, Diego Frassinelli, Barbara Plank
Exploiting the English Vocabulary Profile for L2 word-level vocabulary assessment with LLMs
Stefano Bannò, Kate M. Knill, Mark J. F. Gales
Lessons Learned in Assessing Student Reflections with LLMs
Mohamed Elaraby, Diane Litman
Name of Thrones: How Do LLMs Rank Student Names in Status Hierarchies Based on Race and Gender?
Annabella Sakunkoo, Jonathan Sakunkoo
Exploring LLMs for Predicting Tutor Strategy and Student Outcomes in Dialogues
Fareya Ikram, Alexander Scarlatos, Andrew Lan
Can LLMs Reliably Simulate Real Students’ Abilities in Mathematics and Reading Comprehension?
KV Aditya Srivatsa, Kaushal Maurya, Ekaterina Kochmar
MSA at BEA 2025 Shared Task: Disagreement-Aware Instruction Tuning for Multi-Dimensional Evaluation of LLMs as Math Tutors
Baraa Hikal, Mohamed Basem, Islam Oshallah et al.
TutorMind at BEA 2025 Shared Task: Leveraging Fine-Tuned LLMs and Data Augmentation for Mistake Identification
Fatima Dekmak, Christian Khairallah, Wissam Antoun
Fine-tuning LLMs to Extract Epilepsy Seizure Frequency Data from Health Records
Ben Holgate, Joe Davies, Shichao Fang et al.
LLMs as Medical Safety Judges: Evaluating Alignment with Human Annotation in Patient-Facing QA
Yella Diekmann, Chase Fensore, Rodrigo Carrillo-Larco et al.
Virtual CRISPR: Can LLMs Predict CRISPR Screen Results?
Steven Song, Abdalla Abdrabou, Asmita Dabholkar et al.
CUNI-a at ArchEHR-QA 2025: Do we need Giant LLMs for Clinical QA?
Vojtech Lanz, Pavel Pecina
SzegedAI at ArchEHR-QA 2025: Combining LLMs with traditional methods for grounded question answering
Soma Nagy, Bálint Nyerges, Zsombor Kispéter et al.
LIMICS at ArchEHR-QA 2025: Prompting LLMs Beats Fine-Tuned Embeddings
Adam Remaki, Armand Violle, Vikram Natraj et al.