Research Explorer

ESG Impact Type Classification: Leveraging Strategic Prompt Engineering and LLM Fine-Tuning

Soumya Mishra

2023 AACL

Do LLMs Need Inherent Reasoning Before Reinforcement Learning? A Study in Korean Self-Correction

Hongjin Kim, Jaewook Lee, Kiyoung Lee et al.

2025 AACL

Hidden in Plain Text: Emergence & Mitigation of Steganographic Collusion in LLMs

Yohan Mathew, Ollie Matthews, Robert McCarthy et al.

2025 AACL

Multilingual, Not Multicultural: Uncovering the Cultural Empathy Gap in LLMs through a Comparative Empathetic Dialogue Benchmark

Woojin Lee, Yujin Sim, Hongjin Kim et al.

2025 AACL

Chain-of-Query: Unleashing the Power of LLMs in SQL-Aided Table Understanding via Multi-Agent Collaboration

Songyuan Sui, Hongyi Liu, Serena Liu et al.

2025 AACL

ProofTeller: Exposing recency bias in LLM reasoning and its side effects on communication

Mayank Jobanputra, Alisa Kovtunova, Brisca Balthes et al.

2025 AACL

An Adversary-Resistant Multi-Agent LLM System via Credibility Scoring

Sana Ebrahimi, Mohsen Dehghankar, Abolfazl Asudeh

2025 AACL

Are LLMs Rigorous Logical Reasoners? Empowering Natural Language Proof Generation by Stepwise Decoding with Contrastive Learning

Ying Su, Mingwen Liu, Zhijiang Guo

2025 AACL

Exploring Working Memory Capacity in LLMs: From Stressors to Human-Inspired Strategies

Eunjin Hong, Sumin Cho, Juae Kim

2025 AACL

Doppelganger-JC: Benchmarking the LLMs’ Understanding of Cross-Lingual Homographs between Japanese and Chinese

Yuka Kitamura, Jiahao Huang, Akiko Aizawa

2025 AACL

LLMs Do Not See Age: Assessing Demographic Bias in Automated Systematic Review Synthesis

Favour Y. Aghaebe, Elizabeth A Williams, Tanefa Apekey et al.

2025 AACL

Decode Like a Clinician: Enhancing LLM Fine-Tuning with Temporal Structured Data Representation

Daniel Fadlon, David Dov, Aviya Bennett et al.

2025 AACL

The Confidence Paradox: Can LLM Know When It’s Wrong?

Sahil Tripathi, MD Tabrez Nafis, Imran Hussain et al.

2025 AACL

Large Temporal Models: Unlocking Temporal Understanding in LLMs for Temporal Relation Classification

Omri Homburger, Kfir Bar

2025 AACL

Interpreting the Effects of Quantization on LLMs

Manpreet Singh, Hassan Sajjad

2025 AACL

What Would You Ask When You First Saw a²+b²=c²? Evaluating LLM on Curiosity-Driven Question Generation

Shashidhar Reddy Javaji, Zining Zhu

2025 AACL

Can AI Validate Science? Benchmarking LLMs on Claim → Evidence Reasoning in AI Papers

Shashidhar Reddy Javaji, Yupeng Cao, Haohang Li et al.

2025 AACL

More Than a Score: Probing the Impact of Prompt Specificity on LLM Code Generation

Yangtian Zi, Harshitha Menon, Arjun Guha

2025 AACL

Pragmatic Theories Enhance Understanding of Implied Meanings in LLMs

Takuma Sato, Seiya Kawano, Koichiro Yoshino

2025 AACL

Crypto-LLM: Two-Stage Language Model Pre-training with Ciphered and Natural Language Data

Yohei Kobashi, Fumiya Uchiyama, Takeshi Kojima et al.

2025 AACL

From Templates to Natural Language: Generalization Challenges in Instruction-Tuned LLMs for Spatial Reasoning

Chalamalasetti Kranti, Sherzod Hakimov, David Schlangen

2025 AACL

Small Changes, Large Consequences: Analyzing the Allocational Fairness of LLMs in Hiring Contexts

Preethi Seshadri, Hongyu Chen, Sameer Singh et al.

2025 AACL

Revisiting Word Embeddings in the LLM Era

Yash Mahajan, Matthew Freestone, Naman Bansal et al.

2025 AACL

Agnus LLM: Robust and Flexible Entity Disambiguation with decoder-only Language Models

Kristian Noullet, Ayoub Ourgani, Niklas Thomas Lakner et al.

2025 AACL

Found in Translation: Measuring Multilingual LLM Consistency as Simple as Translate then Evaluate

Ashim Gupta, Maitrey Mehta, Zhichao Xu et al.

2025 AACL