Large Language Models

6405 directly classified papers

Papers

Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents NAACL 2025

Aligning to What? Limits to RLHF Based Alignment NAACL 2025

Beyond Words: Exploring Cultural Value Sensitivity in Multimodal Models NAACL 2025

Tooling or Not Tooling? The Impact of Tools on Language Agents for Chemistry Problem Solving NAACL 2025

Evaluation of LLMs-based Hidden States as Author Representations for Psychological Human-Centered NLP Tasks NAACL 2025

ThoughtSculpt: Reasoning with Intermediate Revision and Search NAACL 2025

On A Scale From 1 to 5: Quantifying Hallucination in Faithfulness Evaluation NAACL 2025

Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning NAACL 2025

Towards Long Context Hallucination Detection NAACL 2025

Accounting for Sycophancy in Language Model Uncertainty Estimation NAACL 2025

Meta-Reasoning Improves Tool Use in Large Language Models NAACL 2025

LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models NAACL 2025

AssertionBench: A Benchmark to Evaluate Large-Language Models for Assertion Generation NAACL 2025

DHP Benchmark: Are LLMs Good NLG Evaluators? NAACL 2025

GraphEval36K: Benchmarking Coding and Reasoning Capabilities of Large Language Models on Graph Datasets NAACL 2025

SimulBench: Evaluating Language Models with Creative Simulation Tasks NAACL 2025

ReasoningRec: Bridging Personalized Recommendations and Human-Interpretable Explanations through LLM Reasoning NAACL 2025

2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision NAACL 2025

Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation NAACL 2025

Alleviating Hallucinations of Large Language Models through Induced Hallucinations NAACL 2025

MoDE: Effective Multi-task Parameter Efficient Fine-Tuning with a Mixture of Dyadic Experts NAACL 2025

Hierarchical Speculative Decoding with Dynamic Window NAACL 2025

Q-FAKER: Query-free Hard Black-box Attack via Controlled Generation NAACL 2025

PRDetect: Perturbation-Robust LLM-generated Text Detection Based on Syntax Tree NAACL 2025

Rationale Behind Essay Scores: Enhancing S-LLM’s Multi-Trait Essay Scoring with Rationale Generated by LLMs NAACL 2025