conftrace
Large Language Models
6,405 papers
Papers per year
2007: 3
2017: 2
2018: 3
2019: 10
2020: 49
2021: 53
2022: 188
2023: 558
2024: 1910
2025: 3619
2026: 10
Papers
Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents
NAACL 2025
Aligning to What? Limits to RLHF Based Alignment
NAACL 2025
Beyond Words: Exploring Cultural Value Sensitivity in Multimodal Models
NAACL 2025
Tooling or Not Tooling? The Impact of Tools on Language Agents for Chemistry Problem Solving
NAACL 2025
Evaluation of LLMs-based Hidden States as Author Representations for Psychological Human-Centered NLP Tasks
NAACL 2025
ThoughtSculpt: Reasoning with Intermediate Revision and Search
NAACL 2025
Using Linguistic Entrainment to Evaluate Large Language Models for Use in Cognitive Behavioral Therapy
NAACL 2025
On A Scale From 1 to 5: Quantifying Hallucination in Faithfulness Evaluation
NAACL 2025
LITERA: An LLM Based Approach to Latin-to-English Translation
NAACL 2025
Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning
NAACL 2025
Towards Long Context Hallucination Detection
NAACL 2025
Accounting for Sycophancy in Language Model Uncertainty Estimation
NAACL 2025
Zero-Shot Keyphrase Generation: Investigating Specialized Instructions and Multi-sample Aggregation on Large Language Models
NAACL 2025
Meta-Reasoning Improves Tool Use in Large Language Models
NAACL 2025
GAIfE: Using GenAI to Improve Literacy in Low-resourced Settings
NAACL 2025
Hard Emotion Test Evaluation Sets for Language Models
NAACL 2025
UCL-Bench: A Chinese User-Centric Legal Benchmark for Large Language Models
NAACL 2025
LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models
NAACL 2025
AssertionBench: A Benchmark to Evaluate Large-Language Models for Assertion Generation
NAACL 2025
DHP Benchmark: Are LLMs Good NLG Evaluators?
NAACL 2025
GraphEval36K: Benchmarking Coding and Reasoning Capabilities of Large Language Models on Graph Datasets
NAACL 2025
SimulBench: Evaluating Language Models with Creative Simulation Tasks
NAACL 2025
ReasoningRec: Bridging Personalized Recommendations and Human-Interpretable Explanations through LLM Reasoning
NAACL 2025
2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision
NAACL 2025
Demystifying the Power of Large Language Models in Graph Generation
NAACL 2025