Text Representation

2246 directly classified papers

Papers

Transfer of Structural Knowledge from Synthetic Languages ACL 2025

PSET: a Phonetics-Semantics Evaluation Testbed EMNLP 2025

Less Is MuRE: Revisiting Shallow Knowledge Graph Embeddings EMNLP 2025

The Gemma Sutras: Fine-Tuning Gemma 3 for Sanskrit Sandhi Splitting EMNLP 2025

Agnus LLM: Robust and Flexible Entity Disambiguation with decoder-only Language Models AACL 2025

LangSAMP: Language-Script Aware Multilingual Pretraining ACL 2025

Functional Lexicon in Subword Tokenization NAACL 2025

Challenges in Processing Chinese Texts Across Genres and Eras EMNLP 2025

Revisiting Word Embeddings in the LLM Era AACL 2025

Wikivecs: A Fully Reproducible Vectorization of Multilingual Wikipedia ACL 2025

Social Norms in Cinema: A Cross-Cultural Analysis of Shame, Pride and Prejudice NAACL 2025

Translationese-index: Using Likelihood Ratios for Graded and Generalizable Measurement of Translationese EMNLP 2025

Employing Discourse Coherence Enhancement to Improve Cross-Document Event and Entity Coreference Resolution ACL 2025

TNCSE: Tensor Norm Constraints for Unsupervised Contrastive Learning of Sentence Embeddings AAAI 2025

On the Relation Between Fine-Tuning, Topological Properties, and Task Performance in Sense-Enhanced Embeddings ACL 2025

SciNLP: A Domain-Specific Benchmark for Full-Text Scientific Entity and Relation Extraction in NLP EMNLP 2025

Towards Unified, Dynamic and Annotation-based Visualisations and Exploration of Annotated Big Data Corpora with the Help of Unified Corpus Explorer NAACL 2025

How much do contextualized representations encode long-range context? NAACL 2025

Tomato, Tomahto, Tomate: Do Multilingual Language Models Understand Based on Subword-Level Semantic Concepts? NAACL 2025

Beyond Benchmarks: Building a Richer Cross-Document Event Coreference Dataset with Decontextualization NAACL 2025

Attention on Multiword Expressions: A Multilingual Study of BERT-based Models with Regard to Idiomaticity and Microsyntax NAACL 2025

Word2Vec4Kids: Interactive Challenges to Introduce Middle School Students to Word Embeddings AAAI 2025

Cognitive Linguistic Identity Fusion Score (CLIFS): A Scalable Cognition‐Informed Approach to Quantifying Identity Fusion from Text EMNLP 2025

ChuenSumi at SemEval-2025 Task 1: Sentence Transformer Models and Processing Idiomacity SEMEVAL 2025

Retrieval of Parallelizable Texts Across Church Slavic Variants COLING 2025