Research Explorer
Text Representation
2246 directly classified papers
Papers per year
2006: 2
2007: 4
2008: 1
2009: 6
2010: 2
2011: 3
2012: 3
2013: 7
2014: 7
2015: 4
2016: 30
2017: 126
2018: 177
2019: 231
2020: 245
2021: 296
2022: 240
2023: 210
2024: 292
2025: 297
2026: 63
Papers
Bridging Dialectal Variation: A Phonetic Transcription Tool for Tamil
EACL 2026
Surprisal and Metaphor Novelty Judgments: Moderate Correlations and Divergent Scaling Effects Revealed by Corpus-Based and Synthetic Datasets
EACL 2026
Cosine Similarity as Logits?: A Scalable Knowledge Probe Using Embedding Vectors from Generative Language Models
EACL 2026
elfen: A Python Package for Efficient Linguistic Feature Extraction for Natural Language Datasets
EACL 2026
PUCP-Metrix: An Open-source and Comprehensive Toolkit for Linguistic Analysis of Spanish Texts
EACL 2026
Evaluating Morphological Plausibility of Subword Tokenization via Statistical Alignment with Morpho-Syntactic Features
EACL 2026
Attribute-Controlled Translation with Preference Optimization
EACL 2026
BhashaKritika: Building Synthetic Pretraining Data at Scale for Indic Languages
AAAI 2026
Measuring Idiomaticity in Text Embedding Models with epsilon-compositionality
EACL 2026
AdaptBPE: From General Purpose to Specialized Tokenizers
EACL 2026
Is Information Density Uniform when Utterances are Grounded on Perception and Discourse?
EACL 2026
CASE – Condition-Aware Sentence Embeddings for Conditional Semantic Textual Similarity Measurement
EACL 2026
The Token Tax: Systematic Bias in Multilingual Tokenization
EACL 2026
Paragraph-level Error Correction and Explanation Generation: Case Study for Estonian
ACL 2025
Searchable Language Documentation Corpora: DoReCo meets TEITOK
ACL 2025
The Elephant in the Coreference Room: Resolving Coreference in Full-Length French Fiction Works
EMNLP 2025
When Does Meaning Backfire? Investigating the Role of AMRs in NLI
EMNLP 2025
HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics
EMNLP 2025
How Do Large Language Models Evaluate Lexical Complexity?
EMNLP 2025
The Gemma Sutras: Fine-Tuning Gemma 3 for Sanskrit Sandhi Splitting
EMNLP 2025
False Friends Are Not Foes: Investigating Vocabulary Overlap in Multilingual Language Models
EMNLP 2025
Intelligent Document Parsing: Towards End-to-end Document Parsing via Decoupled Content Parsing and Layout Grounding
EMNLP 2025
Evaluating Textual and Visual Semantic Neighborhoods of Abstract and Concrete Concepts
EMNLP 2025
Lemmatization of Polish Multi-word Expressions
EMNLP 2025
Fix-CLIP: Dual-Branch Hierarchical Contrastive Learning via Synthetic Captions for Better Understanding of Long Text
ICCV 2025