Research Explorer
Text Representation
2246 directly classified papers
Papers per year
2006: 2
2007: 4
2008: 1
2009: 6
2010: 2
2011: 3
2012: 3
2013: 7
2014: 7
2015: 4
2016: 30
2017: 126
2018: 177
2019: 231
2020: 245
2021: 296
2022: 240
2023: 210
2024: 292
2025: 297
2026: 63
Papers
CoRe: Context-Regularized Text Embedding Learning for Text-to-Image Personalization
AAAI 2025
Sinhala Encoder-only Language Models and Evaluation
ACL 2025
Large Vocabulary Size Improves Large Language Models
ACL 2025
EnerGIZAr: Leveraging GIZA++ for Effective Tokenizer Initialization
ACL 2025
TEXT-CAKE: Challenging Language Models on Local Text Coherence
COLING 2025
Fine-Grained Change Point Detection for Topic Modeling with Pitman-Yor Process
JMLR 2025
Embedding Style Beyond Topics: Analyzing Dispersion Effects Across Different Language Models
COLING 2025
Fact Recall, Heuristics or Pure Guesswork? Precise Interpretations of Language Models for Fact Completion
ACL 2025
How do Transformer Embeddings Represent Compositions? A Functional Analysis
ACL 2025
Geometric Signatures of Compositionality Across a Language Model’s Lifetime
ACL 2025
Sticking to the Mean: Detecting Sticky Tokens in Text Embedding Models
ACL 2025
Beyond Text Compression: Evaluating Tokenizers Across Scales
ACL 2025
CoMuMDR: Code-mixed Multi-modal Multi-domain corpus for Discourse paRsing in conversations
ACL 2025
Assessing Critical Thinking Components in Romanian Secondary School Textbooks: A Data Mining Approach to the ROTEX Corpus
ACL 2025
Demystifying optimized prompts in language models
EMNLP 2025
Date Fragments: A Hidden Bottleneck of Tokenization for Temporal Reasoning
EMNLP 2025
Formalizing Style in Personal Narratives
EMNLP 2025
A Text is Worth Several Tokens: Text Embedding from LLMs Secretly Aligns Well with The Key Tokens
ACL 2025
LuxEmbedder: A Cross-Lingual Approach to Enhanced Luxembourgish Sentence Embeddings
COLING 2025
Scaffold-BPE: Enhancing Byte Pair Encoding for Large Language Models with Simple and Effective Scaffold Token Removal
AAAI 2025
AutoChunker: Structured Text Chunking and its Evaluation
ACL 2025
TransBERT: A Framework for Synthetic Translation in Domain-Specific Language Modeling
EMNLP 2025
Lemmatization of Polish Multi-word Expressions
EMNLP 2025
Conditional Dichotomy Quantification via Geometric Embedding
ACL 2025
Towards Ancient Meroitic Decipherment: A Computational Approach
NAACL 2025