Papers
The Emergence of Semantic Units in Massively Multilingual Models
Andrea Gregor de Varda, Marco Marelli
The Ethical Question – Use of Indigenous Corpora for Large Language Models
Linda Wiechetek, Flammie Pirinen, Maja Lisa Kappfjell et al.
The Extraction and Fine-grained Classification of Written Cantonese Materials through Linguistic Feature Detection
Chaak-ming Lau, Mingfei Lau, Ann Wai Huen To
The First Parallel Corpus and Neural Machine Translation Model of Western Armenian and English
Ari Nubar Boyacıoğlu, Jan Niehues
The First Universal Dependency Treebank for Tswana: Tswana-Popapolelo
Tanja Gaustad, Ansu Berg, Rigardt Pretorius et al.
The IgboAPI Dataset: Empowering Igbo Language Technologies through Multi-dialectal Enrichment
Chris Chinenye Emezue, Ifeoma Okoh, Chinedu Emmanuel Mbonu et al.
The Impact of Digital Editing on the Study of Holocaust Survivors’ Testimonies in the context of Voci dall’Inferno Project
Angelo Mario Del Grosso, Marina Riccucci, Elvira Mercatanti
The Impact of Stance Object Type on the Quality of Stance Detection
Maxwell A. Weinzierl, Sanda M. Harabagiu
The Influence of Automatic Speech Recognition on Linguistic Features and Automatic Alzheimer’s Disease Detection from Spontaneous Speech
Jonathan Heitz, Gerold Schneider, Nicolas Langer
The Key Points: Using Feature Importance to Identify Shortcomings in Sign Language Recognition Models
Ruth M. Holmes, Ellen Rushe, Anthony Ventresque
The Low Saxon LSDC Dataset at Universal Dependencies
Janine Siewert, Jack Rueter
The MEET Corpus: Collocated, Distant and Hybrid Three-party Meetings with a Ranking Task
Ghazaleh Esfandiari-Baiat, Jens Edlund
The Mental Lexicon of Communicative Fragments and Contours: The Remix N-gram Method
Emese K. Molnár, Andrea Dömötör
The MOLOR Lemma Bank: a New LLOD Resource for Old Irish
Theodorus Fransen, Cormac Anderson, Sacha Beniamine et al.
The Multilingual Corpus of World’s Constitutions (MCWC)
Mo El-Haj, Saad Ezzini
The Need for Grounding in LLM-based Dialogue Systems
Kristiina Jokinen
The Onomastic Repertoire of the Roman d’Alexandre (ORNARE). Designing an Integrated Digital Onomastic Tool for Medieval French Romance
Marta Milazzo, Giorgio Maria Di Nunzio
The Open-World Lottery Ticket Hypothesis for OOD Intent Classification
Yunhua Zhou, Pengyu Wang, Peiju Liu et al.
Theoretical and Empirical Advantages of Dense-Vector to One-Hot Encoding of Intent Classes in Open-World Scenarios
Paulo Cavalin, Claudio Santos Pinhanez
The ParCoLab Parallel Corpus and Its Extension to Four Regional Languages of France
Dejan Stosic, Saša Marjanović, Delphine Bernhard et al.
The ParlaSent Multilingual Training Dataset for Sentiment Identification in Parliamentary Proceedings
Michal Mochtak, Peter Rupnik, Nikola Ljubešić
The Relative Clauses AMR Parsers Hate Most
Xiulin Yang, Nathan Schneider
There’s Something New about the Italian Parliament: The IPSA Corpus
Valentino Frasnelli, Alessio Palmero Aprosio
The RIP Corpus of Collaborative Hypothesis-Making
Ella Schad, Jacky Visser, Chris Reed
The Rise and Fall of Dependency Parsing in Dante Alighieri’s Divine Comedy
Claudia Corbetta, Marco Passarotti, Giovanni Moretti