Sampo Pyysalo
23 papers · 2010–2025 · 9 conferences · across top CS/AI conferences
Achievements
Academic Marathon (15)
Keyword Pioneer
Interdisciplinary Bridge
Conference Polyglot (9)
Cross-Pollinator (12)
Taxonomy Completionist (35)
Hot Topic Early Bird
Mega-Team (62)
Deep Specialist (11)
Trend Setter
Century Club (23)
Keyword Collector (83)
Unstoppable (7)
Conferences
COLING (7)
EMNLP (5)
ACL (4)
EACL (2)
CONLL (1)
IJCNLP (1)
JMLR (1)
NAACL (1)
NIPS (1)
Top co-authors
Keywords
language model (5)
named entity recognition (5)
dependency parsing (4)
biomedical text (3)
multilingual corpus (3)
universal dependencies (3)
machine translation (2)
multilingual nlp (2)
data repetition (2)
large language model (2)
multilingual model (2)
parallel corpus (2)
language modeling (2)
multilingual parsing (2)
word embedding (2)
attention mechanism (1)
semi-supervised learning (1)
model evaluation (1)
coreference resolution (1)
cross-lingual transfer (1)
Papers
Aurora-M: Open Source Continual Pre-training for Multilingual Language and Code
COLING 2025
Scaling Data-Constrained Language Models
JMLR 2025
An Expanded Massive Multilingual Dataset for High-Performance Language Technologies (HPLT)
ACL 2025
Building Question-Answer Data Using Web Register Identification
COLING 2024
A New Massive Multilingual Dataset for High-Performance Language Technologies
COLING 2024
Scaling Data-Constrained Language Models
NIPS 2023
Silver Syntax Pre-training for Cross-Domain Relation Extraction
ACL 2023
FinGPT: Large Generative Models for a Small Language
EMNLP 2023
Towards better structured and less noisy Web data: Oscar with Register annotations
COLING 2022
Beyond the English Web: Zero-Shot Cross-Lingual and Lightweight Monolingual Classification of Registers
EACL 2021
Exploring Cross-sentence Contexts for Named Entity Recognition with BERT
COLING 2020
Turku Enhanced Parser Pipeline: From Raw Text to Enhanced Graphs in the IWPT 2020 Shared Task
ACL 2020
The birth of Romanian BERT
EMNLP 2020
Neural Dependency Parsing of Biomedical Text: TurkuNLP entry in the CRAFT Structural Annotation Task
EMNLP 2019
Biomedical Named Entity Recognition with Multilingual BERT
EMNLP 2019
CRAFT Shared Tasks 2019 Overview - Integrated Structure, Semantics, and Coreference
EMNLP 2019
CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
CONLL 2017
Attending to Characters in Neural Sequence Labeling Models
COLING 2016
SETS: Scalable and Efficient Tree Search in Dependency Graphs
NAACL 2015
Sharing annotations better: RESTful Open Annotation
IJCNLP 2015
Sharing annotations better: RESTful Open Annotation
ACL 2015
brat: a Web-based Tool for NLP-Assisted Text Annotation
EACL 2012
Evaluating Dependency Representations for Event Extraction
COLING 2010