EACL 2021

Comparing the Performance of CNNs and Shallow Models for Language Identification

Abstract

In this work, we compare the performance of convolutional neural networks and shallow models on three of the four language identification shared tasks proposed in the VarDial Evaluation Campaign 2021. In our experiments, convolutional neural networks and shallow models yielded comparable performance in the Romanian Dialect Identification (RDI) and the Dravidian Language Identification (DLI) shared tasks after the training data was augmented, while an ensemble of support vector machines and Naïve Bayes models was the best-performing model in the Uralic Language Identification (ULI) task. Although the deep learning models did not achieve state-of-the-art performance on the tasks and tended to overfit the data, the ensemble method was one of two methods that beat the existing baseline for the first track of the ULI shared task.

Authors