Detailed error annotation for morphologically rich languages: Latvian use case

Roberts Dargis (Corresponding Author), Ilze Auzina, Kristine Levane-Petrova, Inga Kaija

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper presents a detailed error annotation schema for morphologically rich languages. The described approach is used to create the Latvian Language Learner corpus (LaVA), which is part of the ongoing project "Development of Learner corpus of Latvian: methods, tools and applications". An advanced multi-token error annotation schema is not needed, because the error-annotated texts are written by beginner-level (A1 and A2) learners who use simple syntactic structures. Instead, the schema focuses on in-depth categorization of spelling and word formation errors. It is expected to work best for languages with relatively free word order and rich morphology.
Original language: English
Title of host publication: Human Language Technologies - The Baltic Perspective - Proceedings of the 9th International Conference Baltic HLT 2020
Editors: Andrius Utka, Jurgita Vaicenoniene, Jolanta Kovalevskaite, Danguole Kalinauskaite
Publisher: IOS Press BV
Pages: 241-244
ISBN (Electronic): 9781643681160
ISBN (Print): 9781643681160
DOIs
Publication status: Published - 15 Sep 2020

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 328
ISSN (Print): 0922-6389

Keywords

  • Corpus development
  • Error annotation
  • Language acquisition
  • Learner corpus

Field of Science

  • 6.2 Languages and Literature

Publication Type

  • 3.1. Articles or chapters in proceedings/scientific books indexed in Web of Science and/or Scopus database
