Maarten Janssen: TEITOK, a web-based platform for viewing, creating, and editing corpora

In this talk I will give a general overview of TEITOK, an online system for making corpora available and searchable while also allowing them to be edited, annotated, and corrected. In TEITOK, a corpus consists of a collection of heavily annotated XML files compliant with the Text Encoding Initiative (TEI), each of which can be edited individually. The files can contain not only the corpus text but also a wide range of annotation data concerning many aspects of the text, including its relation to sound files or facsimile images. This allows for coordinate-sensitive document descriptions, time-aligned audio transcriptions, or multilayered transcriptions. I will show how this makes TEITOK a powerful tool for at least the three areas where it is most used: learner corpora, historical corpora, and spoken corpora.

Event details

Event start
12 March 2019, 13:00
Venue
Panská 7, Praha, Ústav českého národního korpusu FF UK (room 5)
Organizer
Ústav českého národního korpusu FF UK
Event type
Conferences and lectures
Attachments
Program