Maarten Janssen: TEITOK, a web-based platform for viewing, creating, and editing corpora

In this talk I will give a general overview of TEITOK, an online system for making corpora available and searchable while also allowing them to be edited, annotated, and corrected. In TEITOK, a corpus consists of a collection of heavily annotated, Text Encoding Initiative (TEI) compliant XML files, each of which can be edited individually. The files can contain not only the corpus text but also a wide range of annotation data concerning many aspects of the text, including its relation to sound files or facsimile images. This allows for coordinate-sensitive document descriptions, time-aligned audio transcriptions, or multilayered transcriptions. I will show how this makes TEITOK a powerful tool for at least the three areas where it is most used: learner corpora, historical corpora, and spoken corpora.

Event details

Start: 12 March 2019, 13:00
Venue: Panská 7, Praha, Ústav českého národního korpusu FF UK (room 5)
Organizer: Ústav českého národního korpusu FF UK
Event type: Conferences and lectures
Attachments: Program
