How do we know when, say, the Early Modern period of a given language ends and the Late Modern period begins? Typically, coarse-grained periodizations are based on changes in the grammatical system, whereas fine-grained ones rely on sociolinguistic or philological arguments. Instead, we propose a corpus-driven approach. Using text categorisation methods, we divide a diachronic corpus in a stepwise fashion into two subcorpora that are as different as possible (Eder & Górski 2016). This allows us to identify quantitatively different stages in language development. The underlying assumption is that effective categorisation is possible only if two requirements are satisfied: there is a true difference (be it lexical or grammatical) between older and newer texts, and each of the two subcorpora is homogeneous.
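The stepwise division described above can be illustrated with a minimal sketch. It is not the procedure of Eder & Górski (2016): the nearest-centroid classifier, the Manhattan distance, the leave-one-out evaluation, the function names, and the tiny toy corpus are all assumptions made for illustration. The idea is to try every candidate boundary year, label texts as "older" or "newer", and keep the boundary at which a classifier separates the two groups best.

```python
from collections import Counter

def profile(text, vocab):
    """Relative frequencies of the vocabulary words in one text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in vocab]

def centroid(vectors):
    """Column-wise mean of a list of equally long vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def distance(a, b):
    """Manhattan distance between two frequency profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))

def split_accuracy(docs, year):
    """Leave-one-out accuracy of a nearest-centroid classifier
    separating texts dated before `year` from the rest."""
    correct = 0
    for i, (doc_year, vec) in enumerate(docs):
        rest = docs[:i] + docs[i + 1:]
        older = [v for y, v in rest if y < year]
        newer = [v for y, v in rest if y >= year]
        if not older or not newer:
            return 0.0  # one class has no training text: split unusable
        guess_older = distance(vec, centroid(older)) < distance(vec, centroid(newer))
        correct += guess_older == (doc_year < year)
    return correct / len(docs)

def best_split(corpus, vocab):
    """Score every candidate boundary year and return the best one."""
    docs = [(year, profile(text, vocab)) for year, text in corpus]
    candidates = sorted({year for year, _ in docs})[1:]
    scores = {year: split_accuracy(docs, year) for year in candidates}
    return max(scores, key=scores.get), scores

# Invented toy data (hypothetical, not from any real corpus).
corpus = [
    (1600, "thou hast the book and thou art here"),
    (1620, "thou art kind and thou hast time"),
    (1640, "thou hast seen that thou art right"),
    (1660, "you have the book and you are here"),
    (1680, "you are kind and you have time"),
    (1700, "you have seen that you are right"),
]
vocab = ["thou", "hast", "art", "you", "have", "are"]

best, scores = best_split(corpus, vocab)
print(best, scores)
```

On this toy data the classifier is perfect only when the boundary is placed at 1660, which reflects both requirements from the abstract: the two groups differ, and each is internally homogeneous. A boundary at 1640 or 1680 makes one subcorpus mixed, so accuracy drops.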
Event details
- Event start
- 12 October 2016, 17:30
- Venue
- main building, room 104
- Organizer
- Ústav Českého národního korpusu (Institute of the Czech National Corpus)
- Event type
- Conferences and lectures
- Attachments
- Programme