Creation of corpora and corpus linguistics

A corpus is generally defined as a large, structured and comprehensive collection of texts in a given language, professionally processed and stored in electronic form. Software tools known as corpus managers are used to work with corpora. A properly compiled corpus makes it easy to search for terms and to track linguistic phenomena, especially words and recurring word combinations (collocations), along with their frequency and usage. Because individual expressions can be examined in their natural context, corpora enable data-driven linguistic research on a scale that would not be possible without digital technologies.
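The three operations mentioned above (frequency counts, collocation candidates, and viewing a word in context) can be illustrated with a minimal sketch. It uses only the Python standard library; the two sample sentences, the function names and the simple regular-expression tokenizer are assumptions made for illustration, and a real corpus manager works on far larger, linguistically annotated data.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def concordance(tokens, keyword, width=2):
    """Return (left context, keyword, right context) for every hit of keyword."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits

# Hypothetical two-sentence "corpus", used only for illustration.
sample_texts = [
    "The strong tea was served with strong coffee.",
    "She prefers strong tea in the morning.",
]

freq = Counter()     # how often each word occurs
bigrams = Counter()  # adjacent word pairs, a simple stand-in for collocations
kwic = []            # keyword-in-context lines for "tea"

# Process each text separately so word pairs never span two documents.
for text in sample_texts:
    toks = tokenize(text)
    freq.update(toks)
    bigrams.update(zip(toks, toks[1:]))
    kwic.extend(concordance(toks, "tea"))

print(freq.most_common(3))
print(bigrams.most_common(1))
for left, word, right in kwic:
    print(f"{left:>20} [{word}] {right}")
```

Raw pair counts are only a rough proxy for collocations; corpus tools typically rank candidate pairs with statistical association measures rather than frequency alone.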

Language corpora are useful in the study of language itself, in the content analysis of literary and other works (such as song lyrics), and in translation, where so-called parallel corpora, which pair texts with their translations, provide a valuable basis.
