COCA is probably the most widely used corpus of English, and it is related to many other corpora of English that we have created, which offer unparalleled insight into variation in English. Users can see how words, phrases, and grammatical constructions have increased or decreased in frequency, how words have changed meaning over time, and how stylistic changes have taken place in the language. The Corpus of Historical American English (COHA) is the largest structured corpus of historical English, containing more than 400 million words of American English from 1810 to 2009 in more than 100,000 individual texts. The texts include fiction and non-fiction as well as newspaper and magazine articles, with fiction making up about half of the total text.

The Corpus of Historical American English is a subscription dataset acquired by Purdue Libraries, with limited use under their ACAD-2+ license, for Professor Julia Rayz (CIT). COHA is one of the most commonly used large corpora in diachronic studies of English. With more than 400 million words of text from the 1810s to the 2000s, it is 50-100 times as large as other comparable historical corpora of English, and it is balanced by genre decade by decade. The CCOHA corpus can be downloaded via the COHA website.

Size: 385 million tokens
Annotation: tokenised
Licence: CLARIN ACA
Language: English (American)
This corpus contains texts from 1810 to 2009.

Clean Corpus of Historical American English (CCOHA)


