Corpus of Historical American English

The Corpus of Historical American English is a subscription dataset acquired by Purdue Libraries with limited use under their ACAD-2+ license for Professor Julia Rayz (CIT). All users who are granted access must read and agree to the Restrictions on Use PDF in the project before using any data. Data are uniquely marked for tracking purposes and may not be shared outside of the project.

The Corpus of Historical American English (COHA) is the largest structured corpus of historical English. It offers unparalleled insight into variation in the English language. COHA contains more than 400 million words of text from the 1810s to the 2000s, which makes it 50-100 times as large as comparable historical corpora of English. The corpus is balanced by genre decade by decade.

The subscription is managed by Michael Witt and Robert Freeman at Purdue Libraries. For more information, contact

The Purdue University Research Repository (PURR) is a university core research facility provided by the Purdue University Libraries, the Office of the Executive Vice President for Research and Partnerships, and Information Technology at Purdue (ITaP).