A linguistic corpus is a set of representative written or oral texts, often with linguistic annotations, used as data for certain types of linguistic analysis. These texts are usually selected by sampling and are meant to represent a particular form of a language, either within a certain time period or over time.

This collection in the UNT Digital Library contains a number of corpora (the plural of "corpus") that have been licensed for use by members of the UNT community. For more on linguistic corpora and tools for working with linguistic data, see the linguistics subject guide, which includes contact information for UNT's subject librarian for linguistics.




Cite This Collection

Here is our suggested citation. Consult an appropriate style guide to make sure it conforms to specific guidelines.

Linguistic Corpora in UNT Digital Library. University of North Texas Libraries. https://digital.library.unt.edu/explore/collections/LINGC/ accessed June 7, 2020.