Corpora in Linguistics

What is the main characteristic of a general corpus?

It represents language in its broadest sense.

What is the purpose of a reference corpus?

To produce reference materials for language learning.

What is the main difference between a general corpus and a specialized corpus?

The type of texts included.

What is the purpose of a diachronic corpus?

To trace the development of a language over time.

What is the name of the corpus that includes informal registers of British English?

CANCODE

How many words does the British National Corpus (BNC) contain?

100 million words

What is the name of the corpus that includes spoken registers in a US academic setting?

MICASE

What is the main characteristic of a specialized corpus?

It is designed with more specific research goals in mind.

What is the primary purpose of learner corpora?

To identify differences between learners and native speakers.

What is the characteristic of comparable corpora?

They are designed along the same lines, with the same proportions of text types.

What is the primary purpose of parallel corpora?

To find potential equivalent expressions in each language.

What is the name of the corpus that contains 1.5 million words?

Helsinki Corpus

What is the primary purpose of regional corpora?

To represent a regional variety of a language.

What is the characteristic of the International Corpus of English (ICE)?

It is a comparable corpus of different varieties of English.

What is the size of the International Corpus of Learner English (ICLE)?

20,000 words

What can comparable corpora be used for?

To compare different varieties of the same language.

Study Notes

Corpora

  • A corpus is a large, structured set of electronically stored texts used for statistical analysis, hypothesis testing, and linguistic rule validation.
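
As a minimal sketch of the kind of statistical analysis a corpus supports, the Python snippet below counts word frequencies in a tiny invented sample. The sample sentences and the `tokenize` helper are assumptions made for illustration only; real corpora are far larger and usually rely on more careful tokenization.

```python
import re
from collections import Counter

# Toy "corpus": three invented sentences standing in for real texts.
sample_texts = [
    "The corpus contains many texts.",
    "Many linguists use the corpus.",
    "The texts are stored electronically.",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

# Accumulate token counts across every text in the sample.
word_counts = Counter()
for text in sample_texts:
    word_counts.update(tokenize(text))

# The single most frequent word and its count.
top_word, top_count = word_counts.most_common(1)[0]
print(top_word, top_count)
```

Frequency lists of this kind are a common starting point for testing a hypothesis about usage against attested data.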

Types of Corpora

  • General Corpora: Representative of language in its broadest sense, serving as a baseline for comparative studies of general linguistic features.
    • Examples: Brown Corpus (1 million words), LOB Corpus (1 million words), British National Corpus (BNC, 100 million words)
  • Specialized Corpora: Designed with specific research goals in mind, focusing on a particular type of text, register, or language variety.
    • Examples: CANCODE (Cambridge and Nottingham Corpus of Discourse in English, 5 million words, informal registers of British English), MICASE (Michigan Corpus of Academic Spoken English, about 1.8 million words, spoken registers in a US academic setting)
  • Historical or Diachronic Corpora: Texts from different time periods, aiming to represent a stage or stages of a language's development.
    • Example: Helsinki Corpus (texts dating from about 700 to 1700, 1.5 million words)
  • Regional Corpora: Representative of a regional variety of a language, such as dialects.
  • Learner Corpora: Representative of language produced by learners, including spoken and written language samples from non-native speakers.
    • Example: International Corpus of Learner English (ICLE, 20,000 words). The Louvain Corpus of Native English Essays (LOCNESS) is a comparable corpus of native-speaker essays, used as a baseline for comparison.
  • Comparable Corpora: Two or more corpora in different languages or varieties of a language, designed along the same lines.
    • Example: International Corpus of English (ICE, 1 million words each of different varieties of English)
  • Parallel Corpora: Two or more corpora in different languages, containing either texts translated from one language into the other or texts produced simultaneously in several languages.
    • Used by translators and learners to identify equivalent expressions and investigate linguistic differences.
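
When corpora of different sizes are compared, such as the 100-million-word BNC and the 1-million-word Brown Corpus listed above, raw counts are usually converted to a common scale first. The following is a minimal sketch of that normalization; the `per_million` helper is hypothetical and the hit counts are invented for illustration, not taken from either corpus.

```python
def per_million(raw_count, corpus_size):
    """Convert a raw count into a frequency per million words."""
    return raw_count * 1_000_000 / corpus_size

# Corpus sizes come from the notes above; the hit counts are invented.
corpora = {
    "BNC": {"size": 100_000_000, "hits": 5_000},
    "Brown": {"size": 1_000_000, "hits": 80},
}

# Normalized frequencies can be compared directly across corpora.
normalized = {
    name: per_million(info["hits"], info["size"])
    for name, info in corpora.items()
}

for name, freq in normalized.items():
    print(f"{name}: {freq:.1f} per million words")
```

In this invented case the word has far more raw hits in the larger corpus, yet it is relatively more frequent in the smaller one, which is why normalized figures rather than raw counts are compared.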

This quiz covers the basics of corpora in linguistics, including types of corpora and their applications.
