Questions and Answers
What is the primary purpose of concordance in corpus linguistics?
Which of the following is NOT a characteristic of a corpus?
What term describes the analysis of language data using statistical methods to identify patterns?
Which of the following best describes the qualitative approach to corpus linguistics?
What is the purpose of POS (Part-of-Speech) tagging in corpus linguistics?
What is the significance of the example 'outdoor/outdooring (V/N): the bringing ‘out of doors’ of a child after seven days.' in the lecture?
What is the relationship between quantitative and qualitative approaches in corpus linguistics?
Which of the following are popular corpus families used in corpus linguistics? (Select all that apply)
What is the main point of the provided text?
What is the significance of the word "outdooring" being used for various events such as political party launches and product launches?
What does the text suggest about the term "outdooring" in relation to other English-speaking countries besides Ghana?
What is the most significant difference between 'types' and 'tokens' in the context of language analysis?
What is the approximate type-token ratio (TTR) of the ICE GB corpus?
What is a "frequency analysis"?
How does the size of a corpus influence the type-token ratio (TTR)?
Which of the following is NOT mentioned as an example of how the word "outdooring" has been used?
Why is it important to normalize frequency data when comparing frequencies across different corpora?
What is the purpose of comparing the frequency of "outdooring" in GloWbE and NOW corpora?
What does 'per-X-word frequency' refer to?
What is the significance of the phrase "semantic shift" as used in the text?
What type of data is primarily used to analyze the term "outdooring" in the provided text?
What is the purpose of calculating 'per-million-word' (pmw) frequencies?
Which statement accurately describes the COHA corpus?
What is the approximate size of the ICE GB spoken corpus?
Which variety of English is expected to exhibit the strongest influence from American English?
What is the primary focus of lists that compare British and American English words?
In GloWbE, how can one filter to find only nominal uses of a word?
Which word represents the British English term for 'French fries'?
What is a challenge in researching terms like 'chips' and 'crisps'?
Which component of GloWbE is larger compared to JM or TZ components?
What adjective might accompany the noun 'aubergine' in English usage?
Which of the following regions is least likely to show American English influence?
What is the key difference between a 'type' and a 'token' in corpus linguistics?
When studying the impact of American English on other varieties of English, what is the term used to describe the process by which words or phrases from one variety are adopted into another?
What type of data analysis focuses on the frequency of words and their occurrences in a text?
What does the acronym 'PMW' stand for in the context of language studies?
What does the Type/Token Ratio (TTR) measure?
What is the purpose of the 'Frequency' section?
What is a 'collocation'?
What is the purpose of setting a 'minimum collocate frequency' in AntConc?
What is a key argument presented regarding the 'Americanization of English'?
Identify a factor contributing to the 'Americanization of English' based on the provided content.
What is the primary purpose of the 'Case Studies' section?
Based on the content, what is a potential reason for the increasing influence of American English?
The provided information focuses primarily on the analysis of:
Flashcards
Corpus Linguistics
The empirical analysis of authentic language data using structured collections of linguistic data (corpora).
Corpus
A collection of linguistic data compiled with a specific design, often structured and annotated.
Corpus Typology
Classification of corpora based on features like written/spoken or synchronic/diachronic.
Concordancing
AntConc
Part-of-Speech (POS) Tagging
Quantitative Approach
Qualitative Approach
Outdooring
GloWbE
Frequency Analysis
Semantic Shift
Libation
Outdoored (verb)
Meanings in Context
Americanisation
Collocation
Type/Token Ratio (TTR)
Qualitative vs Quantitative
Word frequency list
Minimum collocate frequency
Window span
Americani[sz]ation of English
Pingo
Globalisation in language
Predictable combinations
American English Forms
Canadian English
Philippine English
British vs American English Lexis
Nominal Uses
Normalized Frequencies
Plural Forms Analysis
Token
Type
Lexical Variation
Normalization
Per-million-word frequency (pmw)
Per-thousand-word frequency (ptw)
Corpus Size Effect on TTR
Study Notes
Corpus Linguistics (2)
- Corpus linguistics is the empirical analysis of authentic language data using corpora
- A corpus is a collection of linguistic data, compiled following a design, often structured and annotated
- Corpus typology can include written/spoken, synchronic/diachronic data
- Frequently used corpus families include BROWN, ICE, COCA, BNC, GloWbE, and NOW
- Concordancing is a method in corpus linguistics that displays each occurrence of a keyword in its context (KWIC)
- AntConc is frequently used software in corpus linguistics
- Part-of-speech (POS) tagging provides grammatical information in a corpus
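To make the idea of a keyword-in-context display concrete, here is a minimal Python sketch. It is not how AntConc is implemented; the function name `kwic`, the `width` parameter, and the two-sentence sample text are assumptions made for illustration only.

```python
import re

def kwic(text, keyword, width=30):
    """Return one display line per hit: left context, [keyword], right context."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group()}] {right}")
    return lines

# Toy sample text (illustrative, not a real corpus).
sample = ("The outdooring and naming ceremony starts when a family elder "
          "pours libation. The party has succeeded since its outdooring "
          "in Cape Coast.")

for line in kwic(sample, "outdooring"):
    print(line)
```

Aligning the hits in one column is what lets a reader scan the left and right contexts for recurring patterns.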
Today's Lecture
- Frequency
- Collocations
- Case Studies
1 Frequency: Quantitative and Qualitative Data Analysis
- Concordances are central to corpus linguistics
- Quantitative approaches count occurrences, compare frequencies, and use statistics to find patterns
- Qualitative approaches do not focus on how often a feature occurs, but rather aim to identify and describe language usage in context
- The data serve as a basis for describing language usage and provide real-life examples of phenomena
1 Frequency: Quantitative Data Analysis
- 'Outdooring' (V/N): the bringing 'out of doors' of a child after seven days
  - GloWbE: 72 hits (1 from Canada, 71 from Ghana)
  - NOW: 339 hits (1 each from Canada, Nigeria, South Africa and US; 336 from Ghana)
- 'Outdoored' (verb)
  - GloWbE: 64 hits (1 from Nigeria, 63 from Ghana)
  - NOW: 438 hits (1 each from Kenya and Nigeria, 6 from South Africa, 430 from Ghana)
- Good evidence that the tradition of 'outdooring' is very Ghanaian
- The 'outdooring' and naming ceremony starts when a family elder pours libation
- A child is considered human after being outdoored
- NDC (National Democratic Congress) succeeded since its outdooring in Cape Coast in 1992
- African Union (AU) replaced the Organisation of African Unity (OAU) in 2002
- Frequency analysis is among the most common methods in corpus linguistics
- Find all instances of a construction (e.g., in GloWbE)
- Find words and spelling variations
- All word forms for the lemma 'GIVE'
- Frequency of fixed expressions (e.g., merry Christmas vs. happy Christmas)
- Comparison of frequencies across varieties and over time (e.g., GloWbE: "GO on holiday" vs. "GO on vacation"; COHA: "telephone" vs. "phone")
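Two of the searches listed above, spelling variants and the word forms of a lemma, can be approximated with regular expressions. The following Python sketch is an illustration under assumptions: the helper `count_forms` and the sample sentence are hypothetical, and the list of GIVE forms is written out by hand rather than taken from a lemmatized corpus.

```python
import re
from collections import Counter

def count_forms(text, pattern):
    """Count each distinct string matched by `pattern`, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    return Counter(m.group().lower() for m in regex.finditer(text))

# Toy sample text (illustrative only).
sample = ("Americanization is debated. Some write Americanisation instead. "
          "She gives, he gave, they have given, and we are giving.")

# Spelling variation: the character class [sz] matches either spelling.
spelling = count_forms(sample, r"\bamericani[sz]ation\b")

# Word forms of the lemma GIVE, enumerated explicitly.
give_forms = count_forms(sample, r"\b(?:give|gives|gave|given|giving)\b")

print(spelling)
print(give_forms)
```

Corpus interfaces usually offer a dedicated lemma search; the explicit alternation here only stands in for that feature.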
1 Frequency: Types and Tokens
- Tokens: the total number of words/constructions (every occurrence counted) in a text, corpus, or sub-corpus
- Types: the total number of different words/constructions in a text, corpus, or sub-corpus
- ICE GB: 1,071,926 tokens and 34,421 types
- Each sample contains ~2,000 words, with complete sentences
- Tags are included in the token count
1 Frequency: Type-Token Ratio (TTR)
- TTR: a measure of diversification
- Calculated as the number of types divided by the number of tokens, multiplied by 100
- TTR for ICE GB is approximately 3.2
- TTR is strongly dependent on corpus size, usually higher in smaller corpora
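The calculation can be sketched in a few lines of Python. The tokenizer below is a deliberately naive assumption (lowercased alphabetic strings) and does not reproduce how ICE-GB counts tokens, which includes tags; the function names are illustrative. The ICE-GB figures are the ones given in the notes above.

```python
import re

def tokenize(text):
    """Naive tokenizer (an assumption): lowercase runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(n_types, n_tokens):
    """TTR as defined in the notes: types divided by tokens, times 100."""
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    return n_types / n_tokens * 100

# Toy sentence: 11 tokens, 8 types.
tokens = tokenize("The cat sat on the mat and the dog sat too")
types = set(tokens)
toy_ttr = type_token_ratio(len(types), len(tokens))

# ICE-GB figures from the notes: 34,421 types and 1,071,926 tokens.
ice_gb_ttr = type_token_ratio(34_421, 1_071_926)

print(f"toy TTR: {toy_ttr:.1f}")
print(f"ICE-GB TTR: {ice_gb_ttr:.1f}")
```

The toy sentence has a far higher TTR than ICE-GB, which illustrates the size effect: in a longer text, frequent words keep recurring while new types are added more slowly.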
1 Frequency: Comparing Frequencies
- Corpora often have different sizes
  - To compare frequencies, normalize the data
  - Calculate a per-X-word frequency
  - Common bases: per-million-word (pmw), per-thousand-word (ptw)
- Example: Compare frequency of "Scotland" in spoken vs. written sections of ICE-GB
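Normalization is simple arithmetic: divide the raw hit count by the corpus size and multiply by the chosen base. The Python sketch below uses made-up hit counts and corpus sizes (they are not ICE-GB, GloWbE, or NOW figures), and the function name `normalized_frequency` is an assumption.

```python
def normalized_frequency(hits, corpus_size, per=1_000_000):
    """Return hits per `per` words (per=1_000_000 gives pmw, per=1_000 gives ptw)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return hits / corpus_size * per

# Hypothetical figures for illustration only.
hits_a, size_a = 50, 400_000   # corpus A: fewer raw hits, smaller corpus
hits_b, size_b = 60, 600_000   # corpus B: more raw hits, larger corpus

pmw_a = normalized_frequency(hits_a, size_a)
pmw_b = normalized_frequency(hits_b, size_b)

print(f"A: {pmw_a:.1f} pmw")   # A: 125.0 pmw
print(f"B: {pmw_b:.1f} pmw")   # B: 100.0 pmw
```

In this invented case corpus B has more raw hits, yet the word is relatively more frequent in corpus A, which is why raw counts from corpora of different sizes should not be compared directly.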
Activity 2: Comparing Frequencies
- COHA search for "television" yields results by decade
2 Collocations: Introduction
- Collocations: predictable word combinations
- How to determine predictability? Native speaker intuition and learned constructions
- Examples include "fast food" vs. "slow food," "quick food" vs. "unhurried food," "quick shower" vs. "fast shower"
2 Collocations: AntConc
- Create word frequency list in the Word list tab
- Enter search term (word, construction or regular expression) in the Collocates tab
- Set the window span (e.g., 1L, 3R)
- Choose minimum collocate frequency (n)
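What the window span and minimum collocate frequency settings do can be shown with a small Python sketch. It only counts raw co-occurrences and is not AntConc's implementation; real tools typically also rank collocates with an association measure, which is omitted here. The function name `collocates`, its parameters, and the toy sentence are assumptions.

```python
from collections import Counter

def collocates(tokens, node, left=1, right=3, min_freq=2):
    """Count words within `left` tokens before and `right` tokens after each
    occurrence of `node`, keeping only collocates seen at least `min_freq` times."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        window = tokens[max(0, i - left):i] + tokens[i + 1:i + 1 + right]
        counts.update(window)
    return {word: n for word, n in counts.items() if n >= min_freq}

# Toy token list (illustrative only).
tokens = ("we had fast food then a quick shower and more fast food "
          "before another quick shower").split()

print(collocates(tokens, "fast", left=1, right=3, min_freq=2))
print(collocates(tokens, "quick", left=0, right=1, min_freq=2))
```

Raising `min_freq` filters out words that appear next to the search term only once, which in a real corpus removes much of the accidental noise.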
3 Case Studies: The Americanization of English
- Many factors contribute to American English gaining prominence (e.g., globalization, popular culture, media exposure)
- American English usage observed in other varieties
- Likely strongest in Canada and Philippines, least likely in Great Britain
3 Case Studies: The Americanization of English - Lexis
- Dozens of word lists document differences between British and American English
- Categorical lists for concepts (cf. OALD)
- Examples include "chips" vs. "fries," "crisp" vs. "chip", "aubergine" vs. "eggplant"
3 Case Studies: The Americanization of English - Lexis (GloWbE)
- Include nouns only in analysis
- Extend analysis to include plural forms, capitalized terms
- Examine normalized frequencies (GloWbE components - US/GB/others)
3 Case Studies: The Americanization of English - Past Tense
- Examine past tense forms using GloWbE
Keywords
- Americanization
- Collocation
- Construction
- Frequency
- PMW
- Qualitative
- Quantitative
- Semantic shift
- Token
- Type
- Type/token ratio (TTR)
Description
Test your understanding of key concepts in corpus linguistics with this quiz. It covers various aspects, including the purpose of concordance, analysis methods, and characteristics of a corpus. Perfect for students and enthusiasts of linguistics.