Questions and Answers
Which of the following is the MOST accurate description of how corpus linguistics approaches the study of language?
- It relies on theoretical constructs without empirical validation.
- It primarily focuses on establishing prescriptive rules for correct language usage.
- It prioritizes subjective interpretations of language over objective analysis.
- It analyzes language as it is naturally used, regardless of prescriptive norms. (correct)
In corpus linguistics, the continuous form of verbs like 'love' is frequently observed in various contexts.
False (B)
Explain how corpus comparison can be utilized to identify specialized terminology within particular fields.
By comparing a general corpus with a specialized corpus to identify terms that are significantly more frequent in the specialized corpus.
In corpus linguistics, the study of multi-word units is known as ______.
Match each corpus linguistics tool with its primary function:
Which of the following examples demonstrates a common deviation from prescriptive grammar that corpus linguistics would still consider?
What is the significance of collocations in corpus linguistics?
Describe how corpus linguistics can be applied in the field of lexicogrammar.
Which 'Concordance' tool search function is best for finding words with a defined ending and unlimited preceding characters?
Macros in corpus analysis are used to exclude specific documents or parts of a corpus.
Which of the following statements accurately distinguishes Sketch Engine from #LancsBox regarding corpus analysis?
What is the primary function of a POS tag in an annotated corpus?
The 'Wordlist' tool generates frequency lists of words, lemmas, nouns, verbs, and ______.
When using the 'Lemma' query type in Sketch Engine's concordance tool, the search will only return the base form of the word, excluding any inflected forms.
Match the following 'Concordance' tool functions with their descriptions:
Explain how the 'Character' query type in Sketch Engine's concordance tool differs from other query types in its handling of punctuation.
The query type in Sketch Engine's concordance tool that allows for the use of regular expressions and part-of-speech tags for complex searches is called ______.
Which action configures the 'Wordlist' tool to generate a list showing the most frequent word forms in its default setting?
Which of the following best describes the purpose of the 'Filter context' function in corpus analysis?
Match each Sketch Engine concordance query type with its corresponding functionality.
The 'Wordlist' tool can only generate frequency lists of word forms, and cannot be used for lemmas or POS tags.
A researcher wants to find all instances of the word "run" regardless of its form (e.g., "runs", "running", "ran"). Which concordance query type in Sketch Engine would be most suitable?
What is a primary limitation of using a concordance tool with very large corpora, despite its power?
The 'Simple' query type in Sketch Engine will only find the exact word form entered by the user.
Which criterion is NOT essential for a collection of texts to be considered a corpus?
Texts written specifically for the purpose of being included in a corpus, such as eliciting specific grammatical structures, are considered valid components of a corpus.
Briefly explain why authenticity is crucial to the definition of a 'corpus'.
Corpus linguistics involves a system of methods and principles for applying corpora in language studies and ______.
Match each characteristic to its corresponding term:
Which of the following statements best characterizes corpus linguistics?
Which is the LEAST representative genre for inclusion in a corpus designed to study natural language use?
Explain how the definition of a 'corpus' ensures that linguistic analyses based on corpora are ecologically valid.
Which of the following statements best describes the relationship between lexicon and grammar according to the presented perspective?
The verb 'love' is commonly used in continuous aspect, reflecting its dynamic and ongoing nature.
Briefly explain how a usage-based approach to language pedagogy integrates corpora.
Phraseological items consist of multi-word units that are often __________, combining lexical and grammatical components uniquely.
Match the benefit of integrating corpus use into the EFL classroom with its description:
What is a key characteristic of usage-based approaches to describing and teaching languages?
Explain the process by which learners move from prototypical constructions to abstraction in language acquisition, according to the principles outlined.
What is the primary implication of the statistic that the verb 'smile' is predominantly used in the past simple tense (64% of occurrences in the BNC)?
Which tool in Sketch Engine is most effective for directly comparing the collocational patterns of 'random' and 'arbitrary'?
Using authentic language material in worksheets, as opposed to invented sentences, is suggested to make language learning more meaningful.
When redesigning language learning worksheets, what is a key consideration to enhance their effectiveness, besides using authentic materials?
To determine whether 'bridge' is used more often in a literal or figurative sense, one can use the Word Sketch to examine the most frequent ______ of 'bridge'.
Match the Sketch Engine tool with its primary function for analyzing word usage:
What type of query in the Concordance tool allows you to identify verbs typically used with the plural noun 'bridges'?
The use of signal words should be emphasized when creating a worksheet.
Besides identifying literal or figurative usage, what other distinction can the Word Sketch immediately show about a word’s usage?
Flashcards
Collocations
Words that commonly appear together.
Language in Use
Corpus linguistics examines language as it is used, ignoring strict prescriptive rules.
Frequency Counts
Counting word frequencies to see which words are most common.
Concordancing
Pattern Identification
Corpus
Authenticity in Corpora
Corpus Comparison
Corpus Linguistics
Phraseology
Corpus Linguistics (expanded)
Lexicogrammar
Discourses in a Corpus
Corpus Definition
Authentic Conversations
Communicative Setting
Concordance
Sketch Engine 'Concordance' tool function
KWIC
Simple Query (Concordance)
Lemma Query (Concordance)
Phrase Query (Concordance)
Word Query (Concordance)
Character Query (Concordance)
Wildcard Use in Concordance
Subcorpus
Macro (in Corpus Analysis)
Filter Context
Text Types (in Corpus)
Tags (POS, Morphological)
Wordlist Tool
Compiling Word Frequency
Phraseological Items
Lexico-grammatical Approach
Corpus Examples: 'smile' & 'love'
Usage-Based Approaches
Integrating Corpus Use
Benefits of Authentic Language
Prototypical Construction to Abstraction
Continuum between lexicon and grammar
Word Sketch Difference
Concordance Tool
Authentic Material Adaptation
Literal vs. Figurative Usage
Word Sketch Simple Search
Part-of-Speech Distinction
Lexical Variability
Advanced Search by POS
Study Notes
Corpus Definition
- It is a large, principled collection of naturally occurring language, usually stored electronically.
- A large collection/database of machine-readable texts involving natural discourse in diverse contexts.
- Discourses can be spoken, written, computer-mediated, spontaneous, or scripted.
- It represents a variety of genres such as everyday conversations, lectures, seminars, meetings, radio/television programs, and essays.
- Texts must have been produced in a natural communicative setting.
- Texts are spoken/written for some authentic communicative purpose.
Corpus Linguistics Definition
- It's the study of language in use through corpus analysis.
- It comprises a whole system of methods and principles for applying corpora in language studies and in teaching/learning.
General/Generalized Corpora Definition
- These are very large corpora aiming to represent language as a whole.
- They often contain more than 10 million words.
- They encompass a variety of language.
- Findings from them can, to some extent, be generalized.
- Examples include:
- British National Corpus (BNC)
- American National Corpus (ANC)
- COCA
- Typical contents of such corpora:
- Written texts (newspaper/magazine articles; works of fiction/nonfiction)
- Writings from scholarly journals
- Spoken transcripts (informal conversations, government proceedings, business meetings)
Specialized Corpora Definition
- A specialized corpus represents a particular part of a language.
- Represents language of a particular subject field or dialect.
- It contains texts of a certain type.
- It aims to be representative of the language of this type.
- It can be large/small.
- Often created to answer very specific questions.
- Examples include:
- Michigan Corpus of Academic Spoken English (MICASE)
- CHILDES Corpus (only language used by children)
- Michigan Corpus of Upper-level Student Papers
- Medical corpus (language used by nurses/hospital staff)
Learner Corpora Definition
- A specialized corpus containing written texts and/or spoken transcripts of language used by students.
- Students are currently acquiring the language.
- It can be examined to identify the errors students commonly make.
Pedagogic Corpora Definition
- A corpus that contains language used in classroom settings.
- Pedagogic corpora can include academic textbooks and transcripts of classroom interactions.
- It also includes any other written text/spoken transcript that learners encounter in an educational setting.
- These can be used to:
- Ensure students are learning useful language.
- Examine teacher-student dynamics
- Act as a self-reflective tool for teacher development
Data-Driven Learning (DDL) Definition
- An approach to foreign language learning.
- Whereas most language learning is guided by teachers/textbooks, DDL treats language as data and students as researchers undertaking guided discovery tasks.
"Principled" in Corpus Compilation/Linguistics Definition
- A corpus is compiled according to principles/guidelines, which can differ from corpus to corpus.
- The texts that go into the corpus must be planned.
- The language included cannot be random.
- It must be chosen according to specific characteristics.
- Texts must be chosen so that they are useful for your research question or the aim of your corpus.
Basic Principles of Corpus Linguistics
- Context: Without context, words can be completely misinterpreted.
- Words need other words to convey meaning; an isolated word is often ambiguous.
- "Rose" can refer to the flower or to the past tense of "rise."
- Language patterns: Collocations are words that habitually co-occur.
- The patterned nature of language becomes apparent through corpus analysis.
- Without one of its parts, a collocation no longer works as a whole.
- It is not "make homework" but "do homework."
- Likewise, it is not "do a mistake" but "make a mistake."
- Tense and aspect: Some verbs occur mainly in a specific tense or aspect.
- "Love" is hardly ever found in the continuous form.
- Language in use: Corpus linguistics studies language in use and is not concerned with prescriptive "rules."
- Some utterances may, strictly speaking, be incorrect, such as sentences with missing subjects.
- Such examples must be included in the corpus as well.
- Language in use may differ from what individuals expect or have learned about English.
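The collocation principle can be illustrated with a small counting sketch. This is only a toy illustration, not how any particular corpus tool computes collocations: the sample sentence, the function name `collocates`, and the window size of two tokens are all assumptions made for the example.

```python
from collections import Counter

def collocates(tokens, node, window=2):
    """Count the words that occur within `window` tokens of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            counts.update(left + right)
    return counts

# Invented sample text, already lowercased and split on whitespace.
text = "i have to do homework today . did you do your homework ? we do homework together"
tokens = text.split()

result = collocates(tokens, "homework")
print(result.most_common(3))
```

In this toy text, "do" is the most frequent neighbour of "homework" and "make" never occurs near it, which mirrors the "do homework" pattern described above. Real tools typically add a statistical association measure on top of such raw counts.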
Common Corpus Tools
- Frequency counts (single words and multiword units)
- Concordancing
- Pattern identification (e.g., collocation search)
- Corpus comparison (comparison of different corpora or parts of different corpora/the same corpus)
- Compare language varieties (regional varieties, language register)
- Compare general and specialized corpora to identify corpus specific terminology
- Translation (multi-language corpora, e.g. Linguee)
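Comparing a general and a specialized corpus to find corpus-specific terminology can be sketched as a ratio of normalized frequencies. The two mini-corpora, the function name `keyness`, and the smoothing constant are assumptions chosen for illustration; this is one simple scoring scheme among several, not the formula of any specific tool.

```python
from collections import Counter

def keyness(specialized, general, smooth=1.0):
    """Rank words by how much more frequent they are in the specialized corpus.

    Frequencies are normalized per million tokens; `smooth` avoids division
    by zero for words that are absent from the general corpus.
    """
    spec, gen = Counter(specialized), Counter(general)
    n_spec, n_gen = len(specialized), len(general)
    scores = {}
    for word, count in spec.items():
        spec_pm = count / n_spec * 1_000_000
        gen_pm = gen[word] / n_gen * 1_000_000
        scores[word] = (spec_pm + smooth) / (gen_pm + smooth)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Invented toy corpora.
general = "the nurse said the weather is nice and the dog is happy".split()
specialized = "the nurse checked the catheter and the nurse replaced the catheter".split()

ranked = keyness(specialized, general)
print(ranked[:3])
```

Words such as "the" score close to 1 because they are similarly frequent in both corpora, while "catheter" rises to the top because it appears only in the specialized texts.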
Research Fields and Applications for Corpus Linguistics
- More theoretical:
- Phraseology (study of multi-word units)
- Lexicogrammar: connection between lexical and grammatical aspects
- Register, language change
- More practical:
- (Foreign) language teaching
- Lexicography
- Translation (studies) (not important for the exam)
- Language for specific purposes
- Writing assistance/language reference
Authentic and Naturally Occurring Language
- Every utterance (be it spoken, written or transcribed) that has been produced for communication and not for the purpose of being put into a corpus
- Language that has been produced in natural communicative settings.
- Authenticity also implies that researchers want language material produced by native speakers
- Not all corpora have to meet this criterion.
- Some corpora look at the language of English learners.
Corpus Access
- Download corpus + download corpus software
- Download corpus + access corpus software online
- Access corpus + corpus software online
Corpus Compilation
- Collect texts in accordance with purpose of corpus/research interests
- Use texts from books, newspapers, journals, transcriptions, texts written by language learners, etc.
- Use texts from the web.
Useful Corpora for Language Teaching and Learning
- Use online corpus tools like the Sketch Engine or corpus tools you can download to your computer
- General corpora for working on lexical and grammar skills include:
- BNC
- enTenTen
- English Corpus for SkELL
- Open American National Corpus
- Specialized corpora include:
- For working on current topics or for bilingual subject teaching
- Brexit corpus
- ScienceBlogs
- Environment corpus
- e-flux
- EcoLexicon English Corpus
- Literary corpora:
- For working on literary skills
- Project Gutenberg English
- English Drama Corpus
- Shakespeare English Drama Corpus
Sketch Engine and LancsBox Differences
- Sketch Engine:
- Has online accessibility (servers in CZ)
- Around 700 pre-loaded corpora
- Supports more than 100 languages
- Has free access (no word limits for own corpora)
- Can build corpora from own files and web sources (websites, URLs, and seed words)
- Usability only requires basic knowledge of corpus usage
- LancsBox:
- Requires local storage on the device
- Contains 7 pre-loaded corpora
- Only supports 2 languages
- Costs around 100€ per year (including up to 1 million words for own corpora)
- Can build corpora from own files and web sources (websites and URLs but NO seed words)
- Making full use of the search options requires sound knowledge of corpus linguistics (and statistics)
Sketch Engine "Concordance" Tool
- A concordance is a collection of the occurrences of a word form, each shown in its own textual environment.
- It provides a word-form index and a reference to the place of each occurrence in the text.
- The tool helps in finding words, phrases, tags, documents, text types, or corpus structures.
- It displays the results in context in the form of a concordance.
- The concordance can be sorted, filtered, counted, and processed further to obtain the desired result.
- It is one of the most powerful tools.
- With large corpora, the results can be tedious to analyze and interpret.
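What a concordance does can be sketched in a few lines: find every occurrence of a word and show it with its left and right context. The function name `kwic`, the context width, and the sample sentence are assumptions for illustration only; Sketch Engine's own implementation is not shown here.

```python
def kwic(tokens, keyword, width=3):
    """Return (left context, node, right context) for each hit of `keyword`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

# Invented sentence reusing the ambiguous word "rose".
tokens = "She rose early and picked a rose from the garden".split()

lines = kwic(tokens, "rose")
for left, node, right in lines:
    # Right-align the left context so the node words line up in one column.
    print(f"{left:>20}  {node}  {right}")
```

Aligning the hits in one column is what makes the two senses of "rose" easy to tell apart: one line continues with "early", the other is preceded by "picked a".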
"Concordance" Tool Query Types
- KWIC = keyword in context (the format of concordance lines)
- Simple: Automatically searches for all forms of a word, based on its base form
- Lemma: Finds all word forms of the lemma (base form)
- Phrase: Finds a phrase composed of several tokens (words) exactly as typed; no other word forms included
- Word: Finds a word form exactly as typed
- Character: Finds tokens (words) that contain the given character(s)
- Also looks for actual punctuation
- CQL: Uses Corpus Query Language for complex criteria
- Makes use of part-of-speech tags and regular expressions
- Regular expressions: a collection of special symbols used to search for patterns instead of specific characters
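The difference between the query types can be mimicked on a tiny hand-annotated token list. Everything here is a hypothetical stand-in: the tokens, the hand-assigned lemmas, and the four helper functions only imitate the behaviour described above and are not Sketch Engine's API or query syntax.

```python
import re

# Hypothetical mini-corpus: each token is (word form, lemma).
# The lemmas are assigned by hand; a real corpus gets them from a lemmatizer.
TOKENS = [
    ("She", "she"), ("runs", "run"), ("daily", "daily"), (".", "."),
    ("He", "he"), ("ran", "run"), ("yesterday", "yesterday"), (".", "."),
    ("Running", "run"), ("is", "be"), ("fun", "fun"), (".", "."),
]

def word_query(tokens, form):
    """Word-style query: match the word form exactly as typed."""
    return [w for w, _ in tokens if w == form]

def lemma_query(tokens, lemma):
    """Lemma-style query: match every word form that shares the lemma."""
    return [w for w, lem in tokens if lem == lemma]

def character_query(tokens, chars):
    """Character-style query: match tokens containing the character(s)."""
    return [w for w, _ in tokens if chars in w]

def pattern_query(tokens, pattern):
    """Regular-expression query: match whole word forms against a pattern."""
    rx = re.compile(pattern)
    return [w for w, _ in tokens if rx.fullmatch(w)]

print(word_query(TOKENS, "run"))        # no token has exactly this form
print(lemma_query(TOKENS, "run"))       # all inflected forms of "run"
print(character_query(TOKENS, "."))     # punctuation tokens
print(pattern_query(TOKENS, r".*ing"))  # any characters followed by "ing"
```

The pattern `.*ing` shows the idea behind searching for a defined ending with any number of preceding characters.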
"Concordance" Additional Search Functions
- Subcorpus: Limits search to certain parts of the divided corpus
- genre legal/news/blog/fiction
- topic science/religion/sports/politics/tourism
- domain .uk/.us/.au
- Macro: Saves a set of search criteria so that the same search can be repeated
- Filter context: Keeps lines fulfilling additional conditions
- Text types: Helps exclude or include specific documents or parts of corpus
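The two ideas of restricting a search to part of a corpus and then keeping only lines that meet an extra condition can be sketched as follows. The documents, their `genre` labels, and the helper names `subcorpus`, `concordance`, and `filter_context` are invented for this example and only approximate what the tool's functions do.

```python
# Hypothetical documents with text-type metadata.
DOCS = [
    {"genre": "news", "text": "The minister will build a bridge over the river"},
    {"genre": "blog", "text": "We need to build a bridge between the two communities"},
    {"genre": "news", "text": "Talks aim to bridge the gap between both parties"},
]

def subcorpus(docs, **criteria):
    """Keep only documents whose metadata match all given criteria."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]

def concordance(docs, keyword, width=3):
    """Collect concordance lines (left, node, right) for `keyword`."""
    lines = []
    for doc in docs:
        tokens = doc["text"].split()
        for i, tok in enumerate(tokens):
            if tok.lower() == keyword:
                lines.append({
                    "left": tokens[max(0, i - width):i],
                    "node": tok,
                    "right": tokens[i + 1:i + 1 + width],
                })
    return lines

def filter_context(lines, word, keep=True):
    """Keep (or drop) lines whose context contains `word`."""
    def has_word(line):
        return word in (t.lower() for t in line["left"] + line["right"])
    return [line for line in lines if has_word(line) == keep]

news_lines = concordance(subcorpus(DOCS, genre="news"), "bridge")
figurative = filter_context(news_lines, "gap")
print(len(news_lines), len(figurative))
```

Here the subcorpus step leaves two news hits for "bridge", and the context filter then isolates the one line in which "gap" appears nearby.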
Tags
- A tag is also called a part-of-speech tag, POS tag, or morphological tag.
- It is assigned to each token in an annotated corpus.
- Indicates the part of speech and often also grammatical categories and morphological information.
Sketch Engine “Wordlist” Tool
- Used to generate frequency lists of all kinds:
- Lists of words
- Lemmas
- Nouns, verbs, tags
- Words containing/not containing certain characters
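The kinds of frequency lists named above can be sketched over a small annotated token list. The tokens, the simplified tag labels (`DET`, `NOUN`, `VERB`, `CONJ`), and the function `wordlist` with its parameters are assumptions for illustration; they are not the tagset or interface of the actual tool.

```python
from collections import Counter

# Hypothetical annotated tokens: (word form, lemma, tag).
TOKENS = [
    ("The", "the", "DET"), ("dogs", "dog", "NOUN"), ("were", "be", "VERB"),
    ("barking", "bark", "VERB"), ("and", "and", "CONJ"), ("the", "the", "DET"),
    ("dog", "dog", "NOUN"), ("barked", "bark", "VERB"),
]

def wordlist(tokens, attribute="word", tag=None, containing=None):
    """Build a frequency list of word forms, lemmas, or tags.

    `tag` restricts the list to tokens with that tag; `containing`
    restricts it to word forms that contain the given characters.
    """
    index = {"word": 0, "lemma": 1, "tag": 2}[attribute]
    counts = Counter()
    for tok in tokens:
        if tag is not None and tok[2] != tag:
            continue
        if containing is not None and containing not in tok[0]:
            continue
        item = tok[index]
        # Lowercase words and lemmas so "The" and "the" are counted together.
        counts[item if attribute == "tag" else item.lower()] += 1
    return counts.most_common()

print(wordlist(TOKENS))                       # word forms
print(wordlist(TOKENS, "lemma"))              # lemmas
print(wordlist(TOKENS, "lemma", tag="NOUN"))  # nouns only
print(wordlist(TOKENS, "tag"))                # tags
print(wordlist(TOKENS, containing="ing"))     # words containing "ing"
```

Counting by lemma merges "dog" and "dogs" into one entry, which is why a lemma list is usually shorter than a word-form list for the same text.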
Description
Explore corpus linguistics: methodologies, verb forms, corpus comparison, and multi-word units. Learn about tool functions, deviations from grammar, collocations, and applications in lexicogrammar. Also, study concordance tool search functions and the use of macros in corpus analysis.