Questions and Answers
In computational linguistics, what primary role do pseudowords play in research?
- They enable controlled experiments by removing prior meaning associations. (correct)
- They serve as replacements for outdated vocabulary.
- They are used to test the processing speed of native speakers.
- They help in understanding the historical evolution of languages.
What is the main goal when using the 'Wuggy' algorithm to create pseudowords?
- To ensure the created pseudowords follow the phonotactic constraints of a given language. (correct)
- To create words that are as different as possible from real words.
- To generate words that have clear emotional valence.
- To produce words that are universally pronounceable across all languages.
Why is it important for computational models to account for the emotional valence of words?
- To optimize search engine rankings based on sentiment.
- To better predict stock market fluctuations.
- To simulate human-like understanding and processing of language nuances. (correct)
- To improve the accuracy of machine translation between languages.
What is the primary function of 'edit distance' in computational linguistics?
How does the 'Systematicity Hypothesis' explain the relationship between word forms and their meanings?
In the context of word embeddings, what is the significance of representing words as vectors?
What key advantage does the FastText model offer over Word2Vec when handling language data?
When evaluating word embeddings, what does 'intrinsic evaluation' primarily assess?
According to the experiment by Gatti et al. (2024), what can be inferred about how humans process the valence of novel vs. real words?
What is the main purpose of using Pointwise Mutual Information (PMI) when improving word representations?
When discussing N-gram language models, what is a major limitation related to 'data sparsity'?
What is the purpose of using an <UNK> token in handling unseen words in language modeling?
What key advantage do neural language models offer compared to N-gram models?
What is the main function of the 'self-attention' mechanism in transformer models?
What problem do contextualized word embeddings solve that fixed word embeddings do not?
Why is positional encoding necessary in transformer models?
What is a key motivation behind using subword tokenization in modern NLP models?
What is a key difference in word processing between humans and transformer models?
What aspect of language does the Jaccard distance primarily measure?
Which of the following describes 'lexicon' as defined in the text?
What does the distributional hypothesis propose about word meaning?
Why is 'valence' considered an important feature in computational linguistics?
What is a drawback of using trigram encoding in language models?
What is the purpose of text preprocessing in NLP before training a model?
How do language models use probabilities?
Flashcards
Pseudo Words
Made-up, word-like strings that follow a language's sound patterns but have no established meaning; through repeated use they can gain understanding and acceptance, effectively becoming real words.
Valence
A measure of how positive or negative the meaning conveyed by a word is.
Edit Distance
The smallest number of edits (insert, delete, replace) needed to transform one string into another.
Lexicon
A resource connecting words and expressions.
Distributional Hypothesis
The idea that a word's meaning is determined by its context: words that occur in similar contexts have similar meanings.
Jaccard's Distance
A measure of (dis)similarity between two sets, such as two words' contexts, based on the ratio of their intersection to their union.
Word Embeddings
Continuous numerical representations of words as points in an n-dimensional space, where similar words sit close together.
Word2Vec
A model that learns word vectors by predicting co-occurrences, either via CBOW (predict a word from its context) or Skip-Gram (predict the context from a word).
FastText
An extension of Word2Vec from Facebook AI Research that represents words as character n-grams (subwords), improving handling of rare and unseen words.
Why use pseudowords?
They enable controlled experiments by removing prior meaning associations.
"Wuggy" Algorithm
A pseudoword generator that ensures the created items follow the phonotactic constraints of a given language.
Bigram Encoding
Representing a word by its pairs of adjacent letters.
Trigram Encoding
Representing a word by its three-letter sequences; more context, but more sparsity.
Form-Meaning Mapping
The relationship between a word's form (spelling/sound) and its meaning.
Systematicity Hypothesis
The view that word form predicts meaning, making words easier to learn but more prone to confusion.
Arbitrariness Hypothesis
The view that word form is unrelated to meaning, making new words harder to learn but less prone to confusion.
What do language models do?
Assign probabilities to word sequences and predict the next word.
N-Gram Language Models
Models that predict a word from the previous n-1 words.
LSTM & GRU
Recurrent neural network architectures that retain longer-range context than n-gram models.
Positional Encoding
Information added to transformer inputs so the model knows the order of the words in a sequence.
Subword Tokenization
Splitting words into smaller units so that rare and unseen words can still be represented.
Contextualized word embeddings
Word vectors that depend on the surrounding sentence, so the same word gets different representations in different contexts.
Self-Attention
The transformer mechanism that lets each word weigh its relationship to every other word in the sequence.
Issue with Word2Vec & FastText
They assign a single fixed vector per word, regardless of the context it appears in.
Study Notes
Exams
- Group assignment answers count toward the grade, which incentivizes recognizing correct answers, a job-relevant skill
- Only 1 in 10 points is outcome-based
- Midterm: closed questions, 1 hour, open book (the book is meant to be relied on only when truly stuck)
- Final: mix of closed and open questions, open book, 2.5 hours
Lecture 1
- Pseudowords become words through enough usage and understanding
- Pseudowords possess semantics and some level of meaning
- A pseudoword is plausible if 70% of people agree on its meaning or emotional sentiment
- Plausibility of pseudowords varies by country
- Native English speakers struggle to recognize words with 3+ consonants in a row
- Valence indicates how positive or negative a word is
- Valence is computed from crowd-sourced human ratings collected with rating scales (sliders or Likert scales), alongside arousal and dominance
- Edit distance determines a word's neighbors based on the number of edits between strings
- Two strings are similar when few edits (insert, delete, replace) are needed to turn one into the other
- Form-meaning mapping is easier to learn when meaning follows form, but it increases the risk of confusing similar words; conversely, when form does not predict meaning, the mapping is arbitrary: harder to learn but less prone to confusion
- Terms that share meaning tend to have distinct forms but a higher likelihood of co-occurring
Lexicon
- A lexicon serves as a resource connecting words and expressions
Lecture 2
- The distributional hypothesis states that word meaning is determined by its context or "the company it keeps"
- Word similarity increases when their context overlaps
- The Jaccard index measures set overlap on a scale from 0 (no overlap) to 1 (complete overlap); Jaccard distance is 1 minus this value
- Formula: jacc(x, y) = |x ∩ y| / |x ∪ y|
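As a minimal sketch, the overlap computation can be written directly over Python sets; the two context sets below are invented purely for illustration:

```python
def jaccard_similarity(x: set, y: set) -> float:
    """Jaccard index: size of the intersection divided by size of the union."""
    if not x and not y:
        return 1.0  # convention: two empty sets overlap completely
    return len(x & y) / len(x | y)

# Hypothetical context sets for two words (illustration only).
contexts_cat = {"my", "is", "sleeping", "fur", "pet"}
contexts_dog = {"my", "is", "sleeping", "bark", "pet"}

sim = jaccard_similarity(contexts_cat, contexts_dog)
print(sim, 1 - sim)  # ~0.67 similarity, ~0.33 Jaccard distance
```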
Embeddings
- Continuous numerical representations of words are embedded in an n-dimensional space
- Embedding turns words into numbers, assigning them meaningful positions in space and clustering similar words together
- Word2Vec is a tool that helps computers understand words via their meanings, not just their letters
FastText Model
- An enhanced version of Word2Vec from Facebook AI Research (FAIR) that uses subwords to better handle rare words, misspellings, and word forms
- Treats words as pieces
- For example, "playing" is split into ["pla", "lay", "ayi", "yin", "ing"]
- It helps recognize similar words like "play" and "playing"
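A small sketch of this subword splitting idea (without the word-boundary markers that the real FastText model also adds):

```python
def char_ngrams(word: str, n: int = 3) -> list[str]:
    """Slide a window of size n over the word's letters."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]

print(char_ngrams("playing"))  # ['pla', 'lay', 'ayi', 'yin', 'ing']
print(char_ngrams("play"))     # ['pla', 'lay'] -- shares pieces with "playing"
```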
ChatGPT Summary: Lecture 1
- Title: That word sounds nice – Gauging semantic connotations for entirely novel words
- Authors: Giovanni Cassani & Afra Alishahi
- Topic: How humans interpret novel words (pseudowords) and whether they assign meaning based on existing linguistic patterns.
Introduction: Learning From Novel Words
- All words were once meaningless pseudowords
- People generalize meanings to new words based on form (spelling/sound)
Key question
- Do people recognize positive or negative pseudowords?
Example experiment
- Participants receive unfamiliar pseudowords and are asked, "Which word feels more positive?"
- The goal is to see how emotional valence is applied to meaningless words
Why use pseudowords?
- Pseudowords enable controlled experiments because they carry no prior meaning associations
- Employed in lexical decision, priming, and sentence processing tasks
- Some examples of pseudowords are Keex, Plufgok, Bixmel
Pseudoword creation
- A pseudoword must "sound real"
- "Wuggy" Algorithm flips real word letters
- Ensures phonotactic constraints are followed
Phonotactic constraints in English
- Valid: "Stray" – the onset cluster "str" is allowed in English
- Invalid: "Spfovik" – too many consonants together; "spf" is not a permissible English cluster
- "Trst" is a valid word in Croatian, but not in English
Valence: Emotional Meaning in Words
- How is it determined whether a word “feels” positive or negative?
Defining Valence
- A core aspect of meaning
- Evolutionarily important for survival
- Affects processing, learning, and memory of words
How to Compute Valence?
- Human ratings are determined via surveys using the Likert scale.
- Crowdsourcing Studies collect valence ratings for thousands of words.
How Computers Encode Words
- How can words be converted into numbers for analysis?
Basic Method: Letter Counting
- Each word becomes a 26-dimensional vector (one value per letter).
- The issue is that this is too simple: it ignores letter order and context
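A minimal sketch of this letter-count encoding in Python:

```python
from collections import Counter
import string

def letter_count_vector(word: str) -> list[int]:
    """Encode a word as a 26-dimensional vector of letter counts."""
    counts = Counter(word.lower())
    return [counts.get(letter, 0) for letter in string.ascii_lowercase]

# "banana": 3 x 'a', 1 x 'b', 2 x 'n'; every other dimension stays 0.
print(letter_count_vector("banana")[:5])  # [3, 1, 0, 0, 0]
```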
Using N-Grams for More Context
- Bigram encoding looks at letter pairs
- Trigram Encoding identifies three letter sequences
- The trade-off is that more context means more sparsity
Similarity Between Words: Nearest Neighbors
- How is it decided if two words “look alike”?
- Edit distance (Levenshtein distance) finds a word's nearest neighbors, e.g., which candidate is most similar to "minced"
Edit Distance (Levenshtein Distance)
- Counts changes (insert, delete, replace) needed to turn one word into another
- Lower edit distance equals more similarity
- Normalization Issues: Short words naturally have lower edit distance
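A compact dynamic-programming sketch of Levenshtein distance; the word pairs below are just examples:

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and replacements turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # replace (or keep) ca
        prev = curr
    return prev[-1]

print(levenshtein("minced", "minted"))  # 1: replace 'c' with 't'
print(levenshtein("cat", "cast"))       # 1: insert 's'
# A common normalization divides by the length of the longer word.
print(levenshtein("cat", "cast") / max(len("cat"), len("cast")))  # 0.25
```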
Add-ons to Edit Distance
- Weighted costs can make some edits cheaper than others (e.g., replacing one vowel with another)
- Transposition recognizes swapping two letters as a minor error
- External Features factor in keyboard layout and typing frequency
Form-Meaning Mapping in Language
- How do word forms (spellings) reflect their meanings?
Two Theories
- Systematicity Hypothesis (Dante's view): word form predicts meaning, so learning is easier but there is a risk of confusion
- Arbitrariness Hypothesis (Shakespeare's view): word form is unrelated to meaning, so there is less confusion but new words are harder to learn
- Language balances both principles: common words tend to be systematic and rare words arbitrary
Applying These Ideas to Pseudowords
- Determining whether people can guess the emotional valence of a new word just from its letters.
- Gatti et al. (2024) collected valence ratings, trained a model, and tested it on pseudoword valence
Findings
- Real Words: letter n-grams predict valence poorly (r² = 0.01)
- Novel Words: letter n-grams predict valence better (r² = 0.18)
Conclusion
- Despite the form-meaning mapping being arbitrary, people generalize valence from known to novel words!
Final Thoughts & Open Questions
- Humans continuously learn new words
- Meaning is instinctively assigned to unfamiliar words
- Whether computational models learn in a similar way remains an open question
Future Research Questions
- How do people generalize emotional meaning from form?
- What patterns in language influence this process?
- How can AI models be improved to handle new words better?
Key Takeaways
- Humans subconsciously assign emotional meaning to unfamiliar words.
- Pseudowords remove pre-existing meaning, helping researchers understand language learning
- A computational model can be trained to predict word valence
Lecture 2
- Title: This word might co-occur with nice words – A distributional approach to novel words
- Authors: Giovanni Cassani & Afra Alishahi
- Topic: Understanding how word meaning emerges from co-occurrence patterns in language, using distributional semantics and word embeddings.
Introduction: Words in Context
- In Class 1, words were treated as isolated strings
- In reality, words co-occur with other words
Key Ideas from Linguistics
- Meaning as Use (Wittgenstein, 1953): a word's meaning comes from how it is used, rather than from a fixed definition
- Distributional Hypothesis (Harris, 1954): "Words that occur in similar contexts have similar meanings."
- Firth's Principle (Firth, 1957): "You shall know a word by the company it keeps."
Example
- Cat and dog often appear in similar sentences (e.g., "My __ is sleeping").
- The prior statement suggests they have related meanings, even if not defining them explicitly
Measuring Word Similarity: Jaccard Distance
- How much do two words share the same context?
Limitation
- Jaccard distance ignores frequency
Word Embeddings: Representing Words as Vectors
- Turn words into numbers (vectors) to analyze relationships, moving away from treating them as isolated symbols
What are embeddings?
- Continuous, numerical representations of words as points embedded in an n-dimensional space
- Words as points in a multi-dimensional space
- Similar words have closer vectors
- The space itself depends on how words are used in a language
Example of Context-Based Word Representations
- First, define a target word (e.g., "car")
- Then define a context window of a fixed size around it and count which words appear inside the window
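A sketch of that counting step, using an invented mini-corpus and a window of two words on each side:

```python
from collections import Counter

def context_counts(tokens: list[str], target: str, window: int = 2) -> Counter:
    """Count words within `window` positions of each occurrence of `target`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts

corpus = "the red car stopped because the old car broke down".split()
print(context_counts(corpus, "car"))  # 'the' co-occurs twice, the other neighbours once
```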
Improving Word Representations: Pointwise Mutual Information (PMI)
- A better measure is needed of how significant a given co-occurrence is
Why PMI is Useful
- Words that co-occur more than expected are given a higher weight
- Reduces interference from random co-occurrences
Example
- "Doctor" and "Hospital" shows a High PMI with strong semantic link
- "Doctor" and "Random" shows a Low PMI unlikely to co-occur
Reducing Dimensionality
- Raw co-occurrence matrices are too large (often 100,000+ dimensions), so the data is reduced while keeping the important information
- Singular Value Decomposition (SVD) captures the main patterns in the data
- t-SNE and PCA project high-dimensional data into 2D or 3D for visualization
Goal
- Making embeddings smaller, denser, and more informative
Predicting Word Meaning: Word2Vec
- Train the model to predict word occurrences
Word2Vec Model
- Two approaches: Continuous Bag of Words (CBOW), which predicts a word from its surrounding words, and Skip-Gram with Negative Sampling (SGNS), which predicts surrounding words from a given word
Training Process
- Start with random word vectors
- Adjust the vectors so that actual co-occurrences are predicted correctly
- The resulting embeddings end up capturing word meaning
Example Results
- King - Man + Woman ≈ Queen
- Paris - France + Italy ≈ Rome
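A minimal training sketch using the gensim library (assuming it is installed); the toy corpus is invented, and analogies like king − man + woman ≈ queen only emerge reliably when training on millions of sentences:

```python
from gensim.models import Word2Vec

# Tiny invented corpus; real embeddings need far more text.
sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "man", "walks", "in", "the", "city"],
    ["the", "woman", "walks", "in", "the", "city"],
]

# sg=1 selects Skip-Gram (with negative sampling); sg=0 would select CBOW.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

# Vector arithmetic: king - man + woman, expected to land near "queen" on large corpora.
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```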
Handling Pseudowords: FastText Instead of Word2Vec
- How are embeddings created for words that don't appear in the training data?
FastText Model (Facebook AI Research)
- Key improvement: uses character-level n-grams instead of whole words
- For example, "windowist" is composed of n-grams like: win, ind, ndo, dow, ist
- The meaning of "windowist" can be inferred from similar subwords even though "windowist" was never seen before
Why is FastText Useful?
- Handles rare words better
- Works well for languages with complex morphology
- Captures subword patterns
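A minimal gensim sketch (assuming gensim is installed) showing how FastText can build a vector for the unseen word "windowist" from its character n-grams; the corpus is invented:

```python
from gensim.models import FastText

sentences = [
    ["she", "looked", "through", "the", "window"],
    ["the", "window", "was", "open"],
    ["he", "is", "a", "violinist"],
]

# min_n / max_n control the character n-gram lengths used for subword vectors.
model = FastText(sentences, vector_size=50, window=2, min_count=1, min_n=3, max_n=5, epochs=20)

# "windowist" never occurs in the corpus, but a vector is assembled from its n-grams.
print(model.wv["windowist"].shape)                 # (50,)
print(model.wv.similarity("windowist", "window"))  # nonzero, driven by shared subwords
```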
Evaluating Embeddings: How Do We Know They Work?
- Embeddings are tested by checking how well they match human intuition
Intrinsic Evaluation
- Includes word similarity, Analogy, and Clustering of similar words
- Word similarity tasks (e.g., WordSim-353, SimLex-999) are assessed.
- Analogy tasks (e.g., King - Man + Woman = Queen) are tested
- Finally, clustering similar words in a 2D space (e.g., via PCA or t-SNE).
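A sketch of the word-similarity flavour of intrinsic evaluation, comparing model similarities to human ratings with a rank correlation; the embeddings and ratings below are invented stand-ins for datasets like SimLex-999 (assumes numpy and scipy):

```python
import numpy as np
from scipy.stats import spearmanr

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical embeddings and human similarity ratings (illustration only).
rng = np.random.default_rng(0)
emb = {w: rng.random(50) for w in ["cat", "dog", "car", "banana"]}
pairs = [("cat", "dog", 8.5), ("cat", "car", 2.0), ("dog", "banana", 1.0)]

model_scores = [cosine(emb[a], emb[b]) for a, b, _ in pairs]
human_scores = [rating for _, _, rating in pairs]

rho, _ = spearmanr(model_scores, human_scores)  # rank correlation with human judgements
print(f"Spearman correlation: {rho:.2f}")
```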
Extrinsic Evaluation
- Utilizes embeddings in real tasks (e.g., sentiment analysis, machine translation).
- If embeddings improve task performance, they are considered useful
Conclusion: What About Pseudoword Valence?
- Do people judge pseudowords based on co-occurrence patterns or letter structure?
Findings
- For real words, FastText embeddings capture valence well (r² = 0.62).
- Letter n-grams work for pseudowords (r² = 0.12).
Takeaway
- Humans process known words differently from pseudowords
- When co-occurrence information is missing, meaning is inferred from form
Final Thoughts
- Words get their meanings from context and co-occurrence patterns, aided by embeddings
- Models can handle unknown words, but humans still rely on surface-level features
Lecture 3
- Title: A Replication – What Matters When Building on Another Study
- Authors: Giovanni Cassani & Afra Alishahi
- Topic: How to replicate, evaluate, and extend a study, focusing on valence prediction with different models
Introduction: Why Replication Matters
- A study's findings need independent confirmation to hold up
- Key steps in replication are reproducing existing results, modifying variables, and evaluating the findings
- Example: replicating Gatti et al. (2024), which tested how pseudowords encode emotional valence, using the Warriner et al. (2014) dataset
Replication Pipeline
- The experiment is carried out with a structured pipeline that predicts valence for pseudowords
- Steps: extract unigram (letter) vectors, train a linear regression model on rated real words, apply the model to pseudowords, and check whether the predicted valence aligns with human judgements
- Finally, compute r² to measure how well the model fits
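A sketch of that pipeline with scikit-learn, assuming letter-count (unigram) features; the word lists and valence ratings are invented stand-ins for the Warriner et al. (2014) lexicon:

```python
import string
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

def unigram_vector(word: str) -> np.ndarray:
    """26-dimensional letter-count (unigram) feature vector."""
    vec = np.zeros(26)
    for ch in word.lower():
        if ch in string.ascii_lowercase:
            vec[string.ascii_lowercase.index(ch)] += 1
    return vec

# Toy stand-in for a rated valence lexicon (ratings invented for illustration).
train = {"love": 8.7, "happy": 8.5, "war": 2.1, "grief": 2.0, "table": 5.2}
model = LinearRegression().fit(np.stack([unigram_vector(w) for w in train]),
                               np.array(list(train.values())))

# Model fit on held-out rated words (again invented), measured with r^2.
test = {"joy": 8.2, "pain": 2.4}
preds = model.predict(np.stack([unigram_vector(w) for w in test]))
print("r^2:", r2_score(list(test.values()), preds))

# Finally, apply the trained model to pseudowords, which have no prior ratings.
for pw in ["gorpeous", "tutoured"]:
    print(pw, float(model.predict(unigram_vector(pw).reshape(1, -1))[0]))
```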
Extending the Study: Testing Different Models & Stimuli
- Not all pseudowords behave the same way
- Testing Different Models by Comparing word2vec, FastText, and LLMs
- Pseudoword selection has to be carried out carefully, e.g., deciding whether stimuli like "Combatman" (built from recognizable parts) should be used
- Humans may categorize a pseudoword as a real word if it looks realistic enough
Pseudowords in Lexical Decision Tasks
- Lexical decision tasks test whether people can tell pseudowords apart from actual words
- Shown a mix of words and pseudowords, participants have to decide which are real words
- Stimuli include real words (e.g., MINE) and pseudowords (e.g., QWEFQK)
Notes
- This confirms that pseudowords too similar to a real word get mistaken for one
- Complicates form-meaning mapping studies
Pseudoword Valence Ratings
- The study examines whether people can judge the valence of pseudowords
- Participants rate how positive or negative a pseudoword feels
- For example, "Gorpeous" sounds positive while "Tutoured" sounds negative
Methodology
- Some pseudowords sound similar to existing words
- Others evoke little recognizable meaning
Morphology
- Words made from meaningful parts are easier to understand, for example "happiness" from the word "happy"
- If pseudowords are built using recognizable parts, they may be easier to interpret
Challenges
- Lack of meaning leads to regression to the mean
- Ratings cluster around the average because pseudowords lack strong associations
- Focusing on pseudowords with extreme predicted valence is one solution
Correlation Methods for Evaluation
- Statistical methods determine how well model predictions correlate with human ratings
- Different models emphasize different aspects, which can make some tests, like Pearson's r, less than ideal for pseudowords
Testing New Models
- Do new NLP models capture pseudoword valence?
- FastText was tried previously; LLMs are explored next
- LLMs are trained on massive datasets using transformer architectures and model meaning over large context windows
Challenges
- LLM embeddings require regularization
- Training on multiple languages could reveal universal patterns
Text Preprocessing
- Different NLP models require different text preprocessing methods
- Text processing is carried out using 3 steps
Key Concepts
- Tokenization is splitting the text into units
- Three main methods are applied: word-based splitting, FastText-style character n-grams, and subword/special tokens as used by LLMs
- The final two steps are lemmatization (converting a word to its base form) and stemming (removing affixes)
- Model performance will vary depending on the preprocessing
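A sketch of the three steps using NLTK (assuming the library and its "punkt" and "wordnet" resources are available); other toolkits such as spaCy would work similarly:

```python
import nltk
from nltk.stem import PorterStemmer, WordNetLemmatizer

nltk.download("punkt", quiet=True)    # tokenizer model
nltk.download("wordnet", quiet=True)  # lemmatizer dictionary

text = "The children were playing happily in the gardens"

tokens = nltk.word_tokenize(text)  # 1. tokenization (word-based splitting)

lemmatizer = WordNetLemmatizer()
lemmas = [lemmatizer.lemmatize(t.lower()) for t in tokens]  # 2. lemmatization (base forms)

stemmer = PorterStemmer()
stems = [stemmer.stem(t.lower()) for t in tokens]  # 3. stemming (strip affixes with rules)

print(tokens)  # ['The', 'children', 'were', 'playing', 'happily', 'in', 'the', 'gardens']
print(lemmas)  # e.g. 'children' -> 'child', 'gardens' -> 'garden'
print(stems)   # e.g. 'playing' -> 'play', 'happily' -> 'happili'
```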
Summary & Takeaways
Refining the study
- It refines the current understanding of pseudoword processing by testing new evaluation methods
- Replication matters; verification is needed when extending a study
- Different categories of pseudowords behave differently
- Form-meaning mappings are multi-faceted, and the input text has to be preprocessed appropriately for each model
- LLM embeddings introduce new dimensions that need normalization
Lecture 4
- Title: Modeling Language Through Prediction – Word After Word After Word After…
- Authors: Giovanni Cassani & Afra Alishahi
- Topic: Understanding how language models predict words, covering n-gram models, Markov chains, neural embeddings, and transformers
Introduction: Language Models & Prediction
- Language models rely on prediction and word sequences
- Words are not independent; they appear in patterns
- Language models assign probabilities to word sequences and can generate new sentences, which in turn aids speech recognition and spelling error correction
N-Gram Language Models
- They predict a word based on the previous n-1 words
- Bigram and trigram models follow the same idea with different history lengths
- However, data sparsity is an issue, and they are limited by the lack of long-term context
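A minimal sketch of a bigram model with maximum likelihood estimates, over an invented mini-corpus:

```python
from collections import Counter

corpus = "the cat sat on the mat the cat slept".split()

bigram_counts = Counter(zip(corpus, corpus[1:]))  # counts of adjacent word pairs
history_counts = Counter(corpus[:-1])             # counts of each history word

def bigram_prob(history: str, word: str) -> float:
    """MLE estimate: count(history, word) / count(history)."""
    if history_counts[history] == 0:
        return 0.0
    return bigram_counts[(history, word)] / history_counts[history]

print(bigram_prob("the", "cat"))  # 2/3: "the" is followed by "cat" twice and "mat" once
print(bigram_prob("the", "dog"))  # 0.0: never seen -> the data sparsity problem
```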
Probabilities in Language Models
- Predictions are made by assigning probabilities to possible next words
- Used in spelling correction and Machine Translation
Chain Rule of Probability
- The probability of a whole sentence is obtained by multiplying conditional word probabilities; this becomes intractable for long histories, so the solution is Markov approximations and word embeddings
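Written out, with the bigram (first-order Markov) approximation that makes it tractable:

```latex
P(w_1, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})
                  \approx \prod_{i=1}^{n} P(w_i \mid w_{i-1})
```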
Markov Chains & Maximum Likelihood Estimation (MLE)
- The tactic of approximating a word's probability using only its recent history, with probabilities estimated from counts (maximum likelihood estimation)
- Has 2 main challenges
Challenges
- Unknown words
- Data sparsity
Handling Unseen Words
- Unseen words are handled by mapping them onto tokens the model already knows
- A common solution is a special <UNK> token that stands in for any out-of-vocabulary word
Smoothing Techniques
- Probabilities have to be adjusted so that unseen n-grams do not receive zero probability
- This is aided by Laplace smoothing, which adds a small count to every possible n-gram, or by backoff & interpolation, where a unigram estimate makes up for a missing bigram
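A sketch of add-one (Laplace) smoothing on the same kind of bigram counts as above; the corpus is again invented:

```python
from collections import Counter

corpus = "the cat sat on the mat the cat slept".split()
bigram_counts = Counter(zip(corpus, corpus[1:]))
history_counts = Counter(corpus[:-1])
vocab_size = len(set(corpus))

def laplace_bigram_prob(history: str, word: str, k: float = 1.0) -> float:
    """Add-k smoothing (Laplace when k=1): (count + k) / (history count + k * |V|)."""
    return (bigram_counts[(history, word)] + k) / (history_counts[history] + k * vocab_size)

print(laplace_bigram_prob("the", "cat"))  # 3/9 ≈ 0.33, down from the unsmoothed 2/3
print(laplace_bigram_prob("the", "dog"))  # 1/9 ≈ 0.11, non-zero even though never seen
```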
N-Grams to Word Embeddings
- Instead of storing raw probabilities, a language model maps words to vectors in a continuous space