Questions and Answers
Select all that apply to overfitting. (Zero, one, or more choices can be correct.)
- Overfitting is when the model is too simple and cannot learn the characteristics of the training data.
- Overfitting is when the model learns the characteristics of the training data so closely that its performance on unseen data worsens. (correct)
- Overfitting is when the model has too much regularization.
- Overfitting is when the model learns the noise of the training data, hence the model cannot generalize. (correct)
Given a sequence dataset with a vocabulary with V possible words, how many n-grams (of length n) could in theory be observed?
- V*n
- V^n (correct)
- V
- n!
- n
- V choose n
Given a sequence of length L, how many total n-grams are observed?
- L^n
- L*n
- L - (n - 1) (correct)
- L * V^n
Given a dataset of N sequences and T total tokens, we insert a special token <S> at the beginning of each sequence and </S> at the end. How many total n-grams are observed in this dataset?
How many image patches of size (m x n) can be constructed from an image of size (W x H) if we form patches centered at a pixel in the image (using padding at the edges)?
We are trying to represent a conversation tree. The vocabulary has V possible words, and we want pairwise features of bigrams. An example feature would count when the bigram "do_you" appears in a parent message and "i_do" appears in a reply to that message. How many such features could, in theory, be observed?
Consider a particular edge in this conversation tree: the parent has A tokens and the reply has B tokens. How many pairwise bigram features (constructed as in the previous question) are observed?
When using a linear model, can we encode a sequence (text/dna) just as a vector of "word" indices e.g. [42, 11, 2, 7, 11, 2] with no padding? Answer with 'yes' or 'no' and explain why. Hint: consider two different sequences.
Flashcards
Overfitting
A model that learns the training data too well, potentially memorizing noise and failing to generalize to unseen data.
Underfitting
The model is too simple to learn the underlying patterns in the data, resulting in poor performance.
Total possible n-grams
The number of possible n-grams (sequences of length n) that can be formed from a vocabulary of V words.
Total n-grams in a sequence
A sequence of length L contains L - (n - 1) n-grams of length n.
Total n-grams in a dataset
With N sequences and T total tokens, after adding <S> and </S> to each sequence the dataset contains T + N * (3 - n) n-grams.
Image patches
With padding at the edges, every pixel of a (W x H) image can serve as the center of an (m x n) patch, giving W * H patches.
Pairwise bigram features in conversation tree
Each feature pairs one of the V^2 possible parent bigrams with one of the V^2 possible reply bigrams, for V^4 possible features.
Pairwise bigram features in a conversation edge
A parent with A tokens yields A - 1 bigrams and a reply with B tokens yields B - 1, giving (A - 1) * (B - 1) observed features.
Can a sequence be encoded as a vector of word indices without padding for a linear model?
No: sequences of different lengths produce vectors of different dimensions, and word indices are categorical labels rather than meaningful numeric magnitudes.
Study Notes
Quiz 2 - Study Notes
Question 1: Overfitting
- Overfitting occurs when a model learns the training data's characteristics excessively, leading to poor performance on new data.
- This happens when the model is overly complex, fitting the noise in the data.
- Overfitting contrasts with underfitting, where a model is too simple to capture the patterns in the data.
- Too much regularization likewise leads to underfitting, not overfitting.
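A minimal sketch (toy sine data, assumed purely for illustration) contrasting the failure modes: a degree-1 polynomial underfits, while a degree-9 polynomial interpolates the noisy training points and tends to generalize worse:

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, 10)  # noisy samples
x_test = np.linspace(0, 1, 100)
y_test = np.sin(2 * np.pi * x_test)                             # noise-free target

for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)  # fit a polynomial of this degree
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```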
Question 2: N-grams
- Given a vocabulary of V words, the number of possible n-grams (sequences of length n) is V^n.
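A quick enumeration check (toy three-word vocabulary assumed):

```python
from itertools import product

vocab = ["a", "b", "c"]                      # V = 3
n = 2
all_ngrams = list(product(vocab, repeat=n))  # every possible length-n sequence
assert len(all_ngrams) == len(vocab) ** n    # V^n = 3^2 = 9 possible bigrams
```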
Question 3: Total N-grams
- In a sequence of length L, the total number of n-grams is L - (n-1).
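A sliding-window sketch (toy sentence assumed) that verifies the count:

```python
def ngrams(tokens, n):
    # Slide a width-n window over the sequence: one start position per n-gram.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

seq = ["the", "cat", "sat", "on", "the", "mat"]   # L = 6
assert len(ngrams(seq, 3)) == len(seq) - (3 - 1)  # 6 - (3 - 1) = 4 trigrams
```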
Question 4: N-grams in a Dataset of Sequences
- In a dataset of N sequences with T total tokens, adding <S> and </S> gives each sequence two extra tokens, so a sequence of length L_i contributes (L_i + 2) - (n - 1) n-grams; summing over all N sequences yields T + 2N - N(n - 1) = T + N * (3 - n) total n-grams.
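A sketch (toy two-sequence dataset assumed) confirming the formula:

```python
def dataset_ngram_count(sequences, n):
    # Wrap each sequence with <S> and </S>, then count sliding windows.
    total = 0
    for seq in sequences:
        padded = ["<S>"] + list(seq) + ["</S>"]
        total += len(padded) - (n - 1)
    return total

docs = [["a", "b", "c"], ["d", "e"]]  # N = 2 sequences, T = 5 tokens
T, N, n = 5, 2, 3
assert dataset_ngram_count(docs, n) == T + N * (3 - n)  # 5 + 2 * 0 = 5
```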
Question 5: Image Patches
- With padding at the edges, every one of the W * H pixels can serve as the center of an (m x n) patch, so W * H patches can be constructed. (Without padding, only (W - m + 1) * (H - n + 1) patches would fit entirely inside the image.)
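A sketch using numpy's edge padding (toy sizes assumed; m and n taken as odd so a center pixel is well defined):

```python
import numpy as np

W, H = 8, 5   # image width and height
m, n = 3, 3   # patch width and height
img = np.arange(W * H).reshape(H, W)

# Pad so that windows centered at border pixels stay inside the array.
padded = np.pad(img, ((n // 2, n // 2), (m // 2, m // 2)), mode="edge")
patches = [padded[r:r + n, c:c + m] for r in range(H) for c in range(W)]
assert len(patches) == W * H  # one patch per pixel: 8 * 5 = 40
```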
Question 6: Bigram Features in Conversation Tree
- With a vocabulary of V words there are V^2 possible bigrams, so the number of possible (parent bigram, reply bigram) features is V^2 * V^2 = V^4.
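A small enumeration check (toy two-word vocabulary assumed, since V^4 grows quickly):

```python
from itertools import product

vocab = ["a", "b"]                          # V = 2
bigrams = list(product(vocab, repeat=2))    # V^2 = 4 possible bigrams
features = list(product(bigrams, bigrams))  # (parent bigram, reply bigram) pairs
assert len(features) == len(vocab) ** 4     # V^4 = 16
```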
Question 7: Bigram Features in a Conversation Tree Edge
- With A tokens in the parent and B tokens in the reply, the parent yields A - 1 bigrams and the reply yields B - 1, so the number of pairwise bigram features observed on that edge is (A - 1) * (B - 1).
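A sketch (toy parent/reply messages assumed) that extracts the observed features:

```python
def pairwise_bigram_features(parent, reply):
    # Cross every bigram of the parent with every bigram of the reply.
    parent_bigrams = list(zip(parent, parent[1:]))
    reply_bigrams = list(zip(reply, reply[1:]))
    return [(p, r) for p in parent_bigrams for r in reply_bigrams]

parent = ["do", "you", "like", "it"]  # A = 4 tokens -> 3 bigrams
reply = ["i", "do", "not"]            # B = 3 tokens -> 2 bigrams
feats = pairwise_bigram_features(parent, reply)
assert len(feats) == (len(parent) - 1) * (len(reply) - 1)  # 3 * 2 = 6
```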
Question 8: Sequence Encoding with Linear Model
- No. Two sequences of different lengths produce index vectors of different dimensions, so a linear model with a single fixed-size weight vector cannot accept both. Even at a fixed length, word indices are arbitrary categorical IDs, and a linear model would treat them as magnitudes (e.g., index 42 as "greater than" index 7); a fixed-length representation such as bag-of-words or n-gram counts is needed.
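A sketch (toy vocabulary size assumed) showing the dimension mismatch and one fixed-length alternative, bag-of-words counts:

```python
import numpy as np

# Raw index vectors: two sequences of different lengths have different
# dimensionality, so no single weight vector w fits both.
seq_a = [42, 11, 2, 7, 11, 2]
seq_b = [42, 11]
w = np.ones(len(seq_a))
# np.dot(w, seq_b) would raise a ValueError: shape mismatch.

# Fixed-length alternative: count how often each vocabulary index occurs.
V = 50  # assumed toy vocabulary size
def bag_of_words(indices, V):
    counts = np.zeros(V)
    for i in indices:
        counts[i] += 1
    return counts

w = np.ones(V)
print(np.dot(w, bag_of_words(seq_a, V)))  # 6.0
print(np.dot(w, bag_of_words(seq_b, V)))  # 2.0 -- both now well defined
```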
Description
Test your understanding of key machine learning concepts such as overfitting, n-grams, and image patches. This quiz covers fundamental principles that are crucial for developing robust models. Perfect for students looking to reinforce their learning in machine learning.