Questions and Answers
What does the language model represent?
- P(w1,w2…wn)
- P(w1,w2…wn|wn)
- P(wn|w1,w2…wn-1) (correct)
- P(wn|w1)
What is the formula for the joint probability of multiple variables using the Chain Rule?
- P(A,B,C,D) = P(A)P(B|A)P(C|A,B)P(D|A,B,C) (correct)
- P(A,B,C,D) = P(A,B)P(C,D)
- P(A,B,C,D) = P(A)P(B,C)P(D)
- P(A,B,C,D) = P(A)P(B)P(C)P(D)
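
The four-variable expansion marked correct above is one instance of the general Chain Rule. As a sketch, with the variable names x_1 … x_n chosen here only for illustration, the general form can be written as:

```latex
P(x_1, x_2, \ldots, x_n)
  = P(x_1)\, P(x_2 \mid x_1)\, P(x_3 \mid x_1, x_2) \cdots P(x_n \mid x_1, \ldots, x_{n-1})
  = \prod_{i=1}^{n} P(x_i \mid x_1, \ldots, x_{i-1})
```

Setting n = 4 and (x_1, x_2, x_3, x_4) = (A, B, C, D) recovers P(A)P(B|A)P(C|A,B)P(D|A,B,C).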
How is the probability of a sentence computed using the Chain Rule?
- P(its, water, is, so, transparent) = P(its)P(water)P(is)P(so)P(transparent)
- P(its, water, is, so, transparent) = P(its,water)P(is,so)P(transparent)
- P(its, water, is, so, transparent) = P(its)P(water|its)P(is|its,water)P(so|its,water,is)P(transparent|its,water,is,so) (correct)
- P(its, water, is, so, transparent) = P(its)P(water,is)P(so,transparent)
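
The Chain Rule decomposition of the example sentence can be sketched in Python. This is a minimal illustration, not an estimate from real data: the table `cond_prob`, its numeric values, and the helper `sentence_probability` are all hypothetical and invented for this example.

```python
from math import prod, isclose

# Hypothetical conditional probabilities, invented purely for illustration.
# Key: (history as a tuple of words, next word). Value: P(next word | history).
cond_prob = {
    ((), "its"): 0.02,
    (("its",), "water"): 0.01,
    (("its", "water"), "is"): 0.3,
    (("its", "water", "is"), "so"): 0.05,
    (("its", "water", "is", "so"), "transparent"): 0.1,
}

def sentence_probability(words, cond_prob):
    """Multiply P(w_i | w_1 .. w_{i-1}) over every position i (the Chain Rule)."""
    factors = [cond_prob[(tuple(words[:i]), w)] for i, w in enumerate(words)]
    return prod(factors)

sentence = ["its", "water", "is", "so", "transparent"]
p = sentence_probability(sentence, cond_prob)
print(p)  # product of the five conditional factors above
```

Each factor conditions on the entire preceding history, so the table needs one entry per distinct prefix, and the number of possible prefixes grows very quickly with sentence length.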
Why can't we simply count and divide to estimate the probabilities of words in a sentence?
What is the formula for estimating the probability of a word given the previous words using the Chain Rule?
What is the purpose of the Chain Rule in language modeling?
How is the probability of a sentence P(W) computed in language modeling?
What is the advantage of using the Chain Rule in language modeling?
What is the formula for the conditional probability p(B|A)?
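
For reference, the standard definition is P(B|A) = P(A,B) / P(A), provided P(A) > 0. The sketch below checks it on a small joint distribution; the table `joint` and its values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
from math import isclose

# Hypothetical joint distribution over two binary events A and B.
# Key: (value of A, value of B). Value: P(A, B). The entries sum to 1.
joint = {
    (True, True): 0.2,
    (True, False): 0.3,
    (False, True): 0.1,
    (False, False): 0.4,
}

def marginal_a(a):
    """P(A = a), obtained by summing the joint over all values of B."""
    return sum(p for (x, _), p in joint.items() if x == a)

def conditional_b_given_a(b, a):
    """P(B = b | A = a) = P(A = a, B = b) / P(A = a)."""
    return joint[(a, b)] / marginal_a(a)

p_b_given_a = conditional_b_given_a(True, True)
print(p_b_given_a)
```

Rearranging the definition gives P(A,B) = P(A)P(B|A), which is the two-variable case of the Chain Rule.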
What is the purpose of language modeling in NLP?