12 - pLSI

Questions and Answers

What is the joint probability if variables A and B both depend on C?

P(A, B, C) = P(A | C) * P(B | C) * P(C)
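The factorization can be checked numerically. The sketch below assumes three binary variables and made-up probability tables (the names `p_c`, `p_a_given_c`, `p_b_given_c` and all numbers are hypothetical); it builds the joint table from the three factors and confirms that it is a valid distribution.

```python
import numpy as np

# Hypothetical tables for binary A, B, C, where C is the common parent.
p_c = np.array([0.3, 0.7])              # P(C)
p_a_given_c = np.array([[0.9, 0.1],     # row c holds P(A | C=c)
                        [0.2, 0.8]])
p_b_given_c = np.array([[0.6, 0.4],     # row c holds P(B | C=c)
                        [0.5, 0.5]])

# joint[a, b, c] = P(A=a | C=c) * P(B=b | C=c) * P(C=c)
joint = np.einsum("ca,cb,c->abc", p_a_given_c, p_b_given_c, p_c)

# The joint sums to 1, and marginalizing out A and B recovers P(C).
total = joint.sum()
marginal_c = joint.sum(axis=(0, 1))
```

Dividing `joint[:, :, c]` by `p_c[c]` gives the outer product of the two conditional rows, which is the conditional independence of A and B given C.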

What does plate notation serve as in graphical models?

Shorthand notation for repeated elements

In Probabilistic Latent Semantic Indexing (pLSI), what does theta represent?

The topic distribution of a document

What does the EM algorithm aim to find in the context of pLSI?

The most likely parameters

What is the iterative procedure called that optimizes a statistical model?

Expectation-Maximization (EM)

What is the first step in the Expectation-Maximization (EM) algorithm?

Choose initial model parameters

What is the purpose of the Expectation-Maximization algorithm?

To improve the log-likelihood iteratively by alternating the E-step and the M-step.

How does the probabilistic generative model assume the data are generated?

1. Sample a word distribution for every topic.
2. Sample a topic distribution for every document.
3. Generate the words of every document by sampling a topic from the document's topic distribution and then a word from that topic's word distribution.
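The three steps above can be sketched in a few lines. This is a minimal illustration only: the sizes, the variable names (`phi`, `theta`, `docs`) and the use of a uniform Dirichlet to draw the distributions are assumptions made here to obtain valid probability vectors, not details given in the lesson.

```python
import numpy as np

rng = np.random.default_rng(0)
n_topics, vocab_size, n_docs, doc_len = 3, 8, 4, 20  # hypothetical sizes

# Step 1: a word distribution for every topic (each row sums to 1).
phi = rng.dirichlet(np.ones(vocab_size), size=n_topics)

# Step 2: a topic distribution for every document (each row sums to 1).
theta = rng.dirichlet(np.ones(n_topics), size=n_docs)

# Step 3: for every word slot, sample a topic, then a word from that topic.
docs = []
for d in range(n_docs):
    topics = rng.choice(n_topics, size=doc_len, p=theta[d])
    words = [int(rng.choice(vocab_size, p=phi[z])) for z in topics]
    docs.append(words)
```

Each entry of `docs` is a list of word ids; fitting the model means recovering `theta` and `phi` from such data alone.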

In pLSI, how is a document modeled?

As a mixture of topics.

What is the main idea behind the EM algorithm for pLSI?

Alternate an Expectation step (estimate the hidden topic assignments under the current parameters) and a Maximization step (re-optimize the parameters) until convergence, to approach a (local) maximum-likelihood estimate (MLE).
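A compact sketch of this loop for pLSI follows. It assumes a dense document-by-word count matrix and random initial parameters; the function name `plsi_em`, its arguments and the toy counts are hypothetical, and the code is meant as an illustration rather than a tuned implementation.

```python
import numpy as np

def plsi_em(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSI by EM. counts: (n_docs, vocab_size) matrix of word counts."""
    rng = np.random.default_rng(seed)
    n_docs, vocab_size = counts.shape
    # Choose initial model parameters (random, strictly positive).
    theta = rng.dirichlet(np.ones(n_topics), size=n_docs)    # P(topic | doc)
    phi = rng.dirichlet(np.ones(vocab_size), size=n_topics)  # P(word | topic)
    log_liks = []
    for _ in range(n_iter):
        # E-step: P(topic | doc, word) is proportional to theta[d, z] * phi[z, w].
        joint = theta[:, :, None] * phi[None, :, :]          # (docs, topics, words)
        p_w_given_d = joint.sum(axis=1)                      # (docs, words)
        resp = joint / np.maximum(p_w_given_d[:, None, :], 1e-12)
        # Log-likelihood of the data under the current parameters.
        log_liks.append(float(np.sum(counts * np.log(np.maximum(p_w_given_d, 1e-12)))))
        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * resp
        theta = expected.sum(axis=2)
        theta /= np.maximum(theta.sum(axis=1, keepdims=True), 1e-12)
        phi = expected.sum(axis=0)
        phi /= np.maximum(phi.sum(axis=1, keepdims=True), 1e-12)
    return theta, phi, log_liks

# Toy counts: 4 documents over a 4-word vocabulary (made-up numbers).
counts = np.array([[5, 3, 0, 0],
                   [4, 4, 1, 0],
                   [0, 0, 6, 2],
                   [0, 1, 3, 5]], dtype=float)
theta, phi, log_liks = plsi_em(counts, n_topics=2, n_iter=30)
```

The recorded `log_liks` should never decrease from one iteration to the next, which is the property that makes EM useful here; the result still depends on the random start, because EM only reaches a local maximum.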

What are some ways to incorporate prior knowledge in pLSI?

Remove stopwords, add a 'background' topic, or use maximum a posteriori (MAP) estimation with a prior word distribution.
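To illustrate the MAP option, one common formulation treats the prior word distribution as pseudo-counts added to a topic's expected word counts in the M-step. The sketch below assumes that formulation; the function name `map_word_update`, the strength parameter `mu` and the numbers are hypothetical.

```python
import numpy as np

def map_word_update(expected_counts, prior_word_dist, mu):
    """MAP-style update of one topic's word distribution.

    expected_counts: expected word counts for the topic from the E-step.
    prior_word_dist: prior word distribution for the topic (sums to 1).
    mu: prior strength in pseudo-counts; mu = 0 gives the plain ML update.
    """
    smoothed = expected_counts + mu * prior_word_dist
    return smoothed / smoothed.sum()

expected_counts = np.array([8.0, 2.0, 0.0])
prior_word_dist = np.array([0.2, 0.2, 0.6])

ml_update = map_word_update(expected_counts, prior_word_dist, mu=0.0)
# With mu = 10: (8 + 2, 2 + 2, 0 + 6) / 20 = [0.5, 0.2, 0.3]
map_update = map_word_update(expected_counts, prior_word_dist, mu=10.0)
```

A larger `mu` pulls the estimate toward the prior, while `mu = 0` ignores it entirely.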

Why does incorporating all words, including stopwords, in pLSI cause problems?

Because stopwords can introduce noise and irrelevant information.
