Mastering Attention Mechanisms in Sequential Decoders
5 Questions
Questions and Answers

Which of the following best describes the purpose of attention in a sequential decoder?

  • To alleviate the vanishing gradient problem
  • To compute the alignment model f
  • To focus on the most relevant parts of the input sequence for each output (correct)
  • To compute the context vector c

What is the formula for computing the attention score αᵢⱼ in the context of attention?

  • αᵢⱼ = softmax(eⱼ) (correct)
  • αᵢⱼ = softmax(f(i, j))
  • αᵢⱼ = softmax(e)
  • αᵢⱼ = softmax(hⱼ)
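The normalization above can be sketched in a few lines of numpy: the raw alignment scores eⱼ for one output position are passed through a softmax so the resulting weights αᵢⱼ are positive and sum to 1. The score values below are illustrative, not from the lesson.

```python
import numpy as np

def softmax(e):
    """Numerically stable softmax over a vector of alignment scores."""
    e = e - np.max(e)          # shift for stability; does not change the result
    w = np.exp(e)
    return w / w.sum()

# Hypothetical alignment scores e_j for one output position i
# against four input positions j:
e = np.array([2.0, 0.5, -1.0, 0.1])
alpha = softmax(e)             # attention weights; non-negative, sum to 1
```

The highest-scoring input position receives the largest weight, which is exactly the "focus on the most relevant parts" behavior described in the first question.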

What does the alignment model f in the context of attention represent?

  • The amount of attention the ith output should pay to the jth input
  • The scores of how well the inputs around position j and the output at position i match (correct)
  • The hidden state from the previous timestep
  • The encoder state for the jth input

How can the alignment model f be approximated?

  • By using a small neural network (correct)
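One common way to realize f as a small neural network is additive (Bahdanau-style) attention: a one-hidden-layer network scores how well the previous decoder state matches each encoder state. The weight names, sizes, and random inputs below are illustrative assumptions, not taken from the lesson.

```python
import numpy as np

rng = np.random.default_rng(0)
d_dec, d_enc, d_att = 4, 4, 8           # illustrative dimensions

# Parameters of a small one-hidden-layer alignment network f(s_prev, h_j):
W_s = rng.standard_normal((d_att, d_dec)) * 0.1
W_h = rng.standard_normal((d_att, d_enc)) * 0.1
v   = rng.standard_normal(d_att) * 0.1

def f(s_prev, h_j):
    """Additive alignment score e_ij for one decoder/encoder state pair."""
    return v @ np.tanh(W_s @ s_prev + W_h @ h_j)

s_prev = rng.standard_normal(d_dec)     # decoder state from the previous timestep
H = rng.standard_normal((5, d_enc))     # five encoder states h_j
e = np.array([f(s_prev, h) for h in H]) # one score per input position j
```

In practice the parameters W_s, W_h, and v are learned jointly with the rest of the encoder-decoder model.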

What is the purpose of the context vector c in the context of attention?

  • To summarize the input as a weighted sum of encoder states, c = Σⱼ αᵢⱼhⱼ (correct)
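The role of c can be seen directly from its definition as a weighted sum: each encoder state hⱼ contributes in proportion to its attention weight. A minimal numpy sketch, with made-up encoder states and weights:

```python
import numpy as np

def context_vector(alpha, H):
    """c = sum_j alpha_j * h_j: attention-weighted sum of encoder states."""
    return alpha @ H

H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])           # three encoder states h_j (rows)
alpha = np.array([0.5, 0.3, 0.2])    # attention weights for one output step
c = context_vector(alpha, H)         # -> [0.7, 0.5]
```

A fresh c is computed for every output position, which is what lets the decoder attend to different parts of the input at each step.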
