Mastering Attention Mechanisms in Sequential Decoders
5 Questions
0 Views

Choose a study mode

Play Quiz
Study Flashcards
Spaced Repetition
Chat to lesson

Podcast

Play an AI-generated podcast conversation about this lesson

Questions and Answers

Which of the following best describes the purpose of attention in a sequential decoder?

  • To alleviate the vanishing gradient problem
  • To compute the alignment model f
  • To focus on the most relevant parts of the input sequence for each output (correct)
  • To compute the context vector c
  • What is the formula for computing the attention score αᵢⱼ in the context of attention?

  • αᵢⱼ = softmax(eⱼ) (correct)
  • αᵢⱼ = softmax(f(i, j))
  • αᵢⱼ = softmax(e)
  • αᵢⱼ = softmax(hⱼ)
  • What does the alignment model f in the context of attention represent?

  • The amount of attention the ith output should pay to the jth input
  • The scores of how well the inputs around position j and the output at position i match (correct)
  • The hidden state from the previous timestep
  • The encoder state for the jth input
  • How can the alignment model f be approximated?

    <p>By using a small neural network</p> Signup and view all the answers

    What is the purpose of the context vector c in the context of attention?

    <p>To compute the attention scores αᵢⱼ</p> Signup and view all the answers

    More Like This

    Use Quizgecko on...
    Browser
    Browser