17 Questions
What does the attention decoder RNN do with the output it produces?
Discards it
What happens during the Attention Step?
A context vector is calculated for the current time step
What is the role of the feedforward neural network in the decoding process?
Generates the output word at each time step
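The answer above can be sketched in plain Python: at each decoding time step, the context vector is concatenated with the decoder RNN's hidden state and passed through a feedforward layer, whose softmax over the vocabulary picks the output word. This is a minimal illustration, not a real implementation; the weight matrix `W` and bias `b` are hypothetical stand-ins for learned parameters.

```python
import math

def output_step(context, hidden, W, b):
    """One decoding time step: concatenate the attention context vector
    with the decoder hidden state, apply a linear (feedforward) layer,
    softmax over the vocabulary, and return the argmax word index.
    W and b are hypothetical stand-ins for trained parameters."""
    x = context + hidden  # concatenation: [context; hidden]
    logits = [sum(wi * xi for wi, xi in zip(row, x)) + bi
              for row, bi in zip(W, b)]
    # numerically stable softmax over the vocabulary
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return max(range(len(probs)), key=probs.__getitem__)
```

For example, with a toy 3-word vocabulary, `output_step([1.0, 0.0], [0.0, 1.0], [[1,0,0,0],[0,0,0,2],[0,1,1,0]], [0,0,0])` selects word index 1, the row with the highest logit.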
How does the model handle varying length outputs?
By generating output tokens one step at a time until an end-of-sequence token is produced
In what way does the model show limited context awareness?
By only considering the most recent hidden state
In an attention mechanism, what does the decoder do differently compared to a classic seq-to-seq model?
Receives all the hidden states from the encoder instead of just the last one
What is one key advantage of using an attention mechanism over models without attention?
Better handling of varying lengths in inputs and outputs
How does an attention decoder decide which parts of the input are relevant to the decoding time step?
By looking at the set of encoder hidden states and assigning scores
Why does an attention model pass all the hidden states from the encoder to the decoder?
To allow selective focus on different parts of the input sequence
What does an attention model utilize to amplify hidden states with high relevance scores?
Softmaxed score multiplication
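The scoring-and-amplification process asked about above can be sketched in a few lines of plain Python, assuming a simple dot-product score (real models may use a small learned scoring network instead): each encoder hidden state is scored against the current decoder hidden state, the scores are softmaxed, and the weighted sum of the hidden states becomes the context vector for this decoding time step. High-scoring states are amplified; low-scoring ones are drowned out.

```python
import math

def attention_context(decoder_hidden, encoder_hiddens):
    """Score each encoder hidden state against the decoder hidden state
    (dot product), softmax the scores, and return the weighted sum of
    encoder hidden states: the context vector for this time step."""
    scores = [sum(d * e for d, e in zip(decoder_hidden, h))
              for h in encoder_hiddens]
    # softmax amplifies high scores and suppresses low ones
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # weighted sum of encoder hidden states = context vector
    return [sum(w * h[i] for w, h in zip(weights, encoder_hiddens))
            for i in range(len(decoder_hidden))]
```

With encoder states `[1, 0]` and `[0, 1]` and a decoder state `[10, 0]` that aligns strongly with the first, the resulting context vector is nearly `[1, 0]`: the relevant hidden state dominates the sum.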
What is a primary role of the attention mechanism in a seq-to-seq model?
To enable selective focus on different parts of the input sequence
What issue can arise with traditional seq2seq models due to their lack of alignment?
Difficulty generating accurate translations
Why can the context vector be considered a bottleneck for Seq2Seq models?
It makes it challenging for the models to deal with long sentences
What problem can arise from traditional seq2seq models not being able to focus on specific parts of the input sequence during decoding?
Difficulty generating accurate translations or outputs
How do attention mechanisms help improve traditional seq2seq models?
By enabling the model to focus on relevant parts of the input sequence
Why are attention-based seq2seq models considered more effective for tasks requiring a high level of context awareness?
As they allow selective focus on certain parts of the input sequence
What is a common issue faced by traditional seq2seq models when handling complex input sequences?
Difficulty processing long or complex sentences
Learn about the limitations of traditional sequence-to-sequence models in NLP tasks, particularly in generating accurate translations or outputs due to a lack of context awareness during decoding. Understand how these limitations impact tasks like language translation and question answering.