Questions and Answers
What is the purpose of learning a feature vector in neural networks for language modeling?
To represent the similarity between words
In the context of Continuous Bag of Words (CBOW), how are the input words used to predict the output word?
By summing the input-matrix rows of all input (context) words and finding the most similar column in the output matrix.
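The sum-then-match step can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the names `cbow_predict`, `W_in`, and `W_out` are hypothetical, the toy numbers are made up, and it assumes that "most similar" means the highest dot product, normalized with a softmax.

```python
import math

def cbow_predict(context_ids, W_in, W_out):
    """Return (predicted_word_id, probabilities) for one CBOW forward pass.

    W_in  : V x d list of lists; row i is the input vector of word i.
    W_out : d x V list of lists; column j is the output vector of word j.
    """
    d = len(W_out)
    V = len(W_out[0])
    # Sum the input rows of all context words into one hidden vector.
    h = [sum(W_in[i][k] for i in context_ids) for k in range(d)]
    # Score each vocabulary word: dot product of h with its output column.
    scores = [sum(h[k] * W_out[k][j] for k in range(d)) for j in range(V)]
    # Softmax over the scores (shifted by the max for numerical stability).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(V), key=lambda j: probs[j])
    return best, probs

# Toy vocabulary of 4 words with 2-dimensional vectors (made-up numbers).
W_in = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
W_out = [[0.0, 1.0, 2.0, -1.0],
         [1.0, 0.0, 2.0, 0.0]]

# Context words 0 and 1 sum to h = [1, 1]; column 2 has the highest score.
pred, probs = cbow_predict([0, 1], W_in, W_out)
```

Many CBOW implementations average the context rows instead of summing them; that rescales the scores but does not change which column wins.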
What is the main idea behind Skip-Gram with Negative Sampling (SGNS) in neural networks for language modeling?
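To use the target word to predict its neighbor words, scoring each true neighbor against a few randomly sampled negative words instead of computing a full softmax.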
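A minimal sketch of one skip-gram negative-sampling update for a single (target, context) pair, in plain Python. The function names, toy vectors, and learning rate are assumptions made for illustration. The negatives are fixed by hand here, whereas a real trainer would draw them at random from a noise distribution over the vocabulary.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sgns_loss(target, context, negatives, W_in, W_out):
    """Negative-sampling loss for one pair; both matrices are V x d."""
    v = W_in[target]
    # The true context word should score high ...
    loss = -math.log(sigmoid(dot(v, W_out[context])))
    # ... and each sampled negative word should score low.
    for n in negatives:
        loss -= math.log(sigmoid(-dot(v, W_out[n])))
    return loss

def sgns_step(target, context, negatives, W_in, W_out, lr=0.1):
    """One in-place SGD update of the target row and the touched output rows."""
    v = W_in[target][:]          # copy, so all gradients use the old vector
    grad_v = [0.0] * len(v)
    pairs = [(context, 1.0)] + [(n, 0.0) for n in negatives]
    for word, label in pairs:
        u = W_out[word]
        g = sigmoid(dot(v, u)) - label   # gradient of the loss w.r.t. the score
        for k in range(len(v)):
            grad_v[k] += g * u[k]        # accumulate with the old u[k]
            u[k] -= lr * g * v[k]        # then update the output vector
    for k in range(len(v)):
        W_in[target][k] -= lr * grad_v[k]

# Toy 4-word vocabulary with 2-dimensional vectors (made-up numbers).
W_in = [[0.1, -0.2], [0.3, 0.1], [-0.1, 0.2], [0.2, 0.2]]
W_out = [[0.2, 0.1], [-0.1, 0.3], [0.1, -0.1], [0.3, 0.2]]

before = sgns_loss(0, 1, [2, 3], W_in, W_out)
sgns_step(0, 1, [2, 3], W_in, W_out)
after = sgns_loss(0, 1, [2, 3], W_in, W_out)
```

Only the target's input row and the output rows of the context word and the negatives are touched, so the cost per pair depends on the number of negatives rather than on the vocabulary size.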
How can we interpret the relationship between words in the skip-gram model?
What is the loss function used in Continuous Bag of Words (CBOW) for word2vec?
What is the loss function used in Skip-Gram for word2vec?
What are two methods to approximate softmax for performance reasons?
How can we optimize the weights in the Skip-Gram model?
In the Skip-Gram model, what is updated with a learning rate during optimization?
What is the purpose of negative sampling in Skip-Gram optimization?
What is the objective of the skip-gram model when all word co-occurrences are aggregated into a matrix?
In the context of extending word2vec to Paragraph Vectors, what is the suggested approach to improve representation?