Questions and Answers
What is the purpose of learning a feature vector for each word in neural networks for language modeling?
To represent the similarity between words
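As a minimal sketch of how feature vectors can express similarity, the snippet below compares words by the cosine of the angle between their vectors. The three-dimensional vectors and the choice of cosine similarity are assumptions made for illustration; they are not taken from the quiz.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical feature vectors, invented for this example only.
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

sim_cat_dog = cosine_similarity(vectors["cat"], vectors["dog"])
sim_cat_car = cosine_similarity(vectors["cat"], vectors["car"])

# Vectors pointing in similar directions score closer to 1.
print(sim_cat_dog > sim_cat_car)
```

With these made-up values, "cat" and "dog" point in nearly the same direction, so their similarity is higher than that of "cat" and "car".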
In the context of Continuous Bag of Words (CBOW), how are the input words used to predict the output word?
By summing the input-matrix rows of all the input (context) words and finding the most similar column of the output matrix.
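The CBOW prediction step described above can be sketched roughly as follows. The names `W_in`, `W_out`, and `cbow_predict` are hypothetical, the tiny matrices are made up, and "most similar" is assumed here to mean the largest dot product.

```python
def cbow_predict(context_ids, W_in, W_out):
    """Return the index of the output word that best matches the context.

    W_in:  V x d matrix with one row per input word.
    W_out: d x V matrix with one column per output word.
    """
    d = len(W_in[0])
    vocab_size = len(W_out[0])
    # Sum the input rows of all context words into one hidden vector.
    h = [sum(W_in[i][k] for i in context_ids) for k in range(d)]
    # Score every output word: dot product of h with that word's column.
    scores = [sum(h[k] * W_out[k][j] for k in range(d)) for j in range(vocab_size)]
    # Pick the column with the highest score (assumed similarity measure).
    return max(range(vocab_size), key=lambda j: scores[j])


# Toy vocabulary of 4 words with 2-dimensional vectors (illustrative values).
W_in = [[1, 0], [0, 1], [1, 1], [-1, 0]]
W_out = [[1, 0, 2, -1],
         [0, 1, 2, 0]]

predicted = cbow_predict([0, 1], W_in, W_out)
print(predicted)
```

During training, the scores would normally be passed through a softmax (or an approximation of it) instead of taking the maximum directly; the sketch only covers the prediction step.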
What is the main idea behind Skip-Gram with Negative Sampling (SGNS) in neural networks for language modeling?
To use the target (center) word to predict its neighbor words, training on observed word pairs against a few randomly sampled negative pairs.
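In the standard word2vec formulation of SGNS, each observed (center word, neighbor word) pair is pushed toward a high score while a few randomly drawn "negative" words are pushed toward a low score. The snippet below is a minimal sketch of the loss for one such pair; the function names and the example vectors are assumptions for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sgns_loss(center_vec, context_vec, negative_vecs):
    """Negative-sampling loss for one (center, context) pair.

    The observed pair should score high; each sampled negative
    word should score low against the same center word.
    """
    loss = -math.log(sigmoid(dot(center_vec, context_vec)))
    for neg_vec in negative_vecs:
        loss -= math.log(sigmoid(-dot(center_vec, neg_vec)))
    return loss


# Illustrative vectors: in the first case the true neighbor aligns with the
# center word and the negative opposes it; the second case is the reverse.
well_fit = sgns_loss([1.0, 0.0], [2.0, 0.0], [[-2.0, 0.0]])
badly_fit = sgns_loss([1.0, 0.0], [-2.0, 0.0], [[2.0, 0.0]])
print(well_fit < badly_fit)
```

Because only the true neighbor and a handful of negatives enter the sum, each update avoids normalizing over the whole vocabulary.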
How can we interpret the relationship between words in the skip-gram model?
What is the loss function used in Continuous Bag of Words (CBOW) for word2vec?
What is the loss function used in Skip-Gram for word2vec?
What are two methods to approximate softmax for performance reasons?
How can we optimize the weights in the Skip-Gram model?
In the Skip-Gram model, what is updated with a learning rate during optimization?
What is the purpose of negative sampling in Skip-Gram optimization?
What is the objective of the skip-gram model when all word co-occurrences are aggregated into a matrix?
In moving from word2vec to Paragraph Vectors, what approach is suggested to improve the representation?