Advanced Neural Networks & NLP PDF

Document Details


Uploaded by DeadCheapAtlanta662

Tags

neural networks, natural language processing, deep learning, artificial intelligence

Summary

This document is a lecture/tutorial on advanced topics in neural networks and natural language processing (NLP). It covers several key concepts, such as convolutional neural networks, long short-term memory (LSTM) networks, word embeddings, and named entity recognition, and explains them with the relevant equations and diagrams.

Full Transcript


Advanced Neural Networks & NLP (Exercise 9)

Agenda
○ Convolutional Neural Networks (convolutional layer, pooling layer, fully connected layer)
○ Text Classification using CNN
○ Long Short Term Memory (LSTM)
○ Word2Vec Embedding
○ Named Entity Recognition

Convolutional Neural Networks
Convolutions don't use fully connected layers but sparsely connected layers; that is, they accept matrices as inputs, unlike human beings, who understand images by taking snapshots with the eye. The layers used in CNNs are called:
○ Convolutional layer
○ Pooling layer
○ Fully connected layer

Convolutional layer
The convolutional layer (CONV) uses filters that perform convolution operations while scanning the input image with respect to its dimensions. Its hyperparameters include the filter size, which can be 2x2, 3x3, 4x4, 5x5 (but is not restricted to these alone), and the stride (S). The resulting output (O) is called the feature map or activation map and contains all the features computed from the input layers and filters.

Pooling layer
The pooling layer (POOL) downsamples the features and is typically applied after a convolutional layer. The two types of pooling operations are max and average pooling, where the maximum and the average value of the features is taken, respectively.

Fully connected layer
The fully connected layer (FC) operates on a flattened input where each input is connected to all the neurons. These layers are usually used at the end of the network to connect the hidden layers to the output layer, which helps in optimizing the class scores.

Text Classification using CNN
(An illustrative sketch of a CNN text classifier is given after the transcript.)

Long Short Term Memory (LSTM)
LSTMs, on the other hand, make small modifications to the information by multiplications and additions. With LSTMs, the information flows through a mechanism known as cell states. This way, LSTMs can selectively remember or forget things. The information at a particular cell state has three different dependencies, which can be generalized to any problem as:
○ The previous cell state (i.e. the information that was present in the memory after the previous time step)
○ The previous hidden state (i.e. the same as the output of the previous cell)
○ The input at the current time step (i.e. the new information that is being fed in at that moment)

Word2Vec Embedding
Word2Vec was developed at Google by Tomas Mikolov et al. and uses neural networks to learn word embeddings. The beauty of word2vec is that the vectors are learned by understanding the context in which words appear. The result is vectors in which words with similar meanings end up with a similar numerical representation. Word2Vec is composed of two different learning models, CBOW and Skip-Gram. CBOW stands for the Continuous Bag of Words model, which predicts a word given its surrounding context. The Skip-Gram model is the opposite, learning word embeddings by training a model to predict the context given a word.

Named Entity Recognition
In natural language processing, Named Entity Recognition (NER) is a process where a sentence or a chunk of text is parsed to find entities that can be put under categories like names, organizations, locations, quantities, monetary values, percentages, etc.
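Illustrative sketches

The three CNN layer types described above can be illustrated with a small, self-contained NumPy sketch. This is an assumption-laden toy example (random 8x8 "image", one 3x3 filter, three output classes), not code from the lecture; it just shows the convolution output-size rule O = (I - F)/S + 1, max pooling, and a flattened fully connected layer.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Convolutional layer: one filter, 'valid' padding.
    Output size follows O = (I - F) / S + 1."""
    I, F = image.shape[0], kernel.shape[0]
    O = (I - F) // stride + 1
    fmap = np.zeros((O, O))
    for i in range(O):
        for j in range(O):
            patch = image[i*stride:i*stride+F, j*stride:j*stride+F]
            fmap[i, j] = np.sum(patch * kernel)   # element-wise product, then sum
    return fmap

def max_pool(fmap, size=2):
    """Pooling layer: downsample by taking the max of each size x size block."""
    O = fmap.shape[0] // size
    return np.array([[fmap[i*size:(i+1)*size, j*size:(j+1)*size].max()
                      for j in range(O)] for i in range(O)])

rng = np.random.default_rng(0)
image  = rng.random((8, 8))        # toy 8x8 grayscale "image" (assumed for illustration)
kernel = rng.random((3, 3))        # one 3x3 filter

fmap   = conv2d(image, kernel)     # 6x6 feature map, since (8 - 3)/1 + 1 = 6
pooled = max_pool(fmap)            # 3x3 after 2x2 max pooling

# Fully connected layer: flatten, then connect every input to every output neuron.
flat = pooled.flatten()                          # 9 inputs
W, b = rng.random((3, flat.size)), rng.random(3)  # weights for 3 hypothetical classes
class_scores = W @ flat + b
print(class_scores)
```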
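For the "Text Classification using CNN" slide, a minimal sketch of the usual architecture (embedding, 1-D convolution, pooling, fully connected output) is given below, assuming TensorFlow/Keras. The vocabulary size, sequence length, filter counts, and number of classes are placeholder values, not taken from the lecture.

```python
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE, SEQ_LEN, NUM_CLASSES = 20000, 100, 4   # placeholder hyperparameters

model = tf.keras.Sequential([
    layers.Input(shape=(SEQ_LEN,)),                   # padded sequences of token ids
    layers.Embedding(VOCAB_SIZE, 128),                # word indices -> dense vectors
    layers.Conv1D(64, 5, activation="relu"),          # convolutional layer over word windows
    layers.GlobalMaxPooling1D(),                      # pooling layer
    layers.Dense(64, activation="relu"),              # fully connected layer
    layers.Dense(NUM_CLASSES, activation="softmax"),  # class scores
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# Training would then be: model.fit(padded_token_ids, labels, epochs=..., validation_split=...)
```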
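The LSTM description above (a cell state updated by multiplications and additions, depending on the previous cell state, the previous hidden state, and the current input) can be written out as one time step. This is a generic NumPy sketch of the standard LSTM gate equations, under assumed toy dimensions, not code from the lecture.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. The arguments are exactly the three dependencies:
    the current input x_t, the previous hidden state h_prev, and the previous cell state c_prev."""
    n = h_prev.size
    z = W @ x_t + U @ h_prev + b       # pre-activations for all four gates
    f = sigmoid(z[:n])                 # forget gate: what to drop from memory
    i = sigmoid(z[n:2*n])              # input gate: what new information to store
    g = np.tanh(z[2*n:3*n])            # candidate values for the cell state
    o = sigmoid(z[3*n:])               # output gate
    c_t = f * c_prev + i * g           # multiply (forget) and add (remember)
    h_t = o * np.tanh(c_t)             # new hidden state = the cell's output
    return h_t, c_t

rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3                       # assumed toy sizes
W = rng.standard_normal((4 * hidden_size, input_size))
U = rng.standard_normal((4 * hidden_size, hidden_size))
b = np.zeros(4 * hidden_size)

x_t    = rng.standard_normal(input_size)   # input at the current time step
h_prev = np.zeros(hidden_size)             # previous hidden state
c_prev = np.zeros(hidden_size)             # previous cell state
h_t, c_t = lstm_step(x_t, h_prev, c_prev, W, U, b)
print(h_t, c_t)
```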
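A minimal sketch of training CBOW and Skip-Gram embeddings is shown below using the gensim library (an assumed tool choice; the lecture does not name an implementation). The tiny corpus is made up purely for illustration.

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenised sentences (made up for illustration).
sentences = [
    ["neural", "networks", "learn", "word", "embeddings"],
    ["words", "with", "similar", "meanings", "get", "similar", "vectors"],
    ["word2vec", "uses", "neural", "networks", "to", "learn", "embeddings"],
]

# sg=0 -> CBOW (predict a word from its context); sg=1 -> Skip-Gram (predict the context from a word)
cbow      = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)
skip_gram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

print(cbow.wv["neural"][:5])                  # first few dimensions of one embedding
print(skip_gram.wv.most_similar("neural"))    # nearest neighbours in embedding space
```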
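Finally, for Named Entity Recognition, a short sketch using spaCy (an assumed library choice) shows a sentence being parsed into entity spans with category labels such as persons, organizations, locations, and monetary values. The small English model must be installed first (python -m spacy download en_core_web_sm), and the example sentence is invented for illustration.

```python
import spacy

nlp = spacy.load("en_core_web_sm")   # pre-trained English pipeline with an NER component

text = ("Google was founded by Larry Page and Sergey Brin in California "
        "and is now worth over $1 trillion.")
doc = nlp(text)

# Each entity span gets a label such as PERSON, ORG, GPE (location), MONEY, PERCENT, QUANTITY.
for ent in doc.ents:
    print(f"{ent.text:15} -> {ent.label_}")
```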
