30 Questions
What defines 'Deep' in the context of neural networks?
Having more than two layers in an architecture
In Deep Learning, what is the main benefit of using multiple layers?
Ability to represent complex non-linear functions compactly
Which theorem states that a feed-forward neural network with a single hidden layer containing sufficiently many neurons can approximate any continuous function on compact subsets of ℝⁿ?
Universal approximation theorem
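The theorem can be illustrated numerically. Below is a toy sketch (not a proof): a single hidden layer of tanh units with random input weights, where only the linear output weights are fitted by least squares, approximates sin(x) increasingly well as the layer widens. All sizes, seeds, and the target function are illustrative choices, not values from the quiz.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(x).ravel()

def hidden_layer_fit_error(n_hidden):
    # Random input weights/biases define the hidden features;
    # only the linear readout is trained, via least squares.
    W = rng.normal(size=(1, n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(x @ W + b)                      # hidden activations
    w_out, *_ = np.linalg.lstsq(H, y, rcond=None)
    return np.mean((H @ w_out - y) ** 2)

# A wider hidden layer gives a much closer fit to sin(x).
print(hidden_layer_fit_error(5), hidden_layer_fit_error(200))
```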
Why might visualization become difficult in a shallow network when dealing with unstructured data like an image dataset?
Lack of hidden layers for feature extraction
What is a potential issue with using a very large number of neurons in just one hidden layer of a neural network?
Overfitting
How does having a single hidden layer network with a large number of neurons lead to 0% training error?
By memorizing the training data
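Memorization can be sketched with any over-parameterized model, not just a neural network. Here a degree-5 polynomial passes exactly through 6 training points with purely random labels: training error is essentially zero even though there is nothing generalizable to learn. The data and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
x_train = np.linspace(-1.0, 1.0, 6)       # 6 distinct training inputs
y_train = rng.normal(size=6)              # random labels: pure noise

# As many parameters as data points: exact interpolation is possible.
coeffs = np.polyfit(x_train, y_train, deg=5)
y_pred = np.polyval(coeffs, x_train)

train_error = np.max(np.abs(y_pred - y_train))
print(train_error)   # essentially zero: the model memorized the data
```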
Which type of problem can benefit from using deep neural networks instead of shallow networks?
High-dimensional unstructured data problems
What role do hidden layers play in deep learning neural networks?
Extract and transform features hierarchically
What is one of the issues faced when using more layers in deep networks?
Vanishing gradients
Why does a (k−1)-layer network require an exponentially large number of hidden units to represent a function that a k-layer network can represent compactly?
To achieve the same level of representational capacity
What is one of the central problems in Machine Learning according to the text?
Overfitting
Why does gradient descent become very inefficient in deeper networks?
Complex structures in the parameter space being optimized
What happens to the gradients as more hidden layers are added to a deep network?
They vanish and become small relative to the weights
Why does adding more layers to a network not always lead to better performance?
Because vanishing gradients occur
What is a major issue with deep learning highlighted in the text?
Vanishing gradients
What upper bound on the sigmoid derivative is highlighted in the text?
It is always less than 0.3 (its maximum value is 0.25, attained at 0)
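The bound is easy to verify numerically: since σ'(x) = σ(x)(1 − σ(x)) and σ(x) ∈ (0, 1), the derivative peaks at 0.25 when σ(x) = 0.5, i.e. at x = 0.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-10, 10, 10001)            # grid includes x = 0
deriv = sigmoid(x) * (1.0 - sigmoid(x))    # sigma'(x) = sigma(x)(1 - sigma(x))
print(deriv.max())                         # 0.25, attained at x = 0
```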
What percentage of training accuracy is highlighted in the text?
100%
Which issue makes optimization difficult for deep learning networks?
Vanishing gradients
What type of points are mentioned as problematic in high-dimensional spaces?
Saddle points
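A minimal saddle point is f(x, y) = x² − y²: the gradient vanishes at the origin, so gradient descent sees a "flat" critical point, yet it is neither a minimum nor a maximum, which the mixed-sign Hessian eigenvalues confirm. This example function is my own illustration, not one from the text.

```python
import numpy as np

def grad(x, y):
    # Gradient of f(x, y) = x^2 - y^2
    return np.array([2.0 * x, -2.0 * y])

hessian = np.array([[2.0, 0.0],
                    [0.0, -2.0]])
eigvals = np.linalg.eigvalsh(hessian)

print(grad(0.0, 0.0))   # zero gradient: a critical point
print(eigvals)          # one positive, one negative: a saddle, not a minimum
```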
What combination is described as deadly for neural networks in the text?
Vanishing gradients combined with getting trapped in local minima or saddle points
Why is finding a minimum over the surface difficult in deep learning optimization?
Due to the presence of saddle points and steep gradients
What is a challenge that arises from each sigmoid derivative being less than 0.3?
Vanishing gradients
What is the purpose of back-propagation in Artificial Neural Networks (ANN)?
To compute the sensitivity of the error to the change in weights
What does the term 'Back-propagation' stand for in Artificial Neural Networks?
Backward propagation of Errors
Who popularized the Back-propagation method in Neural Networks?
Geoffrey Hinton
What problem can occur due to vanishing gradients in a Neural Network with many layers?
There will be no learning because back-propagation error to initial layers is almost zero
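A back-of-the-envelope sketch of why the back-propagated error is almost zero at the initial layers: each sigmoid layer multiplies the gradient by a factor of at most 0.25 (the maximum of the sigmoid derivative), so through k layers the error signal shrinks at least as fast as 0.25ᵏ. The layer count below is an illustrative choice.

```python
k_layers = 10
signal = 1.0
for _ in range(k_layers):
    signal *= 0.25   # best-case sigmoid derivative at each layer
print(signal)        # ~9.5e-7: almost no gradient reaches layer 1
```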
In simple cases, what action should be taken if the output of an Artificial Neural Network is too high?
Decrease the weights attached to high inputs
What does ANN stand for in the context of Artificial Neural Networks?
Artificial Neural Network
'Backpropagation' led to a renaissance in which area?
Neural Networks
'w_{t+1} = w_t − α · ∂E/∂W' represents which concept in Artificial Neural Networks?
Weight adjustment based on error sensitivity
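The update rule can be sketched for a single linear neuron with squared error E = (w·x − y)², where ∂E/∂w = 2(w·x − y)·x. The learning rate and training example below are illustrative assumptions, not values from the text.

```python
x, y = 2.0, 6.0          # one training example; the target weight is 3.0
w, alpha = 0.0, 0.05     # initial weight and learning rate (illustrative)

for _ in range(100):
    error = w * x - y
    dE_dw = 2.0 * error * x   # dE/dw for E = (w*x - y)^2
    w = w - alpha * dE_dw     # w_{t+1} = w_t - alpha * dE/dw

print(w)   # converges toward 3.0
```

Each step moves the weight opposite to the error's sensitivity, exactly the "weight adjustment based on error sensitivity" named in the answer.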
Learn about different learning methods for Artificial Neural Networks, including changing weights and back-propagation. Understand how to adjust weights based on the output and improve performance on real examples.