Model fitting, the perceptron and backpropagation (lectures 4-5-6)

Study Notes

All models are approximations of the world and are never 100% accurate, but they can be useful for understanding specific problems.
Descriptive models aim to fit the data and provide insights about the data itself.
Process models provide information about the underlying process and can be generative, but are harder to formulate.

If two models accurately describe the data, the simpler one is preferred.
This is because simpler models are more likely to generalize and less likely to overfit the data.

Overfitting occurs when a model has too many parameters and fits the noise in the training data, rather than the underlying pattern.
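This can be detected and limited by using cross-validation or information criteria such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), which penalize additional parameters.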
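
As a rough illustration of how information criteria trade fit against complexity, the sketch below uses the standard definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of parameters, n the number of data points, and ln L the maximized log likelihood. The notes do not give these formulas, and the two models and their log likelihoods are made-up numbers chosen only for the example.

```python
import math

def aic(log_likelihood, n_params):
    # Akaike information criterion: lower is better
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_points):
    # Bayesian information criterion: the penalty grows with the sample size
    return n_params * math.log(n_points) - 2 * log_likelihood

# Hypothetical models fitted to the same 100 data points (invented values)
simple_model = {"log_likelihood": -100.0, "n_params": 2}
complex_model = {"log_likelihood": -99.0, "n_params": 5}

aic_simple = aic(simple_model["log_likelihood"], simple_model["n_params"])
aic_complex = aic(complex_model["log_likelihood"], complex_model["n_params"])
bic_simple = bic(simple_model["log_likelihood"], simple_model["n_params"], 100)
bic_complex = bic(complex_model["log_likelihood"], complex_model["n_params"], 100)
```

With these invented numbers the complex model fits slightly better, but both criteria come out lower for the simple model, so the simple model would be preferred.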

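Maximum likelihood is related to Bayes' theorem, which combines evidence (the likelihood) with prior expectations, but maximum likelihood itself uses only the likelihood and ignores the prior.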
It involves finding the parameters that maximize the likelihood of the data.
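
A minimal sketch of this idea, assuming a coin-flip (Bernoulli) model that is not part of the notes: the log likelihood of a candidate probability p is computed for a small made-up data set, and the maximum likelihood estimate is the observed fraction of ones.

```python
import math

def bernoulli_log_likelihood(p, outcomes):
    # Sum of log probabilities of each outcome under success probability p
    return sum(math.log(p) if x == 1 else math.log(1.0 - p) for x in outcomes)

# Invented data: 7 successes in 10 trials
outcomes = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

# For a Bernoulli model the maximum likelihood estimate is the sample proportion
p_hat = sum(outcomes) / len(outcomes)

# The log likelihood at p_hat is higher than at other candidate values
ll_at_estimate = bernoulli_log_likelihood(p_hat, outcomes)
ll_at_half = bernoulli_log_likelihood(0.5, outcomes)
ll_at_high = bernoulli_log_likelihood(0.9, outcomes)
```

Working with the log likelihood rather than the likelihood itself turns a product of small probabilities into a sum, which is numerically safer and has its maximum at the same parameter value.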

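An optimizer, also known as a minimizer, finds the parameter values that minimize a cost function, such as the negative log likelihood.
Simple optimizers perform grid search, which can be computationally expensive because the number of grid points grows rapidly with the number of parameters.
Gradient descent is a more efficient optimizer that follows the slope of the cost function downhill to a minimum of the negative log likelihood, although it can stop at a local minimum.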
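
The two optimizers can be compared on a toy problem. The sketch below assumes a Gaussian model with unit variance and an unknown mean, so the negative log likelihood (constants dropped) is half the mean squared distance to the data. The data values, grid range, learning rate, and number of steps are arbitrary choices for illustration.

```python
def mean_nll(mu, data):
    # Mean negative log likelihood of a unit-variance Gaussian, constants dropped
    return sum((x - mu) ** 2 for x in data) / (2 * len(data))

def grid_search(data, lo=-5.0, hi=5.0, steps=1001):
    # Evaluate the cost at every grid point and keep the best one
    candidates = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(candidates, key=lambda mu: mean_nll(mu, data))

def gradient_descent(data, mu=0.0, learning_rate=0.1, n_steps=200):
    # Repeatedly step against the gradient of the cost with respect to mu
    n = len(data)
    for _ in range(n_steps):
        grad = sum(mu - x for x in data) / n
        mu -= learning_rate * grad
    return mu

data = [2.1, 1.9, 2.4, 2.0, 1.6]  # invented sample with mean 2.0
mu_grid = grid_search(data)
mu_gd = gradient_descent(data)
```

Here grid search needs 1001 cost evaluations for a single parameter and is limited to the grid resolution, while gradient descent reaches the same minimum in 200 gradient steps. With several parameters the grid grows multiplicatively, which is why grid search becomes expensive.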

Accuracy: how well the model fits the data.
Understanding: how well the model's components are understood and relate to the predicted outputs.

A perceptron is a simple model of a neuron that takes one or more inputs, scales each by a weight, sums the results (usually together with a bias term), and applies an activation function.
The perceptron is a classification model that outputs a binary decision.
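
A minimal sketch of a perceptron's forward computation, assuming a step activation function and a bias term. The weights below are hand-picked so that the unit computes a logical AND of two binary inputs; they are an illustrative choice, not learned values.

```python
def step(z):
    # Step activation: fire (1) when the summed input reaches the threshold
    return 1 if z >= 0 else 0

def perceptron(inputs, weights, bias):
    # Weighted sum of the inputs plus bias, passed through the activation
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(z)

# Hand-picked parameters that implement AND (assumption for the example)
weights, bias = [1.0, 1.0], -1.5
outputs = [perceptron([a, b], weights, bias) for a in (0, 1) for b in (0, 1)]
# outputs is [0, 0, 0, 1] for the inputs (0,0), (0,1), (1,0), (1,1)
```

The binary decision corresponds to which side of the line w1*x1 + w2*x2 + bias = 0 the input falls on.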
Neural networks are composed of multiple perceptrons and are organized in a hierarchical manner, with lower levels being more sensory and higher levels being more integrative.
Feedback connections are important in neural networks and are involved in predictive coding.

Artificial neural networks are inspired by the brain, but are simplified to focus on input-output transformations.
They do not capture the temporal dynamics and spatial complexity of real neural networks.
They are used for classification and other tasks, and are trained by repeating three steps: forward propagation to compute the output, backpropagation of the error to compute the gradients, and an update of the weights.
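
The training loop can be sketched on the smallest network that still needs backpropagation: one input, one sigmoid hidden unit, and one linear output, trained with a squared-error loss. The architecture, initial parameters, data, learning rate, and number of epochs are all assumptions made for this example and do not come from the lectures.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, p):
    # Forward propagation: input -> hidden activation -> output
    h = sigmoid(p["w1"] * x + p["b1"])
    y = p["w2"] * h + p["b2"]
    return h, y

def backward(x, target, h, y, p):
    # Backpropagation: apply the chain rule from the output back to the input
    dy = y - target                      # dLoss/dy for loss = 0.5 * (y - target)^2
    dz = dy * p["w2"] * h * (1.0 - h)    # error passed back through the sigmoid
    return {"w1": dz * x, "b1": dz, "w2": dy * h, "b2": dy}

def total_loss(data, p):
    return sum(0.5 * (forward(x, p)[1] - t) ** 2 for x, t in data)

def train(data, p, learning_rate=0.1, n_epochs=1000):
    # Repeat: forward pass, backward pass, gradient descent update
    for _ in range(n_epochs):
        for x, t in data:
            h, y = forward(x, p)
            grads = backward(x, t, h, y, p)
            for name in p:
                p[name] -= learning_rate * grads[name]
    return p

data = [(0.0, 0.0), (1.0, 1.0)]  # invented input-target pairs
params = {"w1": 0.5, "b1": 0.0, "w2": -0.3, "b2": 0.1}  # arbitrary starting point
initial_loss = total_loss(data, params)
train(data, params)
final_loss = total_loss(data, params)
```

Comparing `final_loss` with `initial_loss` shows whether the repeated forward and backward passes reduced the error. Larger networks follow the same pattern, with the backward pass passing the error through each layer in turn.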