Summary

These slides explain artificial neural networks (ANNs), focusing on training methods, Boolean functions, learning paradigms (supervised and unsupervised), and backpropagation. They also discuss limitations and biological plausibility.

Full Transcript

ANNs (2): Training Networks

It has been mathematically proven that for every computable function there is a three-layered network that can compute it, so ANNs are as powerful as Turing machines. But the fact that for every solvable problem there is an ANN that can solve it does not tell us what that ANN is like. It can be mind-bogglingly complex to engineer networks to compute even relatively simple functions. The goal, therefore, is to take an ANN that does not solve the problem and train it to compute the solution.

Single-unit networks

Single-unit networks can function as logic gates: they can compute the basic binary Boolean functions. Because of this, networks built out of single-unit networks can compute any Boolean function whatsoever.

Boolean functions

Boolean functions are named after the mathematician and logician George Boole, inventor of Boolean algebra, Boolean functions, etc. They are functions that take truth values as inputs and return a truth value, where the truth values are TRUE and FALSE. Boolean functions can be of any (finite) arity: 0-ary (e.g. the constant TRUE), 1-ary (e.g. NOT), or binary (e.g. AND). AND, NOT, and OR are all Boolean functions, and any Boolean function can be represented by a truth table:

A      B      A AND B
FALSE  FALSE  ?
FALSE  TRUE   ?
TRUE   FALSE  ?
TRUE   TRUE   ?

Single-unit networks as Boolean functions

If we represent TRUE by 1 and FALSE by 0, then we can use single-unit networks to represent Boolean functions. The arity of the function is given by the number of inputs to the unit. The weights, activation function, and threshold need to be set so that the output is always 1 or 0, which is what a binary threshold activation function provides. A single-layer network can represent the Boolean functions AND and NOT (see the first code sketch after the Multi-layer Networks slide below).

Learning in Neural Networks

Neural networks are important because they allow us to model how information-processing capacities are learned. If we abstract away from learning, networks of single-unit networks are simply implementations of symbolic systems. There are two types of learning: supervised (requires feedback) and unsupervised (no feedback).

Unsupervised Learning

The simplest algorithms for unsupervised learning are forms of Hebbian learning. The basic principle: neurons that fire together, wire together. "When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."

Hebbian learning

Hebbian learning is standardly used in pattern associator networks, which are very good at generalizing patterns. Hebbian terms also feature in more complex learning rules (see the Hebbian code sketch below).

Perceptron Convergence Rule

Also known as the delta rule. It is distinct from Hebbian learning in that training depends upon the discrepancy between the actual output and the intended output, i.e. an error measure:

δ = intended output − actual output

Applying the delta rule

The delta rule gives an algorithm for changing the threshold and the weights as a function of δ and ε (a learning-rate constant):

ΔT = −ε × δ
ΔWi = ε × δ × Ii

(A code sketch of this training loop appears below.)

Perceptron convergence theorem

The perceptron convergence rule will converge on a solution in every case where a solution is possible, i.e. it will generate a set of weights and a threshold that compute any Boolean function that can be computed by a perceptron (i.e. a single-layer network).

Multi-layer Networks – the basic problem

Multilayer networks can be constructed to compute any Turing-computable function, but they cannot be trained using the perceptron convergence rule. Single-unit networks can be trained, but can only compute linearly separable functions. What was needed was an algorithm for training multilayer networks.
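To make the single-unit slides concrete, here is a minimal sketch of a single threshold unit computing AND and NOT, assuming TRUE = 1, FALSE = 0, and a binary threshold activation. The particular weights and thresholds are just one workable choice, not the only one.

```python
# A single unit with a binary threshold activation function.
# Output is 1 if the weighted sum of inputs reaches the threshold, else 0.
def threshold_unit(inputs, weights, threshold):
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# AND: both inputs must be 1 before the sum (1 + 1 = 2) reaches the threshold.
def AND(a, b):
    return threshold_unit([a, b], weights=[1, 1], threshold=2)

# NOT: a negative weight means an active input pushes the sum below the threshold.
def NOT(a):
    return threshold_unit([a], weights=[-1], threshold=0)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b))   # reproduces the AND truth table row by row
print(NOT(0), NOT(1))            # 1 0
```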
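For the Unsupervised Learning and Hebbian learning slides, here is a minimal sketch of a Hebbian weight update in a small pattern associator. The outer-product form of the rule and the learning rate are common textbook choices rather than anything fixed by the slides.

```python
import numpy as np

def hebbian_update(weights, pre, post, rate=0.1):
    """Hebbian rule: delta_w[i, j] = rate * post[i] * pre[j].
    Units that are active together have their connection strengthened."""
    return weights + rate * np.outer(post, pre)

# Associate an input pattern with an output pattern (pattern associator).
pre = np.array([1.0, 0.0, 1.0])      # activations of the input units
post = np.array([0.0, 1.0])          # activations of the output units
W = np.zeros((2, 3))
for _ in range(10):                  # repeated co-activation "wires together"
    W = hebbian_update(W, pre, post)
print(W)
print(W @ pre)                       # the trained weights now push pre toward post
```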
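For the Perceptron Convergence Rule slides, here is a minimal sketch of the delta rule exactly as stated above (ΔT = −ε × δ, ΔWi = ε × δ × Ii), training a single threshold unit on AND. AND is linearly separable, so the convergence theorem applies. The initial weights, learning rate, and number of passes are illustrative assumptions.

```python
# Delta rule for a single threshold unit:
#   delta  = intended output - actual output
#   T     += -rate * delta          (threshold update)
#   W[i]  +=  rate * delta * I[i]   (weight update)
def train_perceptron(examples, n_inputs, rate=0.25, epochs=50):
    weights = [0.0] * n_inputs
    threshold = 0.0
    for _ in range(epochs):
        for inputs, intended in examples:
            total = sum(w * x for w, x in zip(weights, inputs))
            actual = 1 if total >= threshold else 0
            delta = intended - actual
            threshold += -rate * delta
            weights = [w + rate * delta * x for w, x in zip(weights, inputs)]
    return weights, threshold

# AND is linearly separable, so the rule converges on a working solution.
and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, t = train_perceptron(and_examples, n_inputs=2)
print(w, t)   # a workable set of weights and a threshold for AND
```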
Backpropagation (a.k.a. the generalized delta rule)

Information is transmitted forwards through the network, while error is propagated backwards through it. The backpropagated error signal is used to adjust the weights to and from the hidden units.

Backprop

The algorithm needs a way of calculating error for hidden units, which do not have target activation levels. It does this by calculating, for each hidden unit, its degree of "responsibility" for the error at the output units. This error value is then used to adjust the weights of the hidden units (see the code sketch at the end of the transcript).

How successful is backprop?

Multilayer networks can compute any Turing-computable function, but it is not true that backpropagation will always converge on a solution. This is unlike the perceptron convergence rule, which is guaranteed to find a solution where one exists.

Backprop: Biological plausibility

There is no evidence that backpropagation takes place in the brain, and no evidence that individual neurons receive error signals from all the neurons to which they are connected. Very little biological learning seems to involve supervised networks.

PSSH vs. Connectionism

The physical symbol system hypothesis: a physical symbol system has the necessary and sufficient means for intelligent action. Connectionism: a connectionist network has the necessary and sufficient means for intelligent action.

Pluralism

We are a long way from fully understanding the mind. The ANN and PSS frameworks are each illuminating and useful in their own ways. Let's keep exploring the frameworks until we settle on one, or come to a better understanding of how they converge.
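As flagged in the Backprop slide above, here is a minimal sketch of backpropagation (the generalized delta rule) on XOR, a function no single-layer network can compute because it is not linearly separable. The 2-2-1 architecture, sigmoid activations, learning rate, and iteration count are illustrative assumptions, and, as the slides note, convergence is not guaranteed: a run can get stuck.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1 = rng.normal(size=(2, 2))   # input -> hidden weights
b1 = np.zeros(2)
W2 = rng.normal(size=(2, 1))   # hidden -> output weights
b2 = np.zeros(1)
rate = 0.5

for _ in range(20000):
    # Forward pass: information flows forwards through the network.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)

    # Backward pass: error is propagated backwards through the network.
    output_error = (T - output) * output * (1 - output)            # error at the output units
    hidden_error = (output_error @ W2.T) * hidden * (1 - hidden)   # each hidden unit's "responsibility"

    # Weight updates use the backpropagated error signals.
    W2 += rate * hidden.T @ output_error
    b2 += rate * output_error.sum(axis=0)
    W1 += rate * X.T @ hidden_error
    b1 += rate * hidden_error.sum(axis=0)

print(np.round(output, 2))   # typically close to the XOR targets 0, 1, 1, 0
```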
