NN_241024-ocr.pdf
Full Transcript
AJOU UNIVERSITY
Neural Networks
Medical Artificial Intelligence (의료 인공지능)
So Yeon Kim
*Slides adapted from Roger Grosse

Recap: Traditional machine learning vs Deep learning
In deep learning, the model learns features from the data (data-driven).

Inspiration: The Brain
Our brain has neurons, each of which communicates with (is connected to) other neurons.

Inspiration: The Brain
For neural nets, we use a much simpler model neuron, or unit. By throwing together lots of these incredibly simplistic neuron-like processing units, we can do some powerful computations!

Single Linear Perceptron
A linear perceptron is an artificial version of a biological neuron, and it is a basic building block of deep learning.
[Diagram: artificial perceptron — input → linear combination → activation → output]

Multi-Layer Perceptron
We can connect lots of units together; units are grouped into layers. This gives a feed-forward neural network.
[Figure: input signals pass through an input layer, a first hidden layer, a second hidden layer, and an output layer to produce the output signals. Source: Haykin, 2009]

Multi-Layer Perceptron
Each layer connects N input units to M output units. In the simplest case, all input units are connected to all output units; we call this a fully connected layer. The output units are a function of the input units. A multilayer network consisting of fully connected layers is called a multilayer perceptron.

Activation function
Logistic: suffers from the vanishing gradient problem.
Tanh: better than logistic.
Rectified linear unit (ReLU): very popular in deep nets.

Feature Learning
Neural nets can be viewed as a way of learning features.

Training Neural Networks

Forward propagation
Send the input x via the hidden units z to the output h.

Backward propagation
Backpropagation enables us to train the hidden weights: the output error is propagated backward, the gradient of the cost function is calculated, and the weights are updated by an optimizer such as gradient descent.
[Figure: a single unit with inputs x1, x2, ..., xn, a net-input function, and a logistic activation o = σ(net) = 1 / (1 + e^(−net)). Image from http://test.basel.in/product/gradient-descent-back-propagation/]
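The forward and backward propagation slides describe the idea only in words. Below is a minimal sketch, assuming NumPy, of one training loop for a single-hidden-layer network with logistic (sigmoid) activations, matching the slide's o = σ(net) = 1/(1 + e^(−net)), and a squared-error cost minimized by gradient descent. The variable names (W1, W2, lr) and the toy data are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Toy data: 4 samples, 3 input features, 1 target each (illustrative only).
x = rng.normal(size=(4, 3))
t = rng.normal(size=(4, 1))

# Two fully connected layers: 3 inputs -> 5 hidden units -> 1 output.
W1 = rng.normal(scale=0.1, size=(3, 5)); b1 = np.zeros(5)
W2 = rng.normal(scale=0.1, size=(5, 1)); b2 = np.zeros(1)
lr = 0.1

for step in range(100):
    # Forward propagation: send input x via hidden z to output h.
    z = sigmoid(x @ W1 + b1)            # hidden activations
    h = sigmoid(z @ W2 + b2)            # network output
    loss = 0.5 * np.mean((h - t) ** 2)  # squared-error cost

    # Backward propagation: apply the chain rule from the output error
    # back through each layer to get the gradient of the cost function.
    dh = (h - t) / x.shape[0]           # dLoss/dh
    da2 = dh * h * (1 - h)              # through the output sigmoid
    dW2 = z.T @ da2
    db2 = da2.sum(axis=0)
    dz = da2 @ W2.T
    da1 = dz * z * (1 - z)              # through the hidden sigmoid
    dW1 = x.T @ da1
    db1 = da1.sum(axis=0)

    # Gradient descent weight update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```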
Deep learning = many layers of abstraction
As learning goes on, the hidden layers gradually "discover" salient features characterizing the training data. They do so by performing nonlinear transformations of the input data into a new space called the feature space.

Go deeper, more hidden layers
[Figure: Lippmann's table of decision regions — single-layer networks form half-planes bounded by a hyperplane, two-layer networks form convex open or closed regions, and three-layer networks form arbitrary regions (complexity limited by the number of nodes); the columns illustrate the exclusive-OR problem, classes with meshed regions, and the most general region shapes. The slide is annotated with the labels "Underfitting" and "Overfitting". Source: Lippmann, 1987]

Model complexity and Regularization

Model fitting
[Figure: three scatter plots illustrating under-fitting (too simple to explain the variance), appropriate fitting, and over-fitting (force-fitting; too good to be true).]

How to deal with overfitting?
Cross-validation or early stopping
Dimension reduction
Get more data, or use data augmentation
Regularization
[Figure: examples of data augmentation for images]

Regularization
Prefer a "simpler" hypothesis: cost function = loss function + regularization term.
L1 regularization (Lasso)
L2 regularization (Ridge)
Combination of the two (Elastic net)
Dropout regularization

Dropout regularization
Goal: regularize a neural net to prevent overfitting.
Randomly set neurons to zero during training, with some probability p (a minimal sketch follows after the closing slide).

Thank You! Q&A
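To make the dropout slide concrete, here is a minimal sketch of the idea, assuming NumPy. The rescaling by 1/(1 − p) follows the common "inverted dropout" convention, which the slides do not specify; the function name and toy activations are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p, training=True):
    """Zero each unit with probability p during training; identity at test time."""
    if not training or p == 0.0:
        return activations
    keep_mask = rng.random(activations.shape) >= p
    # Dividing by (1 - p) keeps the expected activation unchanged,
    # so no rescaling is needed at evaluation time (an assumed convention).
    return activations * keep_mask / (1.0 - p)

# Usage: apply to a layer's hidden activations during training only.
z = rng.normal(size=(4, 5))                  # e.g. hidden activations
z_train = dropout(z, p=0.5)                  # roughly half the units zeroed
z_test = dropout(z, p=0.5, training=False)   # unchanged at test time
```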