Questions and Answers
What is the first split in decision trees?
- A split to the left
- Simply into two parts
- Represented as a tree
- The ratio of red/blue points (correct)
Into how many parts does the first split of a decision tree divide the data?
- Eight parts
- Four parts
- Two parts (correct)
- Three parts
What does a decision tree do at the first step?
- It makes a simple decision (correct)
- It defines guidelines
- It estimates variables
- It generates red and blue points
What is the effect of the left split in decision trees?
How can a decision tree be represented?
What ratio is used to determine the regions in decision trees?
What is the purpose of the second split in decision trees?
What is the maximum information gain when splitting?
What is the use of Gini impurity in the computation?
Under what conditions is information gain the better choice?
How many impurity measures are used in the identification problem?
What does 'overfitting' mean in the context of data mining?
How many examples of the maximum information criterion are there?
Which method should be used when Gini is not the most suitable?
What benefit is involved in the quantitative computation of Gini?
Which aspect of Gini impurity describes a probability?
When is the Gini impurity equal to 0?
Which of these is not part of the loss function?
What does G(p) mean in the context of Gini impurity?
What does $p_i$ indicate in the formula for Gini impurity?
Which of these is not a reason why 0-1 loss fails for decision trees?
How many classes are needed for maximum Gini impurity?
What is the maximum value of Gini impurity?
When the Gini impurity is at its minimum of 0, what kind of data set is it?
What does the Gini impurity indicate in binary classification?
Which of the following steps are used to evaluate the function fθ(x)?
What does 'greedy optimisation' mean in tree construction?
What happens if the training data contains only one class?
What is the objective when selecting the best splits?
What is L(yi, fθ(xi))?
What kind of loss does not work well in decision trees?
What does 'recurse' mean in the construction of decision trees?
What is used to decide which split is best?
What is the accuracy when split = 0.5 and feature = y?
What does an accuracy of 100% indicate in a continuous decision stump?
What is the accuracy when feature = x and split = -0.5?
What principle is used to partition the space in a continuous decision stump?
What does an accuracy of 0% mean?
What is the accuracy when feature = y and split = 0.5?
What prevents an accurate decision in a decision stump?
Which split is more effective on continuous data?
What can actually help improve the accuracy of a decision stump?
How is the accuracy described in a continuous decision stump?
What is not needed to optimise the parameters of a decision stump?
What is the exact method for calculating the accuracy?
Which element does not belong to a continuous decision stump?
Which statement about labels in the annotation process is true?
What does 'semi-supervised' mean in the context of machine learning?
What must be excluded in supervised algorithms?
Which of the following learning approaches is probabilistic?
Which statement about 'weakly-supervised' labels is false?
What defines 'active learning'?
What process does 'batch learning' not include?
Which statement about graphical models is true?
Flashcards
Ratio of red to blue: indicates the color of a region
First split: divide the area in half, then shade the left side using the red/blue point ratio
Recursive splitting: each half is recursively split again
Decision tree
Left half split
Recursive division
Shading based on ratio
Decision tree process
Decision stump
Continuous decision stump
Accuracy
Partition
Feature
Split
Finding Best Parameters
Evaluation
Average 0-1 Loss
Information gain
Impurity function
Gini impurity
Entropy
Information gain of a split
Heterogeneous split
Amount of information
Decision tree evaluation
Greedy optimisation of a decision tree
Leaf node
Tree recursion
Loss function
0-1 loss
Failure of 0-1 loss for decision trees
Labeled data
Unlabeled data
Supervised learning
Unsupervised learning
Semi-supervised learning
Weakly-supervised learning
Classification of algorithms
Answer quality
Study Notes
Machine Learning 1.02: Decision Trees
- Decision trees are a supervised classification algorithm.
- They are easy to understand.
- The next lecture will cover random forests, which are an extension of decision trees.
- Random forests were a leading approach in machine learning before deep learning.
What is the goal? I
- The goal is to learn a function, y = f(x), from data.
- x is an n-dimensional feature vector.
- y is an output class from {0, 1, ..., K-1} where K is the number of possible output classes.
- This is a supervised machine learning problem.
What is the goal? II
- The goal is to learn y = f(x) from data, specifically {(X1, Y1), (X2, Y2), ..., (XN, YN)}.
- Choosing the best function, f, requires a loss function.
What is the goal? III
- The task is to learn a function, y = f(x), from a data set {(X1, Y1), ..., (XN, YN)}.
- A suitable function f(x), given a choice of parameters, should minimise the total loss computed via a loss function L(yi, f(xi)), as written out below.
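As a compact statement of this objective (a standard empirical-risk formulation, not reproduced in the slides; θ denotes the parameters, matching the fθ notation used later in these notes):

$$\theta^{*} = \arg\min_{\theta} \sum_{i=1}^{N} L\bigl(y_i, f_{\theta}(x_i)\bigr)$$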
Decision Stump I
- An algorithm from a previous lecture is revisited: it evaluates candidate features and match values to determine the best fit to the data, and its parameters are discussed.
Decision Stump II
- The algorithm from the previous lecture is restated in the current notation, introducing the parameters, the Kronecker delta function, and a loss function (a standard form of the loss is given below).
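The slides' exact formula is not reproduced in these notes, but a 0-1 loss written with the Kronecker delta, consistent with the functions named above, is typically:

$$L\bigl(y_i, f_{\theta}(x_i)\bigr) = 1 - \delta_{y_i,\, f_{\theta}(x_i)}$$

so the loss is 0 when the prediction matches the label and 1 otherwise.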
Decision Stump III
- The function space learned by the decision stump is explained.
Continuous decision stump I
- The lecture discusses how to deal with continuous input features.
- A function, f(x), is created for partitioning the space; a minimal sketch follows.
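A minimal sketch of such a function, assuming the stump thresholds a single feature and returns one of two class labels (the names feature, split, left_class, and right_class are illustrative, not from the lecture):

```python
def stump_predict(x, feature, split, left_class, right_class):
    """Continuous decision stump: threshold a single feature.

    x           -- feature vector (indexable sequence)
    feature     -- index of the feature to test
    split       -- threshold value for the test
    left_class  -- class returned when x[feature] <= split
    right_class -- class returned otherwise
    """
    return left_class if x[feature] <= split else right_class
```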
Continuous decision stump II
- Examples of continuous decision stumps for different splits are demonstrated with associated accuracy results.
Continuous decision stump III
- Procedures are outlined to find optimal splits by sweeping the parameters and evaluating loss functions; a sketch of such a sweep follows.
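A brute-force sweep of the kind described, under the assumptions that the labels are 0/1, that the average 0-1 loss is the criterion, and that candidate thresholds are taken midway between consecutive sorted feature values (an implementation choice, not from the lecture):

```python
import numpy as np

def fit_stump(X, y):
    """Sweep (feature, split) pairs, keeping the lowest 0-1 loss.

    X -- (N, n) numpy array of continuous features
    y -- (N,) numpy array of 0/1 class labels
    """
    best, best_err = None, np.inf
    for feature in range(X.shape[1]):
        values = np.sort(np.unique(X[:, feature]))
        # Candidate splits halfway between consecutive feature values.
        for split in (values[:-1] + values[1:]) / 2:
            mask = X[:, feature] <= split
            for left, right in [(0, 1), (1, 0)]:
                pred = np.where(mask, left, right)
                err = np.mean(pred != y)  # average 0-1 loss
                if err < best_err:
                    best, best_err = (feature, split, left, right), err
    return best, best_err
```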
Continuous decision stump IV
- The limitations of axis-aligned separation in real-world data are highlighted.
Decision trees I
- Recursive splitting is introduced to improve upon the limitations of previous decision stumps.
- The result can be viewed as a tree structure.
Decision trees II
- The parameter of the function is a binary tree.
- Two types of nodes are distinguished in the tree: internal nodes that execute the splits and leaf nodes containing the final answers.
Decision trees III
- A method to evaluate the function fθ(x) through the binary tree is explained; a minimal sketch follows.
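A minimal sketch of that evaluation, assuming each internal node stores a feature index and threshold and each leaf stores a class label (the field names are illustrative):

```python
class Node:
    """Binary-tree node: internal nodes split, leaf nodes answer."""
    def __init__(self, feature=None, split=None,
                 left=None, right=None, label=None):
        self.feature = feature  # index of the feature tested (internal)
        self.split = split      # threshold for the test (internal)
        self.left = left        # subtree for x[feature] <= split
        self.right = right      # subtree for x[feature] > split
        self.label = label      # class to return (leaf)

def tree_predict(node, x):
    """Evaluate f_theta(x) by descending from the root to a leaf."""
    if node.label is not None:  # leaf node holds the final answer
        return node.label
    if x[node.feature] <= node.split:
        return tree_predict(node.left, x)
    return tree_predict(node.right, x)
```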
Decision trees IV
- Explains greedy optimisation of the parameters (including a brute-force search at each node), tree construction, and ways to avoid overfitting such as limiting the tree depth and enforcing a minimum leaf node size (a construction sketch, reusing the Node class above, follows).
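A sketch of this greedy, recursive construction, reusing the Node class from the previous sketch and assuming the weighted Gini impurity covered in the sections below as the split criterion; max_depth and min_leaf_size stand in for the early-stopping hyperparameters, and all names are illustrative:

```python
import numpy as np
from collections import Counter

def gini(y):
    """Gini impurity 1 - sum_i p_i^2 for integer labels 0..K-1."""
    p = np.bincount(y) / len(y)
    return 1.0 - np.sum(p ** 2)

def build_tree(X, y, depth=0, max_depth=3, min_leaf_size=5):
    """Greedily grow a tree, stopping early to limit complexity."""
    # Early stopping: depth limit, small node, or already pure.
    if depth >= max_depth or len(y) <= min_leaf_size or gini(y) == 0.0:
        return Node(label=Counter(y).most_common(1)[0][0])
    best, best_score = None, np.inf
    for feature in range(X.shape[1]):          # brute force at each node
        for split in np.unique(X[:, feature])[:-1]:
            mask = X[:, feature] <= split
            # Impurity of the two halves, weighted by their sizes.
            score = (mask.mean() * gini(y[mask])
                     + (~mask).mean() * gini(y[~mask]))
            if score < best_score:
                best, best_score = (feature, split), score
    if best is None:                           # no usable split exists
        return Node(label=Counter(y).most_common(1)[0][0])
    feature, split = best
    mask = X[:, feature] <= split
    return Node(feature=feature, split=split,
                left=build_tree(X[mask], y[mask], depth + 1,
                                max_depth, min_leaf_size),
                right=build_tree(X[~mask], y[~mask], depth + 1,
                                 max_depth, min_leaf_size))
```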
Decision trees V
- The need for a loss function, L(y_i, f_{θ}(x_i)), is emphasized.
- The 0-1 loss is unsuitable for decision trees but works well for decision stumps.
Gini Impurity
- Gini impurity, a suitable loss function, is explained.
- It measures the probability that two randomly selected items from the data set will have different classes.
- The lowest value of Gini Impurity is 0.0, which implies that all items from the data set have the same class.
- The highest value of Gini impurity is less than 1.0; it occurs when the items are spread evenly across the classes (e.g. when each data point in the data set belongs to a different class).
- A formula for calculating Gini impurity is presented (reconstructed below).
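The standard formula, consistent with the G(p) and $p_i$ notation used in the quiz questions above, where $p_i$ is the fraction of items belonging to class i out of K classes:

$$G(p) = \sum_{i=1}^{K} p_i (1 - p_i) = 1 - \sum_{i=1}^{K} p_i^{2}$$

This takes its minimum of 0 when a single $p_i$ equals 1 (all items share one class) and its maximum of $(K-1)/K < 1$ when the classes are equally likely, matching the bounds stated above.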
Weighting the split
- Explains how to weight the impurities of the two sides of a split to produce a single number (a standard form is given below).
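A standard weighting, assuming each side is weighted by its share of the data ($N_L$ and $N_R$ points go left and right out of $N$ in total):

$$G_{\text{split}} = \frac{N_L}{N}\, G(p^{L}) + \frac{N_R}{N}\, G(p^{R})$$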
Information gain
- Information gain measures the information gained by making a particular split in the data.
- It is calculated via a formula involving entropy; the standard definitions are given below.
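The standard definitions, consistent with the entropy reference above and the same $N_L/N$ weighting as for Gini impurity:

$$H(p) = -\sum_{i=1}^{K} p_i \log_2 p_i, \qquad \mathrm{IG} = H(p^{\text{parent}}) - \frac{N_L}{N} H(p^{L}) - \frac{N_R}{N} H(p^{R})$$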
Which loss function?
- Comparison of the characteristics of Gini impurity and information gain as loss functions.
- Gini impurity is faster to compute than information gain.
Overfitting
- The concept of overfitting in decision trees is explained.
- It occurs when a decision tree learns the details of the training data too well, including random noise.
- The solution is discussed via Early Stopping.
Early stopping
- Methods for avoiding overfitting by stopping early, including limiting tree depth and minimum leaf node size.
- Prevents the decision tree function from becoming too complicated.
- Hyperparameters are introduced as the values that control these stopping criteria (an illustration follows).
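For illustration, both early-stopping controls map directly onto hyperparameters of scikit-learn's decision tree (a usage sketch; scikit-learn is not part of the lecture material):

```python
from sklearn.tree import DecisionTreeClassifier

# Early stopping via hyperparameters: cap the depth and require
# a minimum number of training samples in every leaf.
clf = DecisionTreeClassifier(criterion="gini",  # or "entropy"
                             max_depth=3,
                             min_samples_leaf=5)
# Typical usage: clf.fit(X_train, y_train); clf.predict(X_test)
```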
A glossary of ML problem settings
- A brief introduction to supervised machine learning, classification and regression.
Supervised learning: Classification
- Defines supervised classification as learning a function y = f(x) with a discrete output, y.
- Examples such as identifying camera trap animals from images, and predicting voting intention from demographic data are included.
Supervised learning: Regression
- Defines supervised regression as learning a function y = f(x) with a continuous output, y.
- Provides examples such as predicting the critical temperature of a superconductor given material properties, and inferring particle paths in high energy physics experiments.
Supervised learning: Further kinds
- Discusses multi-label classification, where the output variable y is a set (e.g. identifying multiple objects in an image), structured prediction (e.g. sentence tagging), and other cases where the output is structured.
Unsupervised learning
- Describes unsupervised learning as finding patterns in data without any pre-defined outputs (or labels for the data).
- Provides examples such as clustering (grouping similar data points), density estimation, and dimensionality reduction.
Unsupervised learning: Clustering
- Describes grouping similar data points, with examples including finding co-regulated genes and identifying social groups.
- Includes real-world applications and diagrams.
Unsupervised learning: Dimensionality reduction
- Explains dimensionality reduction as a method to reduce the number of variables of input data while preserving important information.
*supervised
- Describes semi-supervised learning and weakly-supervised learning, which use fewer labeled data points plus lots of unlabeled data (or cheap, inaccurate labels).
- Introduces concepts from image recognition as an example.
Glossary
- A summary of the different types of classification, regression, and learning methodologies, including supervised and unsupervised methods and further categories.
Further categories
- Describes further ways to categorise methods, for example by answering style (point versus probabilistic estimate) and by workflow (e.g. batch, incremental, and active learning).
- Additional examples of categorisation by workflow or area (e.g. computer vision or natural language processing (NLP)) are provided.
Summary
- Describes the main topics covered in the presentation, as well as how the information has been presented.
- Includes a summary of topics that this course may touch on further (e.g. overfitting techniques).
Description
This quiz explores the primary splits of decision trees and their fundamental components. Students learn about the effects of the splits and the ratio used to determine regions. We also consider the usefulness of the second split in the context of decision trees.