Decision Trees in AI

Questions and Answers

Decision tree algorithms fall under which category of machine learning?

  • Unsupervised learning
  • Supervised learning (correct)
  • Semi-supervised learning
  • Reinforcement learning

Decision tree algorithms can only be applied to classification problems and not regression problems.

False

Which of the following best describes decision trees?

  • A white box containing a set of rules. (correct)
  • A black box containing a set of rules.
  • A support vector machine.
  • A neural network with multiple hidden layers.

In a decision tree, the ______ of the tree test the attributes.

nodes

In the context of decision trees, what do the 'leaves' of the tree represent?

Classes

Describe the difference between binary classification and multi-class classification in the context of decision trees.

Binary classification involves classifying data into two classes, while multi-class classification involves classifying data into more than two classes.

In decision trees, there is only one possible correct decision tree for a given dataset.

False

Which of the following is the primary goal when constructing a decision tree?

To create a simple tree.

What does minimizing the expected number of tests help to achieve in the context of decision trees?

A simpler tree

The measure of ______ is used to minimize the expected number of tests when classifying in decision trees.

impurity

What is the purpose of calculating information gain?

To find the attribute that best splits the data

Define 'entropy' in the context of decision trees and information theory.

Entropy measures the amount of uncertainty or randomness in a set of data.

A higher entropy value signifies less uncertainty in a dataset.

False

What does a low information gain suggest about an attribute?

It does not provide much information for classification.

The attribute with the ______ information gain is chosen as the splitting attribute.

highest

Match the following terms with their descriptions in the context of decision trees:

  • Node = Represents a decision based on an attribute
  • Branch = Represents the outcome of a test
  • Leaf = Represents the final classification or prediction
  • Root = Represents the initial attribute for splitting

What is the first step in constructing a decision tree?

Preparing the dataset

Once a decision tree model is developed, it cannot be used for prediction on new data.

False

During the data preparation phase for a decision tree, which of the following is a typical task?

All of the above

Name two algorithms used to construct decision trees.

ID3 and CART

Flashcards

Supervised Algorithms

Algorithms that learn from labeled training data, used for classification or regression.

Regression Problem

Predicting a continuous value, like temperature or credit amount.

Classification Problem

Predicting a category, like whether an email is spam or not.

Decision Trees

Classifiers for entities in an attribute/value format.

Nodes

Test attributes in a decision tree.

Leaves

Indicate the classes in a decision tree.

Binary Classification

Two possible classes

Multi-class Classification

More than two possible classes

White Box

A model whose set of rules is visible and can be inspected.

Dataset

A collection of data used for building a model.

Data Division

Splitting data for training and evaluation.

Random Attribute Selection

Randomly choosing attributes for decision tree construction.

Algorithmic Attribute Selection

Attribute selection using a specific algorithm.

ID3

Iterative Dichotomiser 3

Pure Data

A set of data whose elements all belong to the same class.

Impure Data

A set of data whose elements belong to different classes.

Best Attribute

The most informative attribute for splitting the data.

Entropy

Measure of randomness or uncertainty in a dataset.

Information Gain

Expected reduction in entropy.

Tests Minimization

Minimizing the expected number of tests needed to classify a new object.

Study Notes

  • This document is a course about decision trees in Artificial Intelligence.
  • It is intended for the Licence SSD & MID level.
  • The course is prepared and taught by Pr. Sanae KHALI ISSA.

Introduction

  • Decision tree algorithms are an example of supervised machine learning algorithms.
  • They are applied to classification or regression problems, depending on the type of variables.
  • Continuous target variables correspond to regression problems.
  • Categorical target variables correspond to classification problems.
  • Examples of regression problems include predicting sales, the amount of a credit, and the temperature.
  • Examples of classification problems include predicting whether a person is diabetic, whether an email is spam, whom a citizen might vote for, and which club a person supports.
  • Decision tree algorithms are popular for classification problems.
  • An example shows how to construct a knowledge base to predict the state of a new patient (sick or healthy) based on symptoms.
  • A table shows the patient's temperature, sore throat, and whether they are sick.
  • A decision tree shows how to classify patients based on these attributes.

Algorithm Principle

  • Decision trees are classifiers for entities represented in an attribute/value format.
  • The nodes of the tree test the attributes.
  • There is a branch for each value of the tested attribute.
  • The leaves indicate the classes.
  • Binary classification has two classes.
  • Multi-class classification has multiple classes.
  • A decision tree is a white box containing a set of rules.
  • Based on the values of an attribute, you can know which class an element belongs to.
  • The classes C1, C2, C3, C7, C8, and C9 are shown, where V1 and V2 are the values of attribute A1, and V'1, V'2, and V'3 are the values of attribute A2.
  • Several rules express the class-determination logic, e.g., If (A1 = V1) and (A2 = V'1) then C1, and so on (see the sketch below).
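
To make the white-box idea concrete, here is a minimal Python sketch of the tree above written out as explicit if/then rules. Only the first rule, If (A1 = V1) and (A2 = V'1) then C1, is stated in the notes; the remaining class assignments are illustrative assumptions following the same pattern.

```python
def classify(a1, a2):
    """Return the class of an element from its attribute values A1 and A2."""
    if a1 == "V1":
        if a2 == "V'1":
            return "C1"   # If (A1 = V1) and (A2 = V'1) then C1 (from the notes)
        elif a2 == "V'2":
            return "C2"   # assumed assignment, following the figure's pattern
        else:             # A2 = V'3
            return "C3"   # assumed
    else:                 # A1 = V2
        if a2 == "V'1":
            return "C7"   # assumed
        elif a2 == "V'2":
            return "C8"   # assumed
        else:             # A2 = V'3
            return "C9"   # assumed

print(classify("V1", "V'1"))  # -> C1
```

Because every decision is an explicit test on an attribute value, the path from root to leaf can be read directly as a rule, which is what makes the model a white box.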

Decision Tree Example

  • Attributes or variables are features used to make decisions.
  • A table shows attributes such as the player's rating (Bon, Moyen, Mauvais) and the recorded result (Gagné, Nul).
  • A decision tree illustrates how to classify players based on their rating in order to predict the result.

Decision Tree Construction steps

  • Step 1: Find a dataset and prepare the data for division into training and test sets.
  • Step 2: Create a model by defining the general structure of the tree, specifying the root, branches, and leaves.
  • Step 3: Apply the developed model to make predictions, as sketched below.
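
As a concrete illustration of these three steps, here is a minimal sketch using scikit-learn, whose DecisionTreeClassifier implements CART. The tiny patient dataset is hypothetical, echoing the sick/healthy example from the introduction.

```python
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Step 1: prepare the dataset (attribute/value format) and divide it.
# Features: [temperature, sore_throat (0/1)]; label: 1 = sick, 0 = healthy.
X = [[39.5, 1], [37.0, 0], [38.2, 1], [36.8, 0], [40.1, 0], [37.2, 1]]
y = [1, 0, 1, 0, 1, 0]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0)

# Step 2: create the model; criterion="entropy" selects attributes by
# information gain, in the spirit of ID3/C4.5.
model = DecisionTreeClassifier(criterion="entropy")
model.fit(X_train, y_train)

# Step 3: apply the model to predict the state of a new patient.
print(model.predict([[38.9, 1]]))  # e.g. [1] -> sick
```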

Decision Tree Construction

  • To construct a decision tree, you should decide how to select the attribute that represents the root and the order of other attributes.
  • Method 1 is random selection of the attributes.
  • Method 2 is selecting attributes by applying a precise algorithm, such as ID3 (Iterative Dichotomiser 3), C4.5 and C5.0 (successors of ID3), or CART (Classification And Regression Tree).

Attributes Selection

  • The tree-construction algorithm is DT(T, E).
  • If all examples in E belong to the same class Ci (the data is pure), then assign the label Ci to the current node.
  • Otherwise (the data is impure), select the best attribute A with values v1, v2, ..., vn.
  • Partition E according to v1, ..., vn into E1, ..., En.
  • For j = 1 to n, call DT(Tj, Ej).
  • With A = {v1, v2, ..., vn} and E = E1 ∪ ... ∪ En (a runnable sketch follows).
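
A minimal Python sketch of this recursive procedure, under the assumption that each example is a dict of attribute values plus a "class" key and that the attribute-selection criterion is passed in as a function. The majority-class fallback for exhausted attributes is an addition not spelled out in the notes.

```python
from collections import Counter

def build_tree(examples, attributes, best_attribute):
    classes = [e["class"] for e in examples]
    if len(set(classes)) == 1:
        return classes[0]                 # data pure: label the node Ci
    if not attributes:
        return Counter(classes).most_common(1)[0][0]  # majority fallback
    A = best_attribute(examples, attributes)          # select best attribute A
    remaining = [a for a in attributes if a != A]
    node = {"attribute": A, "branches": {}}
    for v in {e[A] for e in examples}:                # partition E into E1..En
        Ej = [e for e in examples if e[A] == v]
        node["branches"][v] = build_tree(Ej, remaining, best_attribute)
    return node
```

With random attribute selection (Method 1), best_attribute would simply pick at random; the information-gain criterion defined later gives Method 2.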

Attributes Selection Example

  • In this example, a dataset is used to predict whether a game of tennis will take place, depending on the following parameters: Sky, Temperature, Humidity, and Wind.
  • The values of the target attribute Jouer (play) are NON or OUI.
  • E = {e1, e2, e3, e4, e5, e6, e7, e8, e9, e10, e11, e12, e13, e14}
  • The attributes are Sky, Temperature, Humidity, Wind
  • One approach to selecting attributes is random selection
  • Splitting on Sky creates three subsets, with branches for Soleil, Couvert, and Pluie.
  • Humidité and Vent are then used to divide the subsets further.
  • A short exercise: construct a decision tree for the same problem, this time choosing the attribute Température as the root.
  • The goal is to find the most representative attribute.

Attributes Selection Solution

  • The "Température" tree is shown.
  • The branches are "Frais", "Bon", and "Chaud".
  • Several correct decision trees can be constructed; they may be simple or complicated.
  • Objective: construct a simple tree.
  • The attribute to select is the one that leads to a simple tree.
  • Method: minimize the expected number of tests needed to classify a new object, by means of the impurity measure.

Impurity Measure Definition (Entropy)

  • Shannon's entropy generalizes Boltzmann's entropy from thermodynamics.
  • Shannon proposed it in 1949 to measure the entropy of discrete probability distributions.
  • It expresses the quantity of information, that is, the number of bits needed to specify the distribution.
  • The formula for information entropy is: I = -Σ Pi · log2(Pi)
  • Here Pi is the probability of class Ci within the dataset.
  • Example of entropy calculation, using the weather data with the Jouer attribute from the previous example.
  • n: number of examples | n1: number of elements in the NON class | P1: probability of the NON class | n2: number of elements in the OUI class | P2: probability of the OUI class.
  • Equation: I = -Σ Pi · log2(Pi), where the sum runs over the k classes.
  • Result: I = 0.940 bits

Impurity Measurement Solution

  • The dataset consists of 5 triangles and 9 squares.
  • Compute the entropy:
  • I = -Σ Pi · log2(Pi)
  • I = -P(triangle) · log2(P(triangle)) - P(square) · log2(P(square))
  • I = -(5/14) · log2(5/14) - (9/14) · log2(9/14)
  • I = 0.940 bits (verified in the sketch below)
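
This value is easy to check in a few lines of Python; the small entropy helper below is a sketch written for this document, not part of the course material.

```python
from math import log2

def entropy(counts):
    """I = -sum over classes of Pi * log2(Pi), computed from class counts."""
    n = sum(counts)
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)

# 5 triangles and 9 squares, as above
print(entropy([5, 9]))  # ~0.940 bits
```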

Entropy Gain Computation

  • Entropy gain associated with an attribute:
  • The best attribute is the one that maximizes the entropy gain
  • To compute the entropy gain associated with attribute A: Gain(S, A) = I(S) - Ires(A)
  • with Ires(A) = Σ P(v) · I(v), where v ranges over the values of A
  • Legend: S: the dataset | I(S): entropy of the dataset | A: the attribute name | Ires(A): the residual entropy of A | P(v): proportion of examples taking value v | v: a value of attribute A | I(v): entropy of the subset of examples with value v
  • A code sketch of this computation follows.
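
Under the same representation as the earlier build_tree sketch (examples as dicts with a "class" key), and reusing the hypothetical entropy helper above, Gain(S, A) can be sketched as:

```python
from collections import Counter

def residual(examples, A):
    """Ires(A) = sum over values v of A of P(v) * I(v)."""
    n = len(examples)
    total = 0.0
    for v in {e[A] for e in examples}:
        subset = [e for e in examples if e[A] == v]
        counts = list(Counter(e["class"] for e in subset).values())
        total += (len(subset) / n) * entropy(counts)
    return total

def gain(examples, A):
    """Gain(S, A) = I(S) - Ires(A)."""
    counts = list(Counter(e["class"] for e in examples).values())
    return entropy(counts) - residual(examples, A)
```

A criterion such as lambda ex, attrs: max(attrs, key=lambda a: gain(ex, a)) could then be passed as best_attribute to the earlier build_tree sketch, since the best attribute is the one that maximizes the gain.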

Example: Entropy Gain for the Color Attribute

  • S = {1, 2, ..., 14} | I(S) = 0.940 bits (already computed)
  • For the attribute Color:
  • v1, v2, v3 are the values of the attribute Color = {green, yellow, red}
  • P(v1) = P(green) = 6/14 | P(v2) = P(yellow) = 3/14 | P(v3) = P(red) = 5/14
  • Ires(Color) = P(v1) · I(v1) + P(v2) · I(v2) + P(v3) · I(v3)
  • Gain(S, Color) = 0.428

Example: Entropy Gain for the Outline Attribute

  • Using S = {1, 2, ..., 14}, with I(S) = 0.940 bits already calculated.
  • For the attribute Outline, the possible values are dashed and solid.
  • Given P(dashed) = 7/14 and P(solid) = 7/14.
  • Gain(S, Outline) = I(S) - Ires(Outline), with Ires(Outline) = P(dashed) · I(dashed) + P(solid) · I(solid)
  • Gain(S, Outline) = 0.940 - 0.789 = 0.151 bits.

Example: Entropy Gain for the Dot Attribute

  • S is the same dataset of 14 examples, with I(S) = 0.940 bits.
  • The attribute Dot has two values, "yes" and "no".
  • Then P(yes) = 6/14 and P(no) = 8/14, which are used to compute Gain(S, Dot).

Calculating Gain

  • Comparing the results, the attribute Color (with a gain of 0.428) gives the best split of the examples.
  • The tree is therefore built with Color at the root.
  • Redo the same computation to partition the two subsets E2 and E3, computing:
  • Gain(Outline) = ?
  • Gain (Dot) = ?

Calculating Gain: Solution

  • If Color = red and Dot = yes, then Shape = triangle; otherwise (Dot = no), Shape = square.
  • For the subset E3, the attribute Dot gives the maximal gain: Gain(E3, Dot) = 0.0971.
  • If Color = green and Outline = dashed, then Shape = triangle; if Color = green and Outline = solid, then Shape = square.
  • The final step repeats the gain computation on each remaining subset, selecting the attribute with the maximal gain, until the classification is complete.

Final Algorithm Summary

  • The final decision tree has the following structure:
  • Root node Color splits into red, yellow, and green.
  • The tree leads to a complete classification of the shapes, as sketched below.
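
In the nested-dict format of the earlier build_tree sketch, the final tree would look roughly as follows. The red and green branches follow the rules stated above; the class at the yellow leaf is not stated explicitly in the notes, so it is left as a placeholder.

```python
final_tree = {
    "attribute": "Color",
    "branches": {
        "red":    {"attribute": "Dot",
                   "branches": {"yes": "triangle", "no": "square"}},
        "green":  {"attribute": "Outline",
                   "branches": {"dashed": "triangle", "solid": "square"}},
        "yellow": "?",   # leaf class not stated in the notes
    },
}
```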
