Artificial Intelligence and Machine Learning

Created by
@AppreciatedGoblin2889

Questions and Answers

Which statement best defines machine learning?

  • A branch of AI focused on explicit programming of tasks.
  • A technology that processes information without the need for data.
  • A field in AI that studies algorithms that can learn from data and generalize. (correct)
  • A method for transferring human intelligence to computers.

What characterizes supervised learning?

  • It involves trial and error to generate data.
  • Data is provided in a structured form with labels. (correct)
  • It requires fixed outputs for every potential input.
  • Data is presented without any labels.

In which application is unsupervised learning utilized?

  • Self-driving cars making decisions using labeled scenarios.
  • Email spam filtering based on keywords.
  • ChatGPT generating text without predefined labels. (correct)
  • Image classification using predefined categories.

What is a key feature of reinforcement learning?

  • It involves creating databases through trial and error.

What is a confusion matrix used for in machine learning?

  • To visualize the performance of a classification model.

Which approach involves finding the least error in predictions?

  • Using an error function to optimize model accuracy.

What is an example of the first widespread use of machine learning?

  • Email spam filters developed in the 1990s and 2000s.

What mathematical basis underpins the operations of computers in machine learning?

  • Statistics and mathematical functions.

What is the purpose of dropping the 'petal length' column during parameter selection?

  • To remove the target variable from the input features, since petal length is the value the model predicts.

When splitting the dataset using train_test_split, which parameter controls the randomness of the shuffle?

  • random_state

What does the mean_absolute_percentage_error function measure in the context of a regression model?

  • The average percentage error of the predictions.

How does a decision tree determine which feature to split on at the root node?

  • Based on the feature whose split best separates the classes.

What does reg.predict(X_test) do in the context of model training?

  • It predicts the target values for the test dataset.

Why might a model exhibit better error values on training data compared to test data?

  • The model is likely overfitting to the training data.

In machine learning classification tasks, what would be the most informative feature among the given options?

  • The fruit is 15cm long.

What is the primary goal when choosing a PC1-axis in PCA?

  • To maximize the variance of the projected points.

What is meant by a 'binary tree' in the context of decision trees?

  • A tree with two branches for each node representing two categories.

What does it indicate if a set of datapoints has a high variance?

  • There is a large spread among the datapoints.

Why is the factor (n-1) used when calculating variance?

  • To ensure an unbiased estimate of variance.

What relationship does the angle between PC1 and another variable's axis indicate?

  • The importance of the variable in relation to PC1.

What does the term 'perpendicular' refer to in the context of PCA?

  • The relationship between two dimensions.

In PCA, what is typically visualized to suggest input feature importance?

  • The angles between the principal components and the features.

Which statement correctly describes the effect of iteratively rotating the PC1 axis in PCA?

  • It redistributes the importance of the original features.

How is the projection of data points onto the PC1-axis optimized?

  • By minimizing distances of points to the PC1-axis.

What does the term 'garbage in, garbage out' in data pre-processing refer to?

  • The need to ensure quality data before analysis.

In the context of linear regression, what is the primary goal when fitting the function y = kx + m?

  • To minimize the total error across all training data points.

What is the effect of overfitting a model?

  • The model will fit the training data too closely and may not generalize well.

What is Mean Square Error (MSE) used for in evaluating model performance?

  • To calculate the average of squared errors across test samples.

How do artificial neural networks differ from linear regression?

  • Neural networks can incorporate non-linear relationships between inputs and output.

What is the baseline approach when a value needs to be replaced or is missing in a dataset?

  • Replacing the value with the mean of the remaining values of that feature.

What distinguishes non-linear regression from linear regression?

  • Non-linear regression includes non-linear functions affecting the output.

When using the training/testing split of 80/20, what is the purpose of this approach?

  • To reserve a portion of data for testing the model's performance.

What is hyperparameter grid search used for in random forest models?

  • To find the best setting of parameters that maximizes test accuracy.

What role does the validation set play in machine learning?

  • It is used for hyperparameter tuning during the model training.

In logistic regression, how is the baseline established?

  • By finding the closest match to a sample from the training data.

What does 'stratify' do in the context of splitting data into training and test sets?

  • Ensures that both sets keep the same class proportions as the full dataset.

How can one reduce the risk of overfitting in a model?

  • By implementing PCA to focus on the most important features.

What is one way to focus training on tricky classes in classification tasks?

  • Increasing the sample size for the more challenging classes.

Why might a model perform well on test data but poorly on training data?

  • The model is overly complex and not generalized.

What is one benefit of using data augmentation?

  • It helps create more balanced data sets by generating new samples.

    Study Notes

    Artificial Intelligence

    • Artificial Intelligence (AI) aims to use computers to mimic human intelligence.
    • Machine Learning (ML) is a subset of AI, focusing on algorithms that learn from data and generalize to unseen data.
    • ML can be categorized into three types:
      • Supervised Learning: Data is given with labels.
      • Unsupervised Learning: Data is given without labels.
      • Reinforcement Learning: Data is generated by trial and error.

    Machine Learning Algorithms

    • Classic ML algorithms include Decision Trees and Random Forest; the train-test split and the confusion matrix are the standard tools for evaluating them.
    • Artificial Neural Networks (ANNs) with many layers are commonly referred to as “Deep Learning”.
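
The pieces named above (a decision tree, a train-test split, and a confusion matrix) can be combined in a few lines. The following is a minimal sketch using scikit-learn's bundled Iris dataset; the dataset choice, `max_depth=3`, and the random seeds are assumptions made for illustration, not values from the lesson.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load features and class labels (three flower species)
X, y = load_iris(return_X_y=True)

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Fit a small decision tree on the training part only
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, clf.predict(X_test))
print(cm)
```

Entries on the diagonal of the matrix count correct predictions; everything off the diagonal is a misclassification.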

    Unsupervised Learning

    • Examples of Unsupervised Learning include:
      • ChatGPT: No labels are provided. The model looks for patterns in text data in order to model and generate natural language.

    Supervised Learning

    • Examples of Supervised Learning include:
      • Image Recognition: The model is trained on labeled images and learns to map each image to its label as accurately as possible.

    Reinforcement Learning

    • Examples of Reinforcement Learning include:
      • Self-Learning Autonomous Cars: Models learn by trial and error, generating their own database through interactions with the environment.

    Email Spam Filters

    • Spam filters utilize ML to identify spam messages.
    • The 1990s and 2000s saw a wide adoption of ML for email spam filtering.
    • Each email is labeled as “spam” or “not spam”, forming a dataset for training.

    Computer Intelligence

    • Computers use binary code (0 and 1) for representing data.
    • Numerical operations, mathematical functions, statistics, and programming are fundamental aspects of computer intelligence.
    • The speed and memory capabilities of computers allow them to excel in complex calculations and data processing.

    Typical ML Intelligence

    • Models are evaluated by assessing their accuracy on a test dataset.
    • The error the model makes on that data is the metric that indicates performance.
    • Training searches for a minimum of the error function, i.e. the parameter values that give the smallest prediction error.

    Principal Component Analysis (PCA)

    • PCA is used to reduce the dimensionality of data.
    • It transforms data into a lower number of principal components (PCs).
    • The first principal component (PC1) is the direction that maximizes the variance of the projected data points.
    • Each subsequent PC is orthogonal to all previous ones.
    • The angle between PC1 and each input feature's axis indicates how important that feature is to PC1: the smaller the angle, the larger the feature's contribution.
    • PCA allows for visualization of high-dimensional data and identification of key features.
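
The PCA properties listed above can be checked numerically. This is a sketch with NumPy on synthetic two-feature data; the data, the seed, and all variable names are hypothetical. It computes the principal components as eigenvectors of the sample covariance matrix, which is one common way to do PCA and not necessarily the procedure used in the lesson.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: the second feature is strongly correlated with the first
x = rng.normal(size=200)
data = np.column_stack([x, 0.5 * x + 0.1 * rng.normal(size=200)])

# Center the data, then build the sample covariance (np.cov divides by n - 1)
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)

# Eigenvectors of the covariance matrix are the principal directions
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]       # reorder to descending variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
pc1, pc2 = eigvecs[:, 0], eigvecs[:, 1]

# Project onto PC1: the variance of the projection equals the top eigenvalue
projected = centered @ pc1
print("variance along PC1:", projected.var(ddof=1))
print("dot product PC1 . PC2:", pc1 @ pc2)

# Angle (degrees) between PC1 and each original feature axis
angles = np.degrees(np.arccos(np.abs(pc1)))
print("angles to feature axes:", angles)
```

In this synthetic data the first feature has the larger spread, so PC1 lies closer to its axis and the first angle is the smaller one.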

    Examples of PCA Applications

    • Human-level control in deep reinforcement learning.
    • Analysis of E. coli presence at sea beaches.
    • Explainable AI (XAI) using PCA for visualization of complex neural networks.

    Data Preprocessing

    • Data preprocessing is an important step that enhances data quality.
    • It involves identifying and handling outliers, imputing missing values (for example with a baseline such as the feature mean), and splitting data into training and testing sets.
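
Two of these steps, mean imputation and the train-test split, might look as follows. The tiny feature matrix and labels are made up for illustration; `np.nan` marks the missing value, and the 80/20 ratio with `stratify` mirrors the options discussed in the questions above.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical feature matrix with one missing value (np.nan) in column 1
X = np.array([[1.0, 10.0], [2.0, np.nan], [3.0, 30.0], [4.0, 40.0],
              [5.0, 50.0], [6.0, 60.0], [7.0, 70.0], [8.0, 80.0],
              [9.0, 90.0], [10.0, 100.0]])
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

# Baseline imputation: replace each missing entry with the mean of the
# remaining values in the same column
col_means = np.nanmean(X, axis=0)
X_filled = np.where(np.isnan(X), col_means, X)

# 80/20 split; stratify=y keeps the class proportions in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X_filled, y, test_size=0.2, random_state=42, stratify=y
)
print(X_filled[1], y_test)
```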

    Linear Regression

    • A classic statistical technique used in regression analysis to predict continuous variables.
    • The goal is to fit a straight line y = kx + m to the data using the least squares method, minimizing the sum of squared errors.
    • The parameters k (slope) and m (intercept) are adjusted to minimize the total error; the fit is then evaluated with error metrics such as MSE (Mean Square Error) and MAE (Mean Absolute Error).
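
For a single input variable, the least squares line has a closed-form solution. The sketch below computes k and m directly with NumPy and then evaluates MSE and MAE; the five data points are invented for the example.

```python
import numpy as np

# Hypothetical training data, roughly following y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Closed-form least squares estimates for y = k*x + m
x_mean, y_mean = x.mean(), y.mean()
k = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
m = y_mean - k * x_mean

# Error metrics of the fitted line on the same points
pred = k * x + m
mse = np.mean((y - pred) ** 2)     # Mean Square Error
mae = np.mean(np.abs(y - pred))    # Mean Absolute Error
print(k, m, mse, mae)
```

Any other choice of k and m gives a larger sum of squared errors on these points, which is what "least squares" means.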

    Non-Linear Regression

    • Models non-linear relationships between the input variables and the output.
    • The model uses non-linear functions, such as exponential or logarithmic functions, to capture the curvature of the data.
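
As an illustration, an exponential curve y = a·exp(b·x) can be fitted by taking the logarithm of y and fitting a straight line to the result. This log-transform trick is one simple approach chosen for the sketch; the synthetic, noise-free data and the values of a and b are assumptions.

```python
import numpy as np

# Synthetic, noise-free data from y = 2 * exp(0.7 * x)
x = np.linspace(0.0, 4.0, 20)
y = 2.0 * np.exp(0.7 * x)

# A plain straight line fitted to (x, y) cannot follow the curvature
line = np.polyval(np.polyfit(x, y, 1), x)

# Fit log(y) = log(a) + b*x instead, then transform back
b, log_a = np.polyfit(x, np.log(y), 1)
a = np.exp(log_a)
curve = a * np.exp(b * x)

mse_line = np.mean((y - line) ** 2)
mse_curve = np.mean((y - curve) ** 2)
print(a, b, mse_line, mse_curve)
```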

    Python Code Example

    • The example demonstrates the implementation of linear regression in Python with the Scikit-learn library.
    • It predicts the petal length of a flower based on its other features.
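
The original code is not reproduced on this page. The following is a hedged reconstruction based on the function names that appear in the questions above (`train_test_split`, `random_state`, `reg.predict(X_test)`, `mean_absolute_percentage_error`). The use of scikit-learn's bundled Iris data, the column index for petal length, and the seed value are assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import train_test_split

# Iris columns: sepal length, sepal width, petal length, petal width (cm)
features, _ = load_iris(return_X_y=True)

# Petal length (column 2) is the target, so it is dropped from the inputs
y = features[:, 2]
X = np.delete(features, 2, axis=1)

# 80/20 split; random_state fixes the shuffle so the result is repeatable
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

reg = LinearRegression()
reg.fit(X_train, y_train)

# Predict the target values for the held-out test samples
y_pred = reg.predict(X_test)

# Average percentage error on training and test data
mape_train = mean_absolute_percentage_error(y_train, reg.predict(X_train))
mape_test = mean_absolute_percentage_error(y_test, y_pred)
print(mape_train, mape_test)
```

Comparing the two error values is a quick check for overfitting: a test error far above the training error suggests the model does not generalize.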

    Classification

    • Decision Trees are used for classification problems. They split the data on one feature at a time, which produces a binary tree.
    • Random Forest adds randomization by training many Decision Trees on random subsets of the data and combining their predictions, resulting in a model that is more robust and less prone to overfitting.
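
A hyperparameter grid search over a Random Forest can be sketched with scikit-learn's `GridSearchCV`. The grid values, the Iris dataset, and the seeds below are illustrative assumptions. Note that `GridSearchCV` picks the best setting by cross-validation on the training data, so the test set stays untouched until the final score.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Small, hypothetical grid: every combination is trained and validated
param_grid = {"n_estimators": [10, 50], "max_depth": [2, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3
)
search.fit(X_train, y_train)

# Best setting found on the validation folds, then scored once on test data
print(search.best_params_)
test_acc = search.score(X_test, y_test)
print(test_acc)
```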

    Logistic Regression

    • A classification model that predicts the probability of a sample belonging to a particular class.
    • It's often used in binary classification problems where the output is a probability between 0 and 1.
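
A minimal sketch of this behavior with scikit-learn's `LogisticRegression`, on a made-up one-dimensional dataset where class 1 becomes more likely as x grows:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: small x values belong to class 0, large ones to class 1
X = np.array([[0.5], [1.0], [1.5], [2.0], [3.0], [3.5], [4.0], [4.5]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X, y)

# predict_proba returns one column per class; column 1 is P(class = 1)
proba = clf.predict_proba(np.array([[1.0], [2.5], [4.0]]))[:, 1]
print(proba)
```

The predicted class is obtained by thresholding this probability, by default at 0.5.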

    Baseline Models

    • Baseline models serve as a reference point for evaluating the performance of other more complex models.
    • These models are typically simple, such as the k-nearest neighbors (KNN) classifier with k=1, using all features or only one feature.
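
Such a baseline might be set up as below, once with all features and once with a single feature. The Iris dataset and the choice of column 2 as the single feature are assumptions made for the sketch.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Baseline 1: nearest neighbor (k=1) using all features
baseline_all = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
acc_all = baseline_all.score(X_test, y_test)

# Baseline 2: nearest neighbor (k=1) using only one feature (column 2)
baseline_one = KNeighborsClassifier(n_neighbors=1).fit(X_train[:, [2]], y_train)
acc_one = baseline_one.score(X_test[:, [2]], y_test)

print(acc_all, acc_one)
```

A more complex model is only worth its cost if it clearly beats these reference accuracies.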

    MNIST Digit Recognition

    • The example demonstrates the use of Machine Learning for recognizing handwritten digits in the MNIST dataset.
    • It also focuses on strategies for improvement, such as prioritizing tricky classes, reducing overfitting, and performing data augmentation.

    Overfitting

    • Occurs when a model performs well on the training data but struggles with unseen test data.
    • This can be due to a complex model, too many features, or a limited amount of training data.
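
The gap between training and test performance can be made visible with a deliberately unconstrained model. The sketch below uses synthetic data from `make_classification` with noisy labels; the sample counts, noise level, and seeds are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: few informative features, many noisy ones, 20% label noise
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=3, flip_y=0.2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# A tree without depth limit can memorize the training set, noise included
deep = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
train_acc = deep.score(X_train, y_train)
test_acc = deep.score(X_test, y_test)
print(train_acc, test_acc)
```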

    Addressing Overfitting

    • Techniques to reduce overfitting include:
      • Reducing the number of features using PCA or feature selection methods.
      • Increasing the size of the training dataset.
      • Applying data augmentation to generate realistic artificial data.
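
Data augmentation for digit images can be as simple as shifting each image by one pixel. The sketch below uses scikit-learn's small bundled 8x8 digits dataset as a stand-in for MNIST (an assumption, chosen because it needs no download); the helper `shift` is hypothetical.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()
images, labels = digits.images, digits.target  # images: (1797, 8, 8)

def shift(batch, dx, dy):
    """Shift every image by dx columns and dy rows.

    np.roll wraps pixels around the border; this is tolerable here only
    because the digit images are mostly empty near the edges.
    """
    return np.roll(batch, shift=(dy, dx), axis=(1, 2))

# Original images plus copies shifted one pixel right and one pixel left
augmented = np.concatenate([images, shift(images, 1, 0), shift(images, -1, 0)])
aug_labels = np.concatenate([labels, labels, labels])
print(augmented.shape, aug_labels.shape)
```

Only the training set should be augmented; the test set keeps the original images so the evaluation stays honest.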

    Related Documents

    Inför delprov 1.pdf

    Description

    Explore the fundamental concepts of Artificial Intelligence and its subset, Machine Learning. This quiz covers types of machine learning, classic algorithms, and practical examples from supervised and unsupervised learning. Test your knowledge on how AI mimics human intelligence and the intricacies of different ML approaches.
