Machine Learning Overview

Created by
@SoulfulDialect

Questions and Answers

Which of the following are three common types of tasks that machine learning can solve?

Classification, Regression, Clustering

What is the main difference between classification and regression tasks in machine learning?

Classification output is discrete class values, while regression output is continuous values.

Which type of learning involves building a model based on unlabeled input data?

Unsupervised Learning

In supervised learning, the program maps all inputs to outputs without training a model first.

False

Regression tasks aim to discover the dependency between attributes by expressing the sample mapping relationship using a ____________.

function

What is the purpose of regularization methods in a predictive algorithm?

To bias the model toward lower complexity and reduce the number of features.

What is a common example of an embedded feature selection method?

LASSO regression

In supervised learning, what does the training set consist of?

Features (attributes), target (label)

What is the purpose of the test set in supervised learning?

New data for evaluating model effectiveness

In the prediction phase of supervised learning, what is the label for a new record such as Marine from Miami, aged 45?

Unknown data

What is the purpose of reinforcement learning?

Reinforcement learning models learn from the environment, take actions, and adjust actions based on a system of rewards.

What is the dataset used in the training process called?

Training set

_______ is crucial to models and determines the scope of model capabilities.

Data

Features in machine learning models are usually numeric representations of input variables.

True

Match the feature selection method with its description:

Filter = Independent of models, evaluates feature-target correlation
Wrapper = Uses a prediction model to score a feature subset
Embedded = Treats feature selection as part of the model process

Study Notes

Machine Learning Overview

  • Machine learning, which includes deep learning, is the study of learning algorithms.
  • A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Machine Learning Algorithms

  • Machine learning algorithms are used for complex problems and for problems involving large amounts of data whose distribution function cannot be determined.

Differences Between Machine Learning Algorithms and Traditional Rule-based Methods

  • Rule-based methods use explicit programming to solve problems, whereas machine learning algorithms automatically learn rules from data.
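
As a rough illustration of this contrast, the sketch below compares a hand-written rule with a cutoff learned from labeled samples. The task (flagging spam by exclamation-mark count), the data, and every function name are hypothetical and chosen only for this example.

```python
def rule_based(count):
    # Explicit programming: a person chose the cutoff of 5 in advance.
    return count > 5

def learn_threshold(samples):
    """Return the cutoff that misclassifies the fewest labeled samples."""
    candidates = sorted({count for count, _ in samples})

    def errors(t):
        # Count samples where "count > t" disagrees with the label.
        return sum((count > t) != label for count, label in samples)

    return min(candidates, key=errors)

# Hypothetical (exclamation_count, is_spam) pairs.
training = [(0, False), (1, False), (2, False), (3, True), (4, True), (7, True)]
threshold = learn_threshold(training)

def learned(count):
    # The rule now comes from the data rather than from the programmer.
    return count > threshold

print(threshold, rule_based(4), learned(4))
```

With these samples, the hand-written rule and the learned rule disagree on a message with four exclamation marks, because the learned cutoff follows the training data.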

When to Use Machine Learning

  • Machine learning suits complex problems and problems involving large amounts of data whose distribution function cannot be determined.
  • Consider using machine learning when task rules or the data distribution change over time and programs must constantly adapt to new data.

Rationale of Machine Learning Algorithms

  • The target function f is unknown, so the learning algorithm cannot obtain f exactly.
  • Instead, it produces a hypothesis function g that approximates f but may differ from it.
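
A minimal sketch of this idea, assuming a straight-line hypothesis fitted by least squares: f below is a stand-in for the unknown target, and the noise values are invented so the example is deterministic. In practice only the noisy samples would be available, never f itself.

```python
def f(x):
    # Stand-in for the unknown target function (assumed for illustration).
    return 2.0 * x + 1.0

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
noise = [0.3, -0.2, 0.1, -0.3, 0.4]          # invented measurement noise
ys = [f(x) + e for x, e in zip(xs, noise)]   # what the learner actually sees

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(xs, ys)

def g(x):
    # Hypothesis function learned from the samples.
    return slope * x + intercept

print(f(10.0), g(10.0))  # close, but not identical
```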

Main Problems Solved by Machine Learning

  • Machine learning can solve many types of tasks, including classification, regression, and clustering.

Types of Machine Learning

  • Supervised learning: the program takes a known set of samples and trains an optimal model to generate predictions.
  • Unsupervised learning: the program builds a model based on unlabeled input data.
  • Semi-supervised learning: the program trains a model through a combination of a small amount of labeled data and a large amount of unlabeled data.
  • Reinforcement learning: the learning system learns behavior from its environment so as to maximize the value of a reward (reinforcement) signal.
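
To make the unsupervised case concrete, here is a small sketch of clustering unlabeled one-dimensional points with k-means. The algorithm choice, the data, and the starting centers are assumptions for illustration; the notes above do not name a specific clustering method.

```python
def kmeans_1d(points, centers, iterations=10):
    """Group unlabeled 1-D points around the given starting centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            # Assign each point to its nearest center.
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

points = [1.0, 1.5, 2.0, 8.0, 9.0, 10.0]   # no labels are provided
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers, clusters)
```

No labels appear anywhere in the input; the two groups emerge from the distances between points alone.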

Machine Learning Process

  • The machine learning process involves data preparation, model training, model evaluation, and model deployment.

Important Machine Learning Concepts

  • Dataset: a collection of data used in machine learning tasks, where each piece of data is called a sample.
  • Training set: dataset used in the training process, where each sample is called a training sample.
  • Test set: dataset used in the testing process, where each sample is called a test sample.
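
One common way to obtain the two sets is to shuffle a dataset and hold part of it back for testing. The sketch below assumes a 25% test share and a fixed seed; both values, and the helper name train_test_split, are choices made for this example.

```python
import random

def train_test_split(samples, test_ratio=0.25, seed=0):
    """Shuffle a copy of the dataset and split it into (training set, test set)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split repeatable
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset: each sample is a (feature, label) pair.
dataset = [(x, 2 * x) for x in range(8)]
train_set, test_set = train_test_split(dataset)
print(len(train_set), len(test_set))
```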

Data Overview

  • A typical dataset consists of features and labels.

Importance of Data Processing

  • Data is crucial to models and determines the scope of model capabilities.

Data Cleansing

  • Data preprocessing involves the following operations: filtering data, handling missing data, handling erroneous or abnormal values, merging data from multiple sources, and consolidating data.

Dirty Data

  • Raw data usually contains data quality problems, including incompleteness, noise, and inconsistency.
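
The three problems can be illustrated on a handful of hypothetical records. The sketch below normalizes inconsistent city spellings and replaces missing or implausible ages with the median of the valid ones; median imputation and the 0 to 120 age range are assumptions, and other strategies (dropping rows, for example) are equally valid.

```python
from statistics import median

# Hypothetical raw records showing the three problems.
raw = [
    {"city": "Miami", "age": 42},
    {"city": "miami ", "age": None},     # inconsistent spelling, missing value
    {"city": "New York", "age": 32},
    {"city": "Orlando", "age": 420},     # implausible value (noise)
    {"city": "Orlando", "age": 30},
]

def clean(records, min_age=0, max_age=120):
    """Return cleaned copies of the records; the input is left unchanged."""
    def is_valid(age):
        return age is not None and min_age <= age <= max_age

    # Inconsistency: trim whitespace and unify capitalization.
    cleaned = [{"city": r["city"].strip().title(), "age": r["age"]} for r in records]
    # Incompleteness and noise: fall back to the median of the valid ages.
    fill = median(r["age"] for r in cleaned if is_valid(r["age"]))
    for r in cleaned:
        if not is_valid(r["age"]):
            r["age"] = fill
    return cleaned

cleaned = clean(raw)
print(cleaned)
```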

Data Conversion

  • Preprocessed data needs to be converted into a representation suitable for machine learning models.
  • Data conversion involves encoding categorical data as numeric values, converting numeric data into categorical data, and feature engineering.
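
Two of these conversions can be sketched briefly: one-hot encoding turns a categorical column into numeric vectors, and binning turns a numeric column into categories. The city values, the age boundaries, and the group names are hypothetical.

```python
def one_hot(values):
    """Encode each categorical value as a 0/1 vector, one position per category."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]

def bin_age(age):
    # Numeric to categorical; the cut points are arbitrary for this example.
    if age < 30:
        return "young"
    if age < 50:
        return "middle"
    return "senior"

cities = ["Miami", "New York", "Miami", "Orlando"]
categories, encoded = one_hot(cities)
age_groups = [bin_age(a) for a in [25, 42, 67]]
print(categories, encoded, age_groups)
```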

Necessity of Feature Selection

  • Feature selection is necessary to simplify models, shorten training time, and improve model generalization.

Feature Selection Methods

  • Filter methods are independent of models during feature selection.
  • Filter methods score each feature with a statistical measure and then rank the features by score.
  • Based on the scores, specific features are kept or eliminated.
  • Common filter methods include:
    • Pearson correlation coefficient
    • Chi-square coefficient
    • Mutual information
  • Limitations of filter methods:
    • Tend to select redundant variables because they do not consider relationships between features.
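
A minimal sketch of a filter method, assuming the Pearson correlation coefficient as the score: each feature is scored against the target on its own and the top k are kept. The feature names and values are hypothetical, and height_in is deliberately a near copy of height_cm to show the redundancy limitation.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sx * sy)

def filter_select(features, target, k):
    """Score every feature independently and keep the k highest-scoring ones."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

features = {
    "height_cm": [170, 180, 190, 200, 210],
    "height_in": [66.9, 70.9, 74.8, 78.7, 82.7],   # near duplicate of height_cm
    "shoe_size": [44, 41, 45, 42, 43],
}
target = [0, 0, 1, 1, 1]
selected, scores = filter_select(features, target, k=2)
print(selected)
```

Both height columns are selected even though the second adds almost no new information, because no model and no feature-to-feature comparison is involved in the scoring.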

Wrapper Methods

  • Wrapper methods use a prediction model to score a feature subset and treat feature selection as a search problem.
  • Wrapper methods evaluate and compare different combinations of features.
  • Common wrapper method:
    • Recursive feature elimination
  • Limitations of wrapper methods:
    • A new model is trained for each feature subset, which can be computationally intensive.
    • The resulting feature set usually performs best only for the specific type of model used to select it.
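
The sketch below shows the wrapper idea with an exhaustive search over feature subsets rather than recursive feature elimination, which keeps the example short. Each subset is scored by the leave-one-out accuracy of a 1-nearest-neighbor classifier; the classifier, the search strategy, and the data are all assumptions for illustration.

```python
from itertools import combinations

def loo_accuracy(rows, labels, subset):
    """Leave-one-out accuracy of 1-nearest-neighbor using only the chosen columns."""
    correct = 0
    for i, row in enumerate(rows):
        # Find the closest other row, measuring distance on the subset only.
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((row[c] - rows[j][c]) ** 2 for c in subset),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(rows)

def wrapper_select(rows, labels):
    """Evaluate every non-empty feature subset and return the best (subset, score)."""
    n_features = len(rows[0])
    best_subset, best_score = None, -1.0
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            score = loo_accuracy(rows, labels, subset)
            if score > best_score:   # ties keep the earlier, smaller subset
                best_subset, best_score = subset, score
    return best_subset, best_score

# Hypothetical data: column 0 separates the classes, column 1 is noise.
rows = [(1.0, 5.0), (1.2, 1.0), (0.9, 9.0), (3.0, 5.2), (3.1, 1.1), (2.9, 9.1)]
labels = [0, 0, 0, 1, 1, 1]
best_subset, best_score = wrapper_select(rows, labels)
print(best_subset, best_score)
```

The number of subsets grows exponentially with the number of features, which is the computational cost noted above.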

Embedded Methods

  • Embedded methods treat feature selection as a part of the modeling process.
  • Regularization is the most common type of embedded method.
  • Regularization methods introduce additional constraints into the optimization of a predictive algorithm to bias the model toward lower complexity and reduce the number of features.
  • Common embedded method:
    • LASSO regression
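
As a sketch of how regularization performs selection, the code below fits LASSO by coordinate descent, minimizing (1/(2n)) * sum of squared errors + alpha * sum of |w_j|. It assumes centered data with no intercept, and the data and alpha value are hypothetical. This is one standard way to solve the problem, not necessarily the procedure the notes have in mind.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 if it falls inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso(X, y, alpha, iterations=100):
    """Coordinate descent for LASSO without an intercept."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iterations):
        for j in range(d):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(w[k] * X[i][k] for k in range(d) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, alpha) / z
    return w

# Hypothetical data: y depends strongly on column 0 and only weakly on column 1.
X = [[-2, 1], [-1, -1], [0, 0], [1, -1], [2, 1]]
y = [-5.9, -3.1, 0.0, 2.9, 6.1]

sparse = lasso(X, y, alpha=0.5)   # the L1 penalty removes the weak feature
dense = lasso(X, y, alpha=0.0)    # no penalty: ordinary least squares
print(sparse, dense)
```

With the penalty switched on, the weight of the weak feature becomes exactly zero, so the feature is dropped as a side effect of fitting the model.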

Supervised Learning Example

  • The learning phase involves training a classification model to determine whether a person is a basketball player based on specific features.
  • Features (attributes) include:
    • Service data
    • Name
    • City
    • Age
  • Target (label) is "yes" or "no" indicating whether a person is a basketball player.
  • The model is trained on a training set and evaluated on a test set.
  • In the prediction phase, the model is applied to new data to determine whether a person is a basketball player.
  • The model bases each prediction on individual features or combinations of features.
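
The three phases can be sketched with a toy nearest-neighbor classifier. All records below are hypothetical, the distance function is an arbitrary choice, and the sketch uses only city and age, treating the name as an identifier rather than a predictive feature.

```python
def distance(a, b):
    """Toy distance: a city mismatch counts as 10 years of age difference."""
    (city_a, age_a), (city_b, age_b) = a, b
    return (0 if city_a == city_b else 10) + abs(age_a - age_b)

def predict(training, features):
    # Return the label of the closest training sample.
    nearest = min(training, key=lambda sample: distance(sample[0], features))
    return nearest[1]

def accuracy(training, test):
    return sum(predict(training, f) == label for f, label in test) / len(test)

# Learning phase: ((city, age), label) pairs with known labels.
training_set = [
    (("Miami", 42), "yes"),
    (("New York", 32), "no"),
    (("Orlando", 18), "no"),
    (("Miami", 45), "yes"),
    (("Phoenix", 35), "no"),
]
# Evaluation: held-out samples whose labels are known but unseen by the model.
test_set = [(("Miami", 40), "yes"), (("Chicago", 30), "no")]
test_accuracy = accuracy(training_set, test_set)

# Prediction phase: new data with an unknown label.
prediction = predict(training_set, ("Miami", 45))
print(test_accuracy, prediction)
```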

Description

Learn about machine learning algorithms, types, and process. Understand key concepts like hyperparameters, gradient descent, and cross-validation.
