Machine Learning Concepts Quiz
40 Questions

Questions and Answers

What is the primary goal of market segmentation?

  • Increase the size of the consumer market
  • Simplify product offerings for consumers
  • Divide the market into groups based on similar responses to marketing (correct)
  • Generate a single marketing message for all consumers

In matrix factorisation, what is typically true about the sizes of factor matrices U and V?

  • Both must be of the same size
  • k must be less than both n and p (correct)
  • U must be larger than V
  • k must be equal to n and p

Which of the following applications is NOT commonly associated with anomaly detection?

  • Manufacturing quality control
  • Network intrusion detection
  • Market segmentation (correct)
  • Fraud detection

What is the main objective of reinforcement learning?

  • Develop a mapping from states to actions to maximize rewards (correct)

Which method is particularly effective for structured data problems in machine learning?

  • Gradient boosting and tree-based methods (correct)

What characteristic defines structured data compared to unstructured data?

  • Structured data has a predictable format (correct)

In the context of matrix factorisation, what is the purpose of dimensionality reduction?

  • To simplify the dataset while retaining essential information (correct)

Which of the following best describes anomaly detection?

  • A technique for detecting significantly different data points (correct)

Why is it necessary to scale inputs before running the kNN algorithm?

  • To make the Euclidean distance calculation valid (correct)

What is a limitation of using linear regression compared to kNN?

  • It can only make predictions based on a linear function (correct)

What characterizes the kNN regression method?

  • It is based on the similarity of examples for predictions (correct)

Which of the following is true regarding the curse of dimensionality in kNN?

  • It hampers performance with many predictors (correct)

What is an advantage of linear regression over kNN?

  • It has low variance and is highly interpretable (correct)

Which statement about kNN is false?

  • It is highly interpretable and user-friendly (correct)

In what way does choosing different values of k impact the kNN model?

  • It influences the smoothness of the predictive function (correct)

What is a consequence of using kNN with a high number of predictors?

  • It can cause breakdown due to dimensionality issues (correct)

What is the primary range that max-abs scaling focuses on?

  • [-1, 1] (correct)

Which transformation would be most effective for reducing right skewness in data?

  • Log transformation (correct)

What is a primary advantage of robust scaling over other scaling methods?

  • It performs better with outliers. (correct)

In which situation might you want to create a dummy variable?

  • When a variable has many zeros. (correct)

What does the Box-Cox transformation require as input?

  • A transformation parameter λ and a shift parameter α. (correct)

What is a key characteristic of the Yeo-Johnson transformation?

  • It can handle both positive and negative predictors. (correct)

How should discrete predictors with many possible values be treated?

  • As continuous variables. (correct)

Why is encoding nominal variables necessary?

  • Algorithms often require numerical features. (correct)

What is a common issue to identify during univariate exploratory data analysis (EDA)?

  • High cardinality (correct)

Which measure of dependence is specifically used for continuous variables?

  • Pearson correlation (correct)

In bivariate exploratory data analysis (EDA), which aspect indicates a potential problem with model assumptions?

  • Non-constant error variance (correct)

Which of the following terms describes the process of preparing data for machine learning algorithms?

  • Feature engineering (correct)

What issue should be monitored during multivariate EDA?

  • Outliers (correct)

Which correlation coefficient is appropriate for analyzing ordered categorical variables?

  • Kendall’s τ rank correlation (correct)

Which of the following is an indicator of multicollinearity in multivariate data analysis?

  • Global correlation coefficients (correct)

What does feature engineering NOT typically involve?

  • Gathering additional data from external sources (correct)

What is the first step in the k-Nearest Neighbours (kNN) algorithm when making a prediction?

  • Finding the k training examples closest to the test input (correct)

What does the notation N_k(x, D) represent in kNN?

  • The set of indexes for the nearest neighbors (correct)

In kNN, what is the purpose of selecting the parameter k?

  • To average the response values for the nearest neighbors (correct)

If k = 1 in kNN, what will the prediction be based on?

  • The response value of the closest training example only (correct)

When k = 2 in kNN, how is the prediction calculated?

  • By averaging the response values of the two nearest neighbors (correct)

What type of learning method does k-Nearest Neighbours represent?

  • Supervised learning (correct)

In the provided example, what is the correct response predicted by kNN with k=1 for the test input with a salary of 59100?

  • 1006 (correct)

Which of the following statements about kNN is true?

  • kNN stores all training examples in memory for future predictions (correct)

    Study Notes

    Market Segmentation

    • Divides a diverse consumer market into groups based on preferences, requirements, and response tendencies.
    • Aims to make marketing more effective by targeting groups of consumers who respond similarly.

    Matrix Factorization

    • Decomposes an n × p matrix X into two factor matrices U and V, with dimensions n × k and p × k (k < min{n, p}), so that X ≈ UVᵀ.
    • Important in dimensionality reduction, simplifying datasets while preserving key information.
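
To make the shapes concrete, here is a minimal Python sketch. It assumes the usual convention that X is n × p and is approximated by the product of U and the transpose of V. The toy factors and the helper name `multiply_u_vt` are hypothetical and picked by hand; nothing here fits U and V to real data.

```python
def multiply_u_vt(U, V):
    """Return the n x p product U V^T for U (n x k) and V (p x k)."""
    return [[sum(u * v for u, v in zip(u_row, v_row)) for v_row in V]
            for u_row in U]

# Hand-picked toy factors: n = 4 rows, p = 3 columns, k = 1 latent factor.
U = [[1.0], [2.0], [3.0], [4.0]]   # n x k
V = [[1.0], [0.5], [2.0]]          # p x k

X_hat = multiply_u_vt(U, V)        # n x p reconstruction of X

n, p, k = len(U), len(V), len(U[0])
full_size = n * p                  # numbers needed to store X directly
factored_size = k * (n + p)        # numbers needed to store U and V instead
```

With k smaller than both n and p, the two factors together hold fewer numbers than X itself, which is where the dimensionality reduction comes from.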

    Anomaly Detection

    • Identifies data points significantly different from the rest, also known as outlier detection.
    • Used in fraud detection, network intrusion detection, and ensuring manufacturing quality.
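
As a rough illustration of flagging points that differ markedly from the rest, the sketch below uses a simple z-score rule. This is only one of many possible approaches and is not taken from the quiz material; the function name `flag_outliers`, the threshold of 2.0, and the readings are all assumptions made for the example.

```python
from statistics import mean, pstdev

def flag_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []          # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical sensor readings with one obviously unusual value.
readings = [10, 11, 9, 10, 12, 10, 50]
outliers = flag_outliers(readings)
```

Because an extreme value also inflates the mean and standard deviation used to judge it, this rule can miss outliers in small samples; median-based rules are a common alternative.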

    Reinforcement Learning

    • A machine learning method where an agent makes decisions through actions in an environment, receiving rewards as feedback.
    • Aims to determine the optimal policy to maximize cumulative rewards over time.
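
The two ideas above, a policy as a mapping from states to actions and a cumulative reward to maximize, can be sketched in a few lines of Python. The discount factor `gamma`, the state and action names, and the action-value numbers are hypothetical; the notes do not say whether rewards are discounted, so treat that part as an assumption.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, with the reward at step t weighted by gamma**t."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

def greedy_policy(action_values):
    """Map each state to the action with the highest estimated value."""
    return {state: max(actions, key=actions.get)
            for state, actions in action_values.items()}

# Hypothetical action-value estimates for a two-state problem.
action_values = {
    "low_battery": {"recharge": 1.0, "search": -0.5},
    "high_battery": {"recharge": 0.0, "search": 2.0},
}
policy = greedy_policy(action_values)
total = discounted_return([1, 0, 2], gamma=0.5)   # 1 + 0.5*0 + 0.25*2
```

Setting `gamma=1.0` gives the plain undiscounted sum of rewards.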

    k-Nearest Neighbors (kNN)

    • A predictive method that uses proximity to training examples in memory to predict outcomes for test inputs.
    • The prediction for a test input is the average response of the k closest training examples.
    • Inputs should be scaled before applying the algorithm, because neighbors are found using Euclidean distance, which is sensitive to the units of each predictor.
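
The steps above (scale the inputs, find the k closest training examples by Euclidean distance, average their responses) can be sketched as follows. The data set is invented for illustration and is not the example referred to in the quiz; the helper names `fit_scaler`, `scale`, and `knn_predict` are likewise hypothetical, and standardization is just one possible choice of scaling.

```python
import math
from statistics import mean, pstdev

def fit_scaler(rows):
    """Compute per-column mean and standard deviation from training rows."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sds = [pstdev(c) or 1.0 for c in columns]   # guard against constant columns
    return mus, sds

def scale(row, mus, sds):
    """Standardize one row using the training means and deviations."""
    return [(v - m) / s for v, m, s in zip(row, mus, sds)]

def knn_predict(train_X, train_y, x, k):
    """Average the responses of the k training rows closest to x."""
    by_distance = sorted(range(len(train_X)),
                         key=lambda i: math.dist(train_X[i], x))
    nearest = by_distance[:k]
    return sum(train_y[i] for i in nearest) / k

# Hypothetical training data: [salary, age] -> response.
raw_X = [[30000, 25], [60000, 45], [61000, 30], [90000, 50]]
y = [100.0, 200.0, 300.0, 400.0]

mus, sds = fit_scaler(raw_X)
train_X = [scale(row, mus, sds) for row in raw_X]
query = scale([60000, 45], mus, sds)   # scale the test input the same way

pred_k1 = knn_predict(train_X, y, query, k=1)   # closest example only
pred_k4 = knn_predict(train_X, y, query, k=4)   # average of all four
```

Without the scaling step, salary differences in the thousands would swamp age differences in the tens, so age would have almost no influence on which neighbors are chosen.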

    Linear Regression vs. k-Nearest Neighbors

    • Linear Regression: Fits a linear predictive function by optimization. It is interpretable, quick to train, generally low in variance, and scales well, but it struggles with non-linear relationships.
    • k-Nearest Neighbors: Highly flexible and can model complex relationships but is sensitive to the curse of dimensionality and slow for large datasets. It does not assume a functional form.
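
A small sketch of the contrast, under the assumption of a single predictor and a purely quadratic response (toy data, not from the lesson): a least-squares line cannot follow the curve, while a 1-nearest-neighbor lookup reproduces the training responses. The helper `fit_line` is a hypothetical name.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

# Toy non-linear relationship: y = x squared.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]

intercept, slope = fit_line(xs, ys)
line_pred = [intercept + slope * x for x in xs]   # a flat line at the mean of y

# 1-nearest-neighbor prediction at each training input.
one_nn_pred = [ys[min(range(len(xs)), key=lambda i: abs(xs[i] - x))]
               for x in xs]
```

Matching the training responses exactly shows the flexibility of kNN with k = 1, not that it generalizes well; that same flexibility is why its variance is higher than that of the fitted line.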

    Exploratory Data Analysis (EDA)

    • Univariate EDA: Focuses on data errors, missing values, outliers, skewness, kurtosis, multi-modality, and high cardinality.
    • Bivariate EDA: Examines relationships, identifies weak/strong correlations, non-linearity, and outliers.
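
A few of these checks can be sketched with the Python standard library. The function names, the use of `None` to mark missing values, and the moment-based skewness formula are assumptions made for this example, not part of the lesson.

```python
from statistics import mean

def univariate_summary(values):
    """Count missing and distinct values and compute moment-based skewness."""
    present = [v for v in values if v is not None]
    mu = mean(present)
    m2 = mean([(v - mu) ** 2 for v in present])   # second central moment
    m3 = mean([(v - mu) ** 3 for v in present])   # third central moment
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {"missing": len(values) - len(present),
            "distinct": len(set(present)),
            "skewness": skewness}

def pearson(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

summary = univariate_summary([1.0, 2.0, None, 2.0, 3.0])
linear_r = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
curved_r = pearson([-2.0, -1.0, 0.0, 1.0, 2.0], [4.0, 1.0, 0.0, 1.0, 4.0])
```

The last call shows why non-linearity is worth checking separately: the second variable is fully determined by the first, yet the Pearson correlation is zero.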

    Feature Engineering

    • The process of preparing data for learning algorithms, crucial for project success.
    • Includes extracting, constructing, and processing features to optimize algorithm performance.

    Feature Scaling

    • Standardization or scaling can enhance model performance; robust scaling is useful in the presence of outliers.
    • Log transformations can reduce right skewness, bringing skewed distributions closer to symmetric.
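
The sketch below illustrates two of the scaling methods mentioned in this lesson. It assumes max-abs scaling divides by the largest absolute value and robust scaling subtracts the median and divides by the interquartile range; the quartile convention (`method="inclusive"`) and the function names are choices made for this example.

```python
from statistics import median, quantiles

def max_abs_scale(values):
    """Divide by the largest absolute value so results lie in [-1, 1]."""
    largest = max(abs(v) for v in values)
    return [v / largest for v in values] if largest else list(values)

def robust_scale(values):
    """Center on the median and divide by the interquartile range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    med = median(values)
    if iqr == 0:
        return [v - med for v in values]   # avoid dividing by zero
    return [(v - med) / iqr for v in values]

scaled_max_abs = max_abs_scale([-4, 2, 8])
scaled_robust = robust_scale([1, 2, 3, 4, 100])   # 100 is an outlier
```

In the second call the outlier stays far from the rest, but the typical values keep a sensible spread because the median and interquartile range are barely affected by it.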

    Transformations

    • Box-Cox Transformation: A flexible power transformation that reshapes the data distribution according to a transformation parameter λ, with a shift parameter α.
    • Yeo-Johnson Transformation: An extension of Box-Cox that accommodates both positive and negative values.
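
For reference, here is a sketch of both transformations in their commonly stated forms, applied to a single value. The parameter names `lam` and `alpha` stand in for λ and α; the function names are hypothetical, and choosing λ from data is out of scope here.

```python
import math

def box_cox(x, lam, alpha=0.0):
    """Shifted Box-Cox transform; requires x + alpha to be positive."""
    shifted = x + alpha
    if shifted <= 0:
        raise ValueError("Box-Cox needs x + alpha > 0")
    if lam == 0:
        return math.log(shifted)
    return (shifted ** lam - 1) / lam

def yeo_johnson(x, lam):
    """Yeo-Johnson transform; defined for positive and negative x."""
    if x >= 0:
        if lam == 0:
            return math.log1p(x)
        return ((x + 1) ** lam - 1) / lam
    if lam == 2:
        return -math.log1p(-x)
    return -(((-x + 1) ** (2 - lam)) - 1) / (2 - lam)

bc_sqrt = box_cox(4.0, 0.5)        # power 0.5 behaves like a rescaled square root
bc_log = box_cox(0.0, 0, alpha=1)  # the shift makes a zero input usable
yj_pos = yeo_johnson(3.0, 1)       # lam = 1 leaves values unchanged
yj_neg = yeo_johnson(-3.0, 1)
```

Setting `lam` to 0 in `box_cox` recovers the log transformation, which is why Box-Cox is often described as a family that includes it.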

    Handling Zeros and Discrete Predictors

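    • When a variable has many zeros, create a dummy variable that flags them so the zeros can be modeled separately.
    • Discrete predictors may be treated as continuous or categorical, depending on how many distinct values they take; those with many possible values are usually treated as continuous.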
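
One way to picture the dummy-variable idea is shown below. Pairing the zero flag with a log of the positive values is an assumption made for this sketch, as are the function name `split_zero_inflated` and the sample values.

```python
import math

def split_zero_inflated(values):
    """Return a zero-indicator column and a log column for positive values."""
    is_zero = [1 if v == 0 else 0 for v in values]
    # Zeros get a placeholder of 0.0; the indicator tells the model to
    # treat them separately from genuine values whose log happens to be 0.
    log_positive = [math.log(v) if v > 0 else 0.0 for v in values]
    return is_zero, log_positive

# Hypothetical variable with many zeros, such as purchases per visit.
is_zero, log_positive = split_zero_inflated([0, 1, 0, 20, 0])
```

Here the value 1 and the zeros all map to 0.0 in the log column, and only the dummy variable distinguishes them.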

    Categorical Predictors

    • Nominal variables must be encoded numerically for use in machine learning models.
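
One common numeric encoding for nominal variables is one-hot (dummy) encoding, sketched here. The lesson does not name a specific scheme, so this choice, the function name `one_hot`, and the colour values are assumptions.

```python
def one_hot(values):
    """Encode each value as a 0/1 row with one column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

# Hypothetical nominal variable.
categories, rows = one_hot(["red", "green", "red", "blue"])
```

Each row contains a single 1, so no artificial ordering is imposed on the categories, unlike assigning them the integers 1, 2, 3.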

    Quiz Team

    Description

    Test your knowledge on essential machine learning concepts including market segmentation, matrix factorization, anomaly detection, reinforcement learning, and k-nearest neighbors. This quiz covers key principles and applications in a concise format to enhance your understanding of the subject.
