Machine Learning Fundamentals

Questions and Answers

What percentage of values was missing in the categorical features of the AI dataset?

  • 40%
  • 20%
  • 10%
  • 30% (correct)

Which feature scaling method was used for the Titanic dataset?

  • StandardScaler
  • KernelScaler
  • Min-Max Scaler (correct)
  • Normalizer

What is the alpha value used in Ridge Regression for the Titanic dataset?

  • 0.1
  • 0.05
  • 0.5 (correct)
  • 0.01

What is the F1-score for the Logistic Regression model on the AI dataset?

  • 0.82 (correct)

What is the accuracy of the Random Forest model on the Titanic dataset?

  • 0.85 (correct)

Study Notes

Data Preprocessing

  • Handling Missing Values:
    • AI dataset: 30% of values missing in categorical features, filled by K-Nearest Neighbors (KNN) imputation
    • Titanic dataset: 20% of values missing in the age feature, filled with the median
    • BostonHousing dataset: no missing values
  • Feature Scaling:
    • AI dataset: standardized using StandardScaler
    • Titanic dataset: normalized using Min-Max Scaler
    • BostonHousing dataset: standardized using StandardScaler
  • Feature Selection:
    • AI dataset: selected top 10 features using mutual information
    • Titanic dataset: selected top 5 features using recursive feature elimination
    • BostonHousing dataset: selected top 8 features using recursive feature elimination
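
The imputation and scaling steps listed above can be sketched in plain Python. This is a minimal illustration, not the pipeline used for the notes: the ages are hypothetical values (not Titanic data), and the function names impute_median, min_max_scale and standardize are invented for the example. KNN imputation and feature selection are not covered here.

```python
from statistics import mean, median, pstdev


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant feature: no spread to rescale
    return [(v - mu) / sigma for v in values]


# Hypothetical ages with one missing entry (1 of 5, i.e. 20% missing).
ages = [22.0, None, 38.0, 26.0, 35.0]

ages_filled = impute_median(ages)             # missing age becomes the median
ages_scaled = min_max_scale(ages_filled)      # Min-Max scaling to [0, 1]
ages_standardized = standardize(ages_filled)  # zero mean, unit variance
```

Min-Max scaling keeps the original shape of the distribution but is sensitive to outliers, because a single extreme value sets the range. Standardization is less affected by the exact minimum and maximum, which is one reason both appear in the notes.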

Regression Models

  • Simple Linear Regression:
    • AI dataset: R² = 0.75, mean squared error (MSE) = 10.5
    • Titanic dataset: R² = 0.42, MSE = 15.1
    • BostonHousing dataset: R² = 0.85, MSE = 5.2
  • Multiple Linear Regression:
    • AI dataset: R² = 0.85, MSE = 9.2
    • Titanic dataset: R² = 0.50, MSE = 13.5
    • BostonHousing dataset: R² = 0.92, MSE = 4.5
  • Ridge Regression:
    • AI dataset: alpha = 0.1, R² = 0.82, MSE = 10.2
    • Titanic dataset: alpha = 0.5, R² = 0.45, MSE = 14.2
    • BostonHousing dataset: alpha = 0.01, R² = 0.90, MSE = 4.8
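
To make the R², MSE and alpha values above more concrete, here is a small sketch of one-feature linear regression with an optional ridge penalty. It assumes a single input feature, an unpenalized intercept, and made-up data points. The helper names fit_line, mse and r_squared are invented for the example, and the numbers it produces are unrelated to the results reported in the notes.

```python
def fit_line(xs, ys, alpha=0.0):
    """Fit y = slope * x + intercept by least squares.

    alpha > 0 adds a ridge (L2) penalty on the slope only;
    alpha = 0 gives ordinary simple linear regression.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / (s_xx + alpha)  # the penalty shrinks the slope toward 0
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mse(ys, preds):
    """Mean squared error between targets and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Made-up data lying close to the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

ols_slope, ols_intercept = fit_line(xs, ys)
ridge_slope, ridge_intercept = fit_line(xs, ys, alpha=0.5)

ols_preds = [ols_slope * x + ols_intercept for x in xs]
ridge_preds = [ridge_slope * x + ridge_intercept for x in xs]

print("OLS:  ", r_squared(ys, ols_preds), mse(ys, ols_preds))
print("Ridge:", r_squared(ys, ridge_preds), mse(ys, ridge_preds))
```

On the training data the ridge fit can never beat the unpenalized fit on MSE, which is consistent with the ridge rows above scoring slightly below multiple linear regression. The intended benefit of the penalty is better behavior on unseen data.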

Classification Algorithms

  • Logistic Regression:
    • AI dataset: accuracy = 0.80, F1-score = 0.82
    • Titanic dataset: accuracy = 0.75, F1-score = 0.78
  • Decision Trees:
    • AI dataset: accuracy = 0.85, F1-score = 0.88
    • Titanic dataset: accuracy = 0.80, F1-score = 0.82
  • Random Forest:
    • AI dataset: accuracy = 0.90, F1-score = 0.92
    • Titanic dataset: accuracy = 0.85, F1-score = 0.88
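
The accuracy and F1-score figures above come from comparing predicted labels with true labels. The sketch below shows one way to compute them for a binary problem. The labels are invented for illustration and the function names confusion_counts and classification_metrics are hypothetical; the values are unrelated to the model results in the notes.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 as a dictionary."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    # Precision: of the predicted positives, how many were right.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the actual positives, how many were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented labels: 6 positives and 4 negatives, with 2 mistakes.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case the F1-score is higher than the accuracy because positives are the majority class and most of them are predicted correctly. The same pattern (F1 above accuracy) appears in every row of the notes above.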

Questions

  • How do you handle missing values in a dataset?
  • What is the purpose of feature scaling in machine learning?
  • How do you select the most important features in a dataset?
  • What is the difference between simple linear regression and multiple linear regression?
  • How do you regularize a linear regression model?
  • What is the difference between logistic regression and decision trees in classification problems?
  • How do you evaluate the performance of a classification model?
  • What is the role of hyperparameter tuning in machine learning?
  • How do you handle class imbalance in a classification problem?
  • What is the difference between precision and recall in classification evaluation metrics?
