Machine Learning Fundamentals

StrongPoplar

5 Questions

What percentage of missing values were present in the categorical features of the AI dataset?

30%

Which feature scaling method was used for the Titanic dataset?

Min-Max Scaler

What is the alpha value used in Ridge Regression for the Titanic dataset?

0.5

What is the F1-score for the Logistic Regression model on the AI dataset?

0.82

What is the accuracy of the Random Forest model on the Titanic dataset?

0.85

Study Notes

Data Preprocessing

  • Handling Missing Values:
    • AI dataset: 30% missing values in categorical features, imputed with K-Nearest Neighbors (KNN)
    • Titanic dataset: 20% missing values in the age feature, imputed with the median
    • BostonHousing dataset: no missing values
  • Feature Scaling:
    • AI dataset: standardized using StandardScaler
    • Titanic dataset: normalized using Min-Max Scaler
    • BostonHousing dataset: standardized using StandardScaler
  • Feature Selection:
    • AI dataset: selected top 10 features using mutual information
    • Titanic dataset: selected top 5 features using recursive feature elimination
    • BostonHousing dataset: selected top 8 features using recursive feature elimination
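
The imputation and scaling steps listed above can be sketched in plain Python. This is a minimal illustration, not the pipeline used for the notes: the `ages` list is invented, the helper names are hypothetical, and the standardization assumes the population standard deviation (the convention scikit-learn's StandardScaler follows).

```python
from statistics import mean, median, pstdev

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - lo) / span for v in values]

def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Made-up ages with two missing entries (illustration only).
ages = [22.0, None, 38.0, 26.0, None, 35.0]

filled = impute_median(ages)    # missing entries become the median of 22, 26, 35, 38
scaled = min_max_scale(filled)  # values now lie in [0, 1]
zscores = standardize(filled)   # values now have mean 0 and standard deviation 1
```

Min-max scaling keeps the shape of the distribution but is sensitive to outliers, since a single extreme value sets the range. Standardization does not bound the values, which is usually fine for linear models and distance-based methods.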

Regression Models

  • Simple Linear Regression:
    • AI dataset: R² = 0.75, mean squared error (MSE) = 10.5
    • Titanic dataset: R² = 0.42, MSE = 15.1
    • BostonHousing dataset: R² = 0.85, MSE = 5.2
  • Multiple Linear Regression:
    • AI dataset: R² = 0.85, MSE = 9.2
    • Titanic dataset: R² = 0.50, MSE = 13.5
    • BostonHousing dataset: R² = 0.92, MSE = 4.5
  • Ridge Regression:
    • AI dataset: alpha = 0.1, R² = 0.82, MSE = 10.2
    • Titanic dataset: alpha = 0.5, R² = 0.45, MSE = 14.2
    • BostonHousing dataset: alpha = 0.01, R² = 0.90, MSE = 4.8
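
To make the role of alpha and the two reported metrics concrete, here is a small single-feature sketch. It assumes the ridge penalty `alpha * slope**2` is added to the sum of squared errors and that the intercept is not penalized; the data points and function names are invented for illustration and are unrelated to the three datasets above.

```python
def fit_ridge_1d(x, y, alpha=0.0):
    """Fit y ~ slope * x + intercept, penalizing slope**2 by alpha.

    With alpha = 0 this reduces to ordinary least squares.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / (sxx + alpha)  # the penalty shrinks the slope toward 0
    intercept = y_bar - slope * x_bar
    return slope, intercept

def predict(x, slope, intercept):
    return [slope * xi + intercept for xi in x]

def mse(y_true, y_pred):
    """Mean squared error: average squared residual."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """R^2: 1 minus residual sum of squares over total sum of squares."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up data, roughly y = 2x (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ols_slope, ols_intercept = fit_ridge_1d(x, y, alpha=0.0)
ridge_slope, ridge_intercept = fit_ridge_1d(x, y, alpha=0.5)

ols_pred = predict(x, ols_slope, ols_intercept)
ridge_pred = predict(x, ridge_slope, ridge_intercept)

print("OLS:  ", mse(y, ols_pred), r_squared(y, ols_pred))
print("Ridge:", mse(y, ridge_pred), r_squared(y, ridge_pred))
```

On the training data, ridge can never beat ordinary least squares on MSE or R², because least squares minimizes exactly that error. The benefit of regularization, if any, shows up on held-out data, which is why alpha is normally chosen by validation.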

Classification Algorithms

  • Logistic Regression:
    • AI dataset: accuracy = 0.80, F1-score = 0.82
    • Titanic dataset: accuracy = 0.75, F1-score = 0.78
  • Decision Trees:
    • AI dataset: accuracy = 0.85, F1-score = 0.88
    • Titanic dataset: accuracy = 0.80, F1-score = 0.82
  • Random Forest:
    • AI dataset: accuracy = 0.90, F1-score = 0.92
    • Titanic dataset: accuracy = 0.85, F1-score = 0.88
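
Accuracy and F1-score, the two metrics reported above, both derive from the confusion-matrix counts of a binary classifier. The sketch below shows the arithmetic; the counts are invented for illustration and do not correspond to any of the models listed.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: of everything predicted positive, how much was correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything actually positive, how much was found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts for 100 test examples (illustration only).
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 ignores true negatives while accuracy counts them, the two can diverge in either direction, and the gap tends to widen when the classes are imbalanced.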

Questions

  • How do you handle missing values in a dataset?
  • What is the purpose of feature scaling in machine learning?
  • How do you select the most important features in a dataset?
  • What is the difference between simple linear regression and multiple linear regression?
  • How do you regularize a linear regression model?
  • What is the difference between logistic regression and decision trees in classification problems?
  • How do you evaluate the performance of a classification model?
  • What is the role of hyperparameter tuning in machine learning?
  • How do you handle class imbalance in a classification problem?
  • What is the difference between precision and recall in classification evaluation metrics?

Test your knowledge of machine learning concepts, including data preprocessing, regression models, classification algorithms, and model evaluation metrics. Topics covered include handling missing values, feature scaling, and hyperparameter tuning.
