Feature Engineering Techniques PDF

Document Details

Uploaded by MeritoriousDoppelganger

Northeastern University

Dr. Lương Văn Thiện

Tags

feature engineering, machine learning, data preprocessing, data analysis

Summary

This document provides an overview of various feature engineering techniques in machine learning. It covers topics such as data imputation, normalization, one-hot encoding, time series analysis, NLP, outlier handling, feature selection, and dimensionality reduction. The document also includes examples and visualizations.

Full Transcript

Slide 1: Feature Engineering Techniques
Dr. Lương Văn Thiện, Business AI Lab, NEU
www.tvluong.wordpress.com

Slide 2: Feature Engineering
- Data imputation
- Data normalization
- One-hot encoding
- Feature engineering in time-series, NLP
- Data dimensionality reduction

Slide 3: General model for machine learning problems

Slide 4: Data imputation
- Next or Previous Value
- K Nearest Neighbors
- Maximum or Minimum Value
- Missing Value Prediction
- Most Frequent Value
- Average or Linear Interpolation
- (Rounded) Mean or Moving Average or Median Value
- Fixed Value
https://towardsdatascience.com/6-different-ways-to-compensate-for-missing-values-data-imputation-with-examples-6022d9ca0779

Slide 5: Data normalization

Slide 6: Data normalization

Slide 7: Data normalization
- One-hot encoding
- Log transform

Slide 8: Handling Outliers
- Outlier detection
- Remove outliers
- Transform outliers: log transform
- Impute outliers: mean, median, mode, nearest neighbor

Slide 9: Feature Engineering in Time Series

Slide 10: Feature Engineering in Time Series
- Seasonal-Trend decomposition using LOESS (STL)

Slide 11: Feature Engineering in Time Series
- STL decomposition
https://timeseriesreasoning.com/contents/time-series-decomposition/

Slide 12: Feature engineering in NLP
- Bag of words
- Term Frequency-Inverse Document Frequency (TF-IDF)
- Word2vec

Slide 13: Syllable frequency in Truyện Kiều (The Tale of Kiều)
https://machinelearningcoban.com/general/2017/02/06/featureengineering/

Slide 14: Feature selection

Slide 15: Data dimensionality reduction
- PCA
- LDA
- Autoencoder
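The slides name the techniques without showing how they are computed. As a rough illustration only, the following Python sketch covers four of the listed items: median imputation, min-max normalization, one-hot encoding, and the log transform. The function names, the sample data, and the choice of min-max scaling and log(1 + x) are assumptions made for this example and are not taken from the slides.

```python
import math
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Return the sorted categories and one 0/1 vector per label."""
    categories = sorted(set(labels))
    vectors = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, vectors


def log_transform(values):
    """Apply log(1 + x) to compress large non-negative values."""
    return [math.log1p(v) for v in values]


# Hypothetical sample data, for illustration only.
ages = [20, None, 40, 30]
filled = impute_median(ages)      # missing entry becomes the median, 30
scaled = min_max_scale(filled)    # values now lie between 0 and 1
categories, encoded = one_hot(["red", "blue", "red"])
incomes = log_transform([0, 9, 99])
```

In practice these steps are usually done with a library such as pandas or scikit-learn; the plain-Python version is used here only to keep the arithmetic visible.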