ITBAN 3 - Fundamentals of Analytics Modelling: Data Preprocessing

InexpensiveOnyx2032

28 Questions

Data preprocessing is not necessary before data analysis.

False

Incomplete data with lacking attribute values does not affect data quality.

False

Noise in data refers to errors or outliers.

True

Data inconsistency is not a concern in data preprocessing.

False

Quality decisions can be made based on poor quality data.

False

Data preprocessing involves tasks such as filling in missing values and identifying outliers.

True

Data transformation in preprocessing involves randomizing the data for better analysis results.

False
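
Transformation reshapes values deterministically rather than randomizing them. One common transformation is min-max normalization, which rescales an attribute into a fixed range. The sketch below is only an illustration: the function name `min_max_normalize` and the sample values are assumptions, not part of the quiz notes.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to rescale.
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


# Hypothetical sample attribute values.
ages = [10, 20, 30, 50]
normalized = min_max_normalize(ages)
print(normalized)
```

The order of the values is preserved; only the scale changes.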

Data reduction in preprocessing aims to increase the volume of data for more accurate analytical results.

False

Missing data in a dataset can occur due to equipment malfunction or intentional removal of valuable information.

True

Data preprocessing may involve inferring missing data based on the available information.

True

Handling missing data is not a crucial step in the data preprocessing process.

False

Equal-width partitioning divides the range into N intervals of different sizes.

False
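
Equal-width partitioning splits the range into N intervals of the same size, with width (max - min) / N. A minimal sketch follows, assuming the values are not all identical; the function name `equal_width_bins` and the sample prices are illustrative choices, not taken from the quiz.

```python
def equal_width_bins(values, n):
    """Assign each value to one of n equal-width intervals covering [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n  # every interval has the same width
    bins = [[] for _ in range(n)]
    for v in values:
        # The maximum value would compute to index n, so clamp it into the last bin.
        idx = min(int((v - lo) / width), n - 1)
        bins[idx].append(v)
    return width, bins


# Hypothetical sample data.
prices = [4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34]
width, bins = equal_width_bins(prices, 3)
print(width)
print(bins)
```

The intervals have equal width, but the number of values in each one can differ.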

Equal-depth partitioning divides the range into N intervals with different sample quantities.

False
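
Equal-depth (equal-frequency) partitioning instead gives each interval approximately the same number of samples. The sketch below is one possible way to do it; the name `equal_depth_bins`, the handling of leftover values, and the sample prices are assumptions made for illustration.

```python
def equal_depth_bins(values, n):
    """Split the sorted values into n bins holding (nearly) the same number of samples."""
    ordered = sorted(values)
    size, extra = divmod(len(ordered), n)
    bins, start = [], 0
    for i in range(n):
        # When the count is not divisible by n, the first `extra` bins take one more value.
        end = start + size + (1 if i < extra else 0)
        bins.append(ordered[start:end])
        start = end
    return bins


# Hypothetical sample data.
prices = [4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34]
depth_bins = equal_depth_bins(prices, 3)
print(depth_bins)
```

Here every bin holds four values, while the value ranges the bins cover differ in width.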

Binning methods for data smoothing involve sorting data only.

False
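
Binning goes beyond sorting: the sorted data is partitioned into bins, and each value is then smoothed, for example by replacing it with its bin mean. A rough sketch of smoothing by bin means follows, assuming equal-depth bins; the function name and the sample data are hypothetical.

```python
def smooth_by_bin_means(values, n):
    """Sort, partition into n equal-depth bins, then replace each value by its bin mean."""
    ordered = sorted(values)                 # step 1: sort
    size = len(ordered) // n
    smoothed = []
    for i in range(n):
        # step 2: partition; the last bin absorbs any leftover values
        chunk = ordered[i * size:] if i == n - 1 else ordered[i * size:(i + 1) * size]
        bin_mean = sum(chunk) / len(chunk)   # step 3: smooth by the bin mean
        smoothed.extend([bin_mean] * len(chunk))
    return smoothed


# Hypothetical sample data.
prices = [4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34]
smoothed = smooth_by_bin_means(prices, 3)
print(smoothed)
```

Smoothing by bin medians or bin boundaries would follow the same three steps with a different replacement rule.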

Linear regression is used to fit data into regression functions.

True
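
As a smoothing technique, regression fits the data to a function and replaces noisy values with fitted ones. The sketch below fits a straight line by ordinary least squares in plain Python; the names `fit_line`, `xs`, `ys` and the data points are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical noisy measurements that roughly follow a line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]
slope, intercept = fit_line(xs, ys)

# Smoothed values: each y is replaced by the point on the fitted line.
fitted = [slope * x + intercept for x in xs]
print(slope, intercept)
```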

Cluster analysis involves detecting and removing outliers.

False

Semi-automated methods combine computer and human inspection to only detect suspicious values.

False

Ignoring the tuple is always an effective method when the class label is missing in a classification task.

False

Filling in missing values manually is a quick and efficient process.

False

Using a global constant like 'unknown' to fill in missing values introduces a new class.

True

Filling in missing values with the attribute mean improves data quality.

True

Using the attribute mean for all samples of the same class to fill in missing values is not a smarter approach.

False
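
The three automatic fill-in strategies from the preceding questions (a global constant, the attribute mean, and the attribute mean within the same class) can be compared side by side. This is a minimal sketch that assumes records are dictionaries with `None` marking a missing value; the function names, the `income` and `class` fields, and the sample records are all hypothetical.

```python
def _mean(nums):
    return sum(nums) / len(nums)


def fill_with_constant(rows, attr, constant="unknown"):
    """Replace missing values with a global constant (which acts like a new category)."""
    return [{**r, attr: constant if r[attr] is None else r[attr]} for r in rows]


def fill_with_mean(rows, attr):
    """Replace missing values with the mean of all observed values of the attribute."""
    overall = _mean([r[attr] for r in rows if r[attr] is not None])
    return [{**r, attr: overall if r[attr] is None else r[attr]} for r in rows]


def fill_with_class_mean(rows, attr, label):
    """Replace missing values with the mean observed within the same class."""
    by_class = {}
    for r in rows:
        if r[attr] is not None:
            by_class.setdefault(r[label], []).append(r[attr])
    means = {c: _mean(v) for c, v in by_class.items()}
    # Rows whose class has no observed value at all stay missing (None).
    return [{**r, attr: means.get(r[label]) if r[attr] is None else r[attr]} for r in rows]


# Hypothetical sample records.
rows = [
    {"income": 30, "class": "low"},
    {"income": None, "class": "low"},
    {"income": 50, "class": "low"},
    {"income": 100, "class": "high"},
    {"income": None, "class": "high"},
    {"income": 120, "class": "high"},
]

by_constant = fill_with_constant(rows, "income")
by_mean = fill_with_mean(rows, "income")
by_class_mean = fill_with_class_mean(rows, "income", "class")
print([r["income"] for r in by_mean])
print([r["income"] for r in by_class_mean])
```

In this sample the overall mean fills both gaps with the same value, while the class-based mean fills each gap with a value typical of its own class.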

Noise in data is usually caused by consistent and accurate measurements.

False

Data cleaning is one of the steps involved in preprocessing the data.

True

Data integration involves combining data from a single source into a coherent store.

False

Schema integration in data preprocessing refers to resolving conflicts between different data types.

False

Detecting and resolving data value conflicts is a part of the data integration process.

True

Incomplete data with lacking attribute values does not impact data quality during preprocessing.

False

Test your knowledge on the fundamentals of data preprocessing in analytics modelling based on notes by Jiawei Han and Micheline Kamber. The quiz covers topics such as data cleaning, integration, transformation, reduction, discretization, and concept hierarchy generation.
