Questions and Answers
Data preprocessing helps in dealing with ______ data in the real world.
dirty
Data that has missing attribute values, lacks certain attributes of interest, or contains only aggregate data is called ______ data.
incomplete
No quality data, no quality mining results. Quality decisions must be based on ______ data.
quality
Data preparation, cleaning, and transformation make up the majority of the work in a data mining application, approximately ______%.
A well-accepted multi-dimensional view of data quality includes measures like accuracy, completeness, consistency, timeliness, believability, value added, ______, and accessibility.
Data that contains discrepancies, such as inconsistencies in codes or names, is called ______ data.
Data preprocessing involves four major tasks: data cleaning, data integration, data transformation (normalization and aggregation), and ______ reduction.
Data cleaning includes filling in missing values, smoothing noisy data, identifying or removing outliers, and resolving ______.
Data preprocessing aims to obtain a representation that is reduced in volume but produces the same or similar analytical results. This process is known as data ______.
One of the tasks in data preprocessing is combining multiple databases or files, which is known as data ______.
One method of handling missing data is to fill in every missing value with the same constant, such as 'unknown', which is known as a ______ constant.
Another method of handling missing data is to fill in missing values with the attribute ______.
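The two fill-in strategies from the questions above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names and the sample `incomes` list are hypothetical, missing entries are assumed to be represented as `None`, and using the mean as the attribute statistic is an assumption made for the example.

```python
from statistics import mean


def fill_with_constant(values, constant="unknown"):
    """Replace every missing entry (None) with one shared constant."""
    return [constant if v is None else v for v in values]


def fill_with_attribute_mean(values):
    """Replace every missing entry (None) with the mean of the observed values.

    Assumption: the attribute is numeric and at least one value is present.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to average")
    attribute_mean = mean(observed)
    return [attribute_mean if v is None else v for v in values]


# Hypothetical sample attribute with two missing entries.
incomes = [30, None, 50, 40, None]
filled_constant = fill_with_constant(incomes)
filled_mean = fill_with_attribute_mean(incomes)
```

A shared constant keeps the record but tells a mining algorithm nothing about the attribute, whereas a statistic such as the mean keeps the filled value inside the observed range.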
Random error or variance in a measured variable is known as ______.
Problems such as duplicate records, incomplete data, and inconsistent data require data ______.
One method of handling noisy data is binning, which first sorts the data and then partitions it into equi-depth (equal-frequency) ______.
In the binning method for data smoothing, one can smooth by bin means, by bin medians, or by bin ______.
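The binning steps named in the last two questions (sort, partition into equi-depth groups, then replace each value with a statistic of its group) can be sketched as follows. The helper names and the sample `prices` list are hypothetical, and only smoothing by means and by medians is shown.

```python
from statistics import mean, median


def equi_depth_partition(values, n_groups):
    """Sort the values and split them into n_groups groups of (nearly) equal size."""
    if n_groups < 1 or n_groups > len(values):
        raise ValueError("n_groups must be between 1 and len(values)")
    ordered = sorted(values)
    size, extra = divmod(len(ordered), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        # The first `extra` groups take one additional value each.
        end = start + size + (1 if i < extra else 0)
        groups.append(ordered[start:end])
        start = end
    return groups


def smooth(groups, statistic):
    """Replace every value in a group with that group's statistic."""
    return [[statistic(g)] * len(g) for g in groups]


# Hypothetical sample data.
prices = [4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34]
groups = equi_depth_partition(prices, 3)
smoothed_by_mean = smooth(groups, mean)
smoothed_by_median = smooth(groups, median)
```

Because each group holds the same number of values, dense regions of the data get narrow groups and sparse regions get wide ones, which is the main difference from equal-width partitioning.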