ITBAN 3 – Fundamentals of Analytics Modelling Data Preprocessing Quiz

Created by
@StatelyRational

Questions and Answers

What is one of the major tasks in data preprocessing?

Data cleaning

Which task involves scaling data to a specific range during data preprocessing?

Data transformation
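
As a rough illustration of the scaling step mentioned above, here is a minimal min-max normalization sketch. The function name `min_max_scale` and the sample incomes are hypothetical and not part of the quiz material; other scaling schemes (such as z-score) are equally valid forms of data transformation.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values into the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to new_min to avoid dividing by zero.
        return [new_min for _ in values]
    span = hi - lo
    return [new_min + (v - lo) * (new_max - new_min) / span for v in values]


incomes = [12000, 73600, 98000]  # hypothetical sample values
scaled = min_max_scale(incomes)
print(scaled)
```

The smallest value maps to the lower end of the target range and the largest to the upper end; everything else falls proportionally in between.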

Why might missing data occur in a dataset?

Equipment malfunction

Which preprocessing task aims to reduce the volume of data while maintaining analytical results?

Data reduction

What is a common reason for missing data in sales records according to the text?

Misunderstanding of data importance

Which step of data preprocessing involves integrating multiple databases or files?

Data integration

Which of the following is the most effective way to handle a missing class label in a classification task?

Ignore the tuple

What is the definition of noise in the context of data?

Random error in a measured variable

Which of the following is not a common cause of incorrect attribute values in data?

Inconsistency in variable naming

What is the purpose of the binning method in handling noisy data?

To smooth the data by replacing values with bin means or medians

Which of the following is a more sophisticated approach to filling in missing values compared to using a global constant or attribute mean?

Using the most probable value based on inference
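
To make the contrast concrete, the sketch below fills one missing income value two ways: with the attribute mean, and with a value inferred from another attribute. It uses a simple least-squares line as a stand-in for inference; this is only one possible choice (decision trees or Bayesian methods are others). The `fit_line` helper and the age/income rows are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical (age, income) rows; the last income is missing.
rows = [(25, 30000.0), (35, 40000.0), (45, 50000.0), (30, None)]
complete = [(age, income) for age, income in rows if income is not None]

# Simple approach: every missing income gets the attribute mean.
mean_fill = mean(income for _, income in complete)

# Inference-based approach: predict the missing income from age.
slope, intercept = fit_line([a for a, _ in complete], [i for _, i in complete])
filled = [(a, i if i is not None else slope * a + intercept) for a, i in rows]

print(mean_fill, filled[-1])
```

The mean ignores everything else known about the record, while the inferred value uses the relationship between age and income in the complete rows.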

Which of the following is not a common data quality issue that requires data cleaning?

Unstructured text data

What is the main purpose of using external references for manual correction in data preprocessing?

To correct redundant data

What is the primary goal of data integration?

Schema integration

What is the entity identification problem in data integration?

Identifying real world entities from multiple data sources

Why might attribute values for the same real world entity be different in data integration?

Due to different scales used in the data

What is the purpose of resolving data value conflicts in data integration?

To ensure consistency when attribute values differ for the same entity
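
One common source of such conflicts is a difference in units. The sketch below converts weights from two hypothetical sources to a common unit before comparing them; the `to_kg` helper, the records, and the tolerance are illustrative assumptions rather than anything prescribed by the quiz.

```python
LB_TO_KG = 0.45359237  # exact definition of the international pound in kilograms


def to_kg(value, unit):
    """Convert a weight to kilograms; only 'kg' and 'lb' are handled in this sketch."""
    if unit == "kg":
        return value
    if unit == "lb":
        return value * LB_TO_KG
    raise ValueError(f"unsupported unit: {unit}")


# The same real-world entity as recorded by two hypothetical sources.
source_a = {"id": 17, "weight": 70.0, "unit": "kg"}
source_b = {"id": 17, "weight": 154.32, "unit": "lb"}

weight_a = to_kg(source_a["weight"], source_a["unit"])
weight_b = to_kg(source_b["weight"], source_b["unit"])

# After conversion the two values agree within a small tolerance.
consistent = abs(weight_a - weight_b) < 0.01
print(weight_a, weight_b, consistent)
```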

What is the purpose of clustering in data preprocessing?

Detect and remove outliers

Which method divides the range into N intervals of equal size in data discretization?

Equal-width partitioning

What issue might arise when using equal-width partitioning for discretization?

Outliers dominating presentation
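
Equal-width partitioning and its sensitivity to outliers can be sketched as follows. The function name `equal_width_bins` and the price list are illustrative assumptions, and closing the last interval on the right is a convention chosen for this sketch.

```python
def equal_width_bins(values, n_bins):
    """Split the range [min, max] into n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything goes in the first bin
        else:
            # The maximum would land at index n_bins, so clamp it into the last bin.
            idx = min(int((v - lo) / width), n_bins - 1)
        bins[idx].append(v)
    return bins


prices = [4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34]  # illustrative values
print(equal_width_bins(prices, 3))

# A single outlier stretches the range so that nearly all values share one bin.
with_outlier = equal_width_bins(prices + [1000], 3)
print([len(b) for b in with_outlier])
```

With the outlier added, each interval is 332 units wide, so the twelve original prices all fall into the first bin and the middle bin is empty.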

Which method involves dividing the range into intervals with approximately the same number of samples?

Equal-depth partitioning
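
A minimal sketch of equal-depth (equal-frequency) partitioning, assuming the values are sorted first and split by position. The function name `equal_depth_bins` and the sample prices are hypothetical.

```python
def equal_depth_bins(values, n_bins):
    """Split sorted values into n_bins groups holding roughly the same number of samples."""
    ordered = sorted(values)
    n = len(ordered)
    # Bin k covers positions [k*n // n_bins, (k+1)*n // n_bins),
    # so bin sizes differ by at most one.
    return [ordered[k * n // n_bins:(k + 1) * n // n_bins] for k in range(n_bins)]


prices = [4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34]  # illustrative values
depth_bins = equal_depth_bins(prices, 3)
print(depth_bins)
```

Because the split is by count rather than by value range, an extreme value only affects the bin it lands in.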

What does smoothing by bin means entail?

Finding the average value in each bin
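
Smoothing by bin means replaces every value in a bin with that bin's average. A small sketch, assuming the bins have already been formed (here, hypothetical equal-depth bins of four values each):

```python
from statistics import mean


def smooth_by_bin_means(bins):
    """Replace each value with the mean of the bin it belongs to."""
    return [[mean(b)] * len(b) for b in bins]


bins = [[4, 8, 9, 15], [21, 21, 24, 25], [26, 28, 29, 34]]  # illustrative bins
smoothed_means = smooth_by_bin_means(bins)
print(smoothed_means)
```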

In what scenario would smoothing by bin boundaries be preferred?

Maintaining original range information
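
Smoothing by bin boundaries replaces each value with whichever bin boundary (minimum or maximum) is closer, so each bin's original range survives. A sketch under the assumption that ties go to the lower boundary; the function name and bins are hypothetical.

```python
def smooth_by_bin_boundaries(bins):
    """Replace each value with the nearer of its bin's minimum and maximum."""
    smoothed = []
    for b in bins:
        lo, hi = min(b), max(b)
        # Ties go to the lower boundary; that tie rule is an assumption of this sketch.
        smoothed.append([lo if v - lo <= hi - v else hi for v in b])
    return smoothed


bins = [[4, 8, 9, 15], [21, 21, 24, 25], [26, 28, 29, 34]]  # illustrative bins
smoothed_bounds = smooth_by_bin_boundaries(bins)
print(smoothed_bounds)
```

Unlike bin means, the smoothed output still contains the smallest and largest value of every bin.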

What is the primary reason for data preprocessing?

To improve the quality of the data

Which of the following is NOT a key aspect of data quality according to the passage?

Scalability

What is the relationship between data quality and mining results according to the passage?

Data quality and mining results are directly related

Which of the following is an example of a data quality issue mentioned in the passage?

Data is incomplete

What is the primary purpose of a data warehouse according to the passage?

To integrate and consolidate quality data
