CIS 4093 Chapter 5 Flashcards

Questions and Answers

In the Cabela's case study, the SAS/Teradata solution enabled the direct marketer to better identify likely customers and market to them based mostly on external data sources.

False

The cost of data storage has plummeted recently, making data mining feasible for more firms.

True

Data mining can be very useful in detecting patterns such as credit card fraud, but is of little help in improving sales.

False

The entire focus of the predictive analytics system in the Infinity P&C case was on detecting and handling fraudulent claims for the company's benefit.

False

If using a mining analogy, 'knowledge mining' would be a more appropriate term than 'data mining.'

True

Data mining requires specialized data analysts to ask ad hoc questions and obtain answers quickly from the system.

False

Ratio data is a type of categorical data.

False

Interval data is a type of numerical data.

True

In the Memphis Police Department case study, predictive analytics helped to identify the best schedule for officers in order to pay the least overtime.

False

In data mining, classification models help in prediction.

True

Statistics and data mining both look for data sets that are as large as possible.

False

Using data mining on data about imports and exports can help to detect tax avoidance and money laundering.

True

In the cancer research case study, data mining algorithms that predict cancer survivability with high predictive power are good replacements for medical professionals.

False

During classification in data mining, a false positive is an occurrence classified as true by the algorithm while being false in reality.

True

When training a data mining model, the testing dataset is always larger than the training dataset.

False

When a problem has many attributes that impact the classification of different patterns, decision trees may be a useful approach.

True

In the 2degrees case study, the main effectiveness of the new analytics system was in dissuading potential churners from leaving the company.

True

Market basket analysis is a useful and entertaining way to explain data mining to a technologically less savvy audience, but it has little business significance.

False

The number of users of free/open source data mining software now exceeds that of users of commercial software versions.

True

Data that is collected, stored, and analyzed in data mining is often private and personal. There is no way to maintain individuals' privacy other than being very careful about physical data security.

False

In the Cabela's case study, what types of models helped the company understand the value of customers, using a five-point scale?

Clustering and association models

Understanding customers better has helped Amazon and others become more successful. The understanding comes primarily from

Analyzing the vast data amounts routinely collected.

All of the following statements about data mining are true EXCEPT

The process aspect means that data mining should be a one-step process to results.

What is the main reason parallel processing is sometimes used for data mining?

Because of the massive data amounts and search efforts involved.

The data field 'ethnic group' can be best described as

Nominal data.

The data field 'salary' can be best described as

Ratio data.

Which broad area of data mining applications analyzes data, forming rules to distinguish between defined classes?

Classification

Which broad area of data mining applications partitions a collection of objects into natural groupings with similar features?

Clustering

The data mining algorithm type used for classification somewhat resembling the biological neural networks in the human brain is

Artificial neural networks.

Identifying and preventing incorrect claim payments and fraudulent activities falls under which type of data mining applications?

Insurance

All of the following statements about data mining are true EXCEPT

Building the model takes the most time and effort.

Which data mining process/methodology is thought to be the most comprehensive, according to kdnuggets.com rankings?

CRISP-DM

Prediction problems where the variables have numeric values are most accurately defined as

Regressions.

What does the robustness of a data mining method refer to?

Its ability to overcome noisy data to make somewhat accurate predictions.

What does the scalability of a data mining method refer to?

Its ability to construct a prediction model efficiently given a large amount of data.

In estimating the accuracy of data mining (or other) classification models, the true positive rate is

The ratio of correctly classified positives divided by the total positive count.
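
As a rough illustration of this card, the sketch below counts the four outcomes of a binary classifier and divides the correctly classified positives by the total positive count. The function names and the sample labels are made up for illustration; they are not taken from the course material.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for paired boolean labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p and a:
            tp += 1      # classified positive, really positive
        elif p and not a:
            fp += 1      # classified positive, really negative (false positive)
        elif a:
            fn += 1      # classified negative, really positive
        else:
            tn += 1      # classified negative, really negative
    return tp, fp, tn, fn


def true_positive_rate(actual, predicted):
    """Correctly classified positives divided by the total positive count."""
    tp, _, _, fn = confusion_counts(actual, predicted)
    if tp + fn == 0:
        raise ValueError("no actual positives in the data")
    return tp / (tp + fn)


# Hypothetical labels: 4 actual positives, 3 of them classified correctly.
actual = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False]

if __name__ == "__main__":
    print(confusion_counts(actual, predicted))
    print(true_positive_rate(actual, predicted))
```

The single `True` prediction on an actual `False` case is the false positive described in the earlier card.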

In data mining, finding an affinity of two products to be commonly together in a shopping cart is known as

Association rule mining.

Third party providers of publicly available datasets protect the anonymity of the individuals in the data set primarily by

Removing identifiers such as names and social security numbers.

In the Target case study, why did Target send maternity ads to a teen?

Target's analytic model suggested she was pregnant based on her buying habits.

Which of the following is a data mining myth?

Data mining requires a separate, dedicated database.

Study Notes

Data Mining and Predictive Analytics

  • Cabela's SAS/Teradata solution helped the direct marketer better identify likely customers; the claim that it did so based mostly on external data sources is false.
  • The cost of data storage has plummeted, making data mining feasible for more firms.
  • Data mining detects patterns such as credit card fraud and can also help improve sales.
  • In the Infinity P&C case, the predictive analytics system was not focused entirely on detecting and handling fraudulent claims.

Key Concepts in Data Mining

  • "Knowledge mining" is a more appropriate term than "data mining," highlighting the focus on extracting useful information.
  • Data mining does not necessarily require specialized data analysts for ad hoc inquiries.
  • Ratio data is numerical, not categorical.
  • Interval data is classified as numerical data, which aids in various analytical processes.
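
The grouping in these notes can be written down as a small lookup table. This is only a sketch: 'ethnic group' (nominal) and 'salary' (ratio) come from the flashcards, while the ordinal and interval example fields are illustrative assumptions, as is the placement of ordinal data under the categorical family.

```python
# Measurement scale -> broad data family, following the notes above.
# "ordinal" is not mentioned in the cards; it is included as an assumption.
SCALE_FAMILY = {
    "nominal": "categorical",
    "ordinal": "categorical",
    "interval": "numerical",
    "ratio": "numerical",
}

# Example fields. Only "ethnic group" and "salary" come from the flashcards;
# the other two are hypothetical.
FIELD_SCALE = {
    "ethnic group": "nominal",
    "credit rating (low/medium/high)": "ordinal",
    "temperature in Celsius": "interval",
    "salary": "ratio",
}


def family_of(field):
    """Return 'categorical' or 'numerical' for a known example field."""
    return SCALE_FAMILY[FIELD_SCALE[field]]


if __name__ == "__main__":
    for name, scale in FIELD_SCALE.items():
        print(f"{name}: {scale} ({family_of(name)})")
```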

Applications and Effectiveness of Predictive Analytics

  • In the Memphis Police Department case, it is false that predictive analytics was used to find the officer schedule that pays the least overtime.
  • Classification models help in prediction.
  • Statistics and data mining do not both look for data sets that are as large as possible.
  • Mining import/export data can help detect tax avoidance and money laundering.
  • Careful physical data security is not the only way to maintain individuals' privacy when personal data is collected, stored, and analyzed.

Models and Methodologies

  • Cabela's applied clustering and association models to evaluate customer value using a five-point system.
  • Amazon's success is closely tied to analyzing extensive data sets on customer behavior.
  • Effective data mining involves multiple steps including understanding business needs and relevant data variables.

Data Mining Processes and Techniques

  • CRISP-DM is thought to be the most comprehensive data mining methodology, according to kdnuggets.com rankings.
  • Prediction problems in which the variables have numeric values are regressions.
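
To make the regression point concrete, here is a minimal sketch of predicting a numeric value with a straight line fitted by ordinary least squares. It is one simple form of regression chosen for illustration, not a method named in the chapter, and the experience/salary figures are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Apply a fitted (intercept, slope) pair to a new x value."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical data: years of experience vs. salary in thousands.
years = [1, 2, 3, 4, 5]
salary = [40, 45, 50, 55, 60]
model = fit_line(years, salary)

if __name__ == "__main__":
    print(model)
    print(predict(model, 6))
```

The target here is a number rather than a class label, which is what separates this kind of prediction from classification.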

Robustness and Scalability

  • A robust data mining method can overcome noisy data and still make somewhat accurate predictions.
  • Scalability is a method's ability to construct a prediction model efficiently given a large amount of data.

Classification Accuracy and Analysis

  • True positive rate is defined as the ratio of correctly classified positives to total positive counts, crucial for evaluating model performance.
  • Association rule mining identifies product affinities found in consumer purchasing patterns.
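
Product affinity of the kind described above is commonly measured with support and confidence. Those two measures are not named in the flashcards, so treat the sketch below as an assumption about how the affinity might be quantified; the baskets are invented.

```python
# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]


def count_containing(itemset, baskets):
    """Number of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for basket in baskets if itemset <= basket)


def support(itemset, baskets):
    """Fraction of all baskets that contain the itemset."""
    return count_containing(itemset, baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets with the antecedent, the fraction that also hold the consequent."""
    both = count_containing(set(antecedent) | set(consequent), baskets)
    base = count_containing(antecedent, baskets)
    if base == 0:
        raise ValueError("antecedent never appears in the baskets")
    return both / base


if __name__ == "__main__":
    print(support({"bread", "milk"}, baskets))
    print(confidence({"bread"}, {"milk"}, baskets))
```

A rule such as "bread implies milk" would be read from these numbers: how often the pair appears at all, and how often milk accompanies bread when bread is bought.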

Data Privacy and Ethics in Data Mining

  • Third-party providers of publicly available datasets protect individuals' anonymity primarily by removing identifiers such as names and social security numbers.
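
The removal step can be sketched as dropping identifier fields from each record before release. The field names and the sample record are hypothetical, and the sketch covers only the removal of direct identifiers mentioned in the notes.

```python
# Fields treated as direct identifiers. These names are assumptions for
# illustration; a real dataset would define its own list.
IDENTIFIER_FIELDS = frozenset({"name", "ssn"})


def strip_identifiers(record, identifier_fields=IDENTIFIER_FIELDS):
    """Return a copy of the record without the identifier fields."""
    return {key: value for key, value in record.items()
            if key not in identifier_fields}


# Hypothetical record with placeholder values.
record = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "38103",
    "purchase_total": 42.5,
}

if __name__ == "__main__":
    print(strip_identifiers(record))
```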

Notable Case Studies

  • Target utilized a predictive model indicating a teen's pregnancy based on her buying patterns, leading to targeted advertising.
  • The belief that data mining requires a separate, dedicated database is a myth.

General Misconceptions

  • Data mining is a deliberate, multistep process, not a one-step route to results.

Quiz Team

Description

Test your knowledge with these flashcards from CIS 4093 Chapter 5. This quiz covers key concepts related to data mining, the impact of recent technological advances, and applications in marketing strategies. Challenge yourself and see how well you understand the material.
