Predictive Analytics: Big Data in Data Mining (PDF)

Document Details

Uploaded by CooperativeUnity5548

DURSUN DELEN

Tags

big data, data mining, predictive analytics

Summary

This document is a presentation on predictive analytics, focusing on applications to big data. It explores multiple facets of big data, analytics in political campaigns, and the skills needed by a data scientist. This material is intended for instructors and students.

Full Transcript

Chapter 8 Big Data for Predictive Analytics

Predictive Analytics: Data Mining, Machine Learning & Data Science for Practitioners

The Vs That Define Big Data

• Is Big Data a misnomer?
• The reality: Big Data is more than just “big”
• The words that start with “V” to define Big Data
  - Volume
  - Variety
  - Velocity
  - Veracity
  - Variability
  - Value
  - …

“Volume” of Big Data

• Hadron collider: 1 PB/sec
• Boeing jet: 20 TB/hr
• Facebook: 500 TB/day
• YouTube: 1 TB/4 min
• …

Fundamental Concepts of Big Data

• Big Data by itself, regardless of its size, type, or speed, is worthless
• Big Data + “big” analytics = value
• Along with the value proposition, Big Data also brought about big challenges
  - Effectively and efficiently capturing, storing, and analyzing Big Data
  - A new breed of technologies is needed (developed, purchased, hired, or outsourced …)
  - A new type of talent/skills is needed (under the title of “Data Scientist”)

When to Consider Big Data?

• You can’t process the amount of data that you want because of the limitations of your current platform
• You can’t include new data sources (e.g., social media, RFID, sensor, Web, GPS, textual data) because they do not comply with the data storage schema
• You need to (or want to) integrate data as quickly as possible to keep your analysis current
• You want to work with a schema-on-demand data storage paradigm because of the variety of data types involved
• The data is arriving so fast at your organization that your traditional analytics platform cannot handle it

Critical Success Factors for Big Data

• These success factors are the same as the ones listed for previous large-scale IS projects (e.g., ERP, Data Warehouse, etc.)

Skills That Define a Data Scientist

[Figure: Soft Skills; Hard Skills]

Big Data and Stream Analytics

[Figure: A Use Case for Stream Analytics in Energy Industry]

Application Case: Big Data for Political Campaigns

• One of the most publicized application areas for big data analytics
• Experiences from recent presidential elections have illustrated the power of big data and analytics to acquire and energize millions of volunteers (a modern-era grassroots movement)
• From predicting election outcomes to targeting potential voters and donors, big data and analytics have a lot to offer to modern election campaigns

End of Chapter 8

• Questions?
• Comments…

Copyright

This work is protected by United States copyright laws and is provided solely for the use of instructors in teaching their courses and assessing student learning. Dissemination or sale of any part of this work (including on the World Wide Web) will destroy the integrity of the work and is not permitted. The work and materials from it should never be made available to students except by instructors using the accompanying text in their classes. All recipients of this work are expected to abide by these restrictions and to honor the intended pedagogical purposes and the needs of other instructors who rely on these materials.
