Big Data Analytics Lecture Notes PDF
Document Details
Beni-Suef University
2024
Dr. Mohamed Moustafa
Summary
These lecture notes cover Big Data Analytics, touching on class rules, course assessment, and the concept of big data itself. The document includes a discussion of data volume, velocity, and variety.
Full Transcript
10/17/2024
Big Data Analytics
Dr. Mohamed Moustafa
Associate Professor, Faculty of Computers and AI, Beni-Suef University
MIS Consultant, ICTP, Ministry of Higher Education

Class Rules
- You can do anything except make noise (chatting, singing…).
- Feel free to interrupt me if you have questions.
- According to university policy, attendance must be taken.
- Important: you must have 80% attendance to be able to sit for the final exam.

Course Assessment
Tentative, depending on the situation:
- Final exam: 50%
- Assignment: 20%, individual
- Project: 30%, 2-3 members per group; a report and a presentation are required.
- Important: cheating and plagiarism will receive no marks.

A few suggestions
- Your final grade is based on points, not on an accumulation of grades.
- You start the class with zero points and earn your way to your final grade.
- If you have an issue or problem, communicate: send me an email.
- If you know you are not going to meet the deadline for a quiz or assignment, email me BEFORE the deadline.

What is Big Data?

Big Data, what is it?
- Traditional computer science: data that will not fit in main memory. For example:
    - busy web server access logs
    - graph of the entire Web
    - all of Wikipedia
    - daily satellite imagery over a year
- Statistics: data with a large number of observations and/or features.
- Other fields: non-traditional sample size (i.e. > 100 subjects); can't analyze in stats tools (Excel).

What Is Big Data?
The term "big data" refers to data that is so large, fast, or complex that it is difficult to process using traditional methods.
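
When data will not fit in main memory, one common workaround is to process it as a stream, one record at a time, and keep only a small running summary. The sketch below illustrates this idea on web server access logs. The log layout (status code as the second-to-last field) and the `fake_log` generator are assumptions made up for this example; they are not part of the lecture.

```python
from collections import Counter
from typing import Iterable, Iterator


def status_counts(lines: Iterable[str]) -> Counter:
    """Count HTTP status codes one line at a time.

    Only the running counts are held in memory, never the whole log.
    """
    counts: Counter = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Assumed layout: "<ip> <method> <path> <status> <bytes>"
        counts[fields[-2]] += 1
    return counts


def fake_log(n: int) -> Iterator[str]:
    """Hypothetical generator standing in for a log file too large to load."""
    for i in range(n):
        status = "500" if i % 10 == 0 else "200"
        yield f"10.0.0.{i % 255} GET /page/{i} {status} 512"


counts = status_counts(fake_log(1000))
print(counts["200"], counts["500"])
```

With a real file, the open file object can be passed directly to `status_counts`, since iterating over a file in Python reads it line by line rather than loading it all at once.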

Volume of Big Data
Organizations collect data from a variety of sources, including:
- business transactions
- smart (IoT) devices
- industrial equipment
- videos
- social media
- …
In the past, storing it would have been a problem, but cheaper storage on platforms like data lakes and Hadoop has eased the burden.

Velocity of Big Data
With the growth of the Internet of Things, data streams into businesses at an unprecedented speed and must be handled in a timely manner. Big data techniques are driving the need to deal with these torrents of data in near-real time.

Variety of Big Data
Data comes in all types of formats:
- Structured: quantitative data that fits neatly within fixed fields and columns in relational databases and spreadsheets.
    - Numeric data
    - Database
    - …
- Unstructured: qualitative data that cannot be processed and analyzed using conventional tools and methods.
    - Text documents
    - Videos
    - Financial transactions
    - …

Big Data, what is it?
Analyses which can handle the 3 Vs and do it with quality (veracity) (Laney, 2001: META Group):
1. Volume: large quantity
2. Velocity: arriving quickly
3. Variety: [un]structured, multi-modal

Big Data, a type of analytics?
Data Insights!

DATA INTELLIGENCE

ANALYTICS
Copyright © SAS Institute Inc. All rights reserved.

What Is Analytics
The importance of big data doesn't revolve around how much data you have, but what you do with it.
Analytics is the scientific process of transforming data into insight for making better decisions, offering new opportunities for a competitive advantage.

Types of Analytics
- Predictive Analytics: predicting the future based on historical patterns. What could happen?
- Descriptive Analytics: mining historical data to provide business insights. What has happened?
- Prescriptive Analytics: enabling smart decisions based on data. What should we do?
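
The three types of analytics can be contrasted on a toy example. The sketch below is only an illustration: the monthly sales figures, the straight-line trend model, and the reorder rule are all hypothetical choices made for this example, not methods given in the lecture.

```python
from statistics import mean

# Hypothetical monthly unit sales, invented for illustration only.
sales = [100, 110, 120, 130, 140, 150]

# Descriptive: what has happened? Summarize the historical data.
average = mean(sales)

# Predictive: what could happen? Fit a least-squares line y = a + b*x
# to the history and extrapolate one month ahead.
n = len(sales)
xs = list(range(n))
x_bar = mean(xs)
b = sum((x - x_bar) * (y - average) for x, y in zip(xs, sales)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = average - b * x_bar
forecast = a + b * n  # predicted sales for the next month

# Prescriptive: what should we do? A hypothetical rule: order enough
# units to cover the forecast, given the stock already on hand.
stock_on_hand = 120  # assumed value
order = max(0, round(forecast) - stock_on_hand)

print(f"average={average}, forecast={forecast:.1f}, order={order}")
```

Each step builds on the previous one: the summary describes the past, the trend line projects it forward, and the decision rule turns the projection into an action.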

Analytics Buzzwords
- Machine learning
- Big data
- Business intelligence
- Data science
- Analytics
- Data mining

Data Science
[Diagram: Data Science at the intersection of Data Systems, Business Intelligence, Machine Learning, Business Analytics, Business Acumen, and Math or Statistics]
- Data Scientist: deep in one or two areas
- Data Science Team: all areas covered in depth

Big Data Tools

SAS
- No. 1 market leader in analytics.
- The largest independent vendor in the business intelligence market.
- The industry standard for Clinical Data Analysis.
- Integrated platform for end-to-end solutions. SAS provides an integrated set of software products and services and integrated technologies for information management, advanced analytics, and reporting.
- Business solutions across domains and industries.
- Unmatched domain-specific, industry-focused analytics solutions.

R
- R is a language and environment for statistical computing and graphics.
- R provides a wide variety of statistical and graphical techniques, and is highly extensible.

Hadoop
- Hadoop is the most popular big data ecosystem.
- Hadoop is highly scalable: it is designed to accommodate computation ranging from a single server to a cluster of thousands of machines.

Python
- Python is an interpreted, high-level, general-purpose programming language.
- One of the most popular programming languages in recent years.
- Ten areas that use Python most frequently:
    - Web Development
    - Game Development
    - Machine Learning and Artificial Intelligence
    - Data Science and Data Visualization
    - Desktop GUI
    - Web Scraping Applications
    - Business Applications
    - Audio and Video Applications
    - CAD Applications
    - Embedded Applications

Tableau
- Tableau is a data visualization tool that is widely used for business intelligence.
- Create interactive graphs and charts in the form of dashboards and worksheets to gain business insights.