Questions and Answers
What percentage of data within an enterprise is estimated to be unstructured?
Which of the following is an example of semi-structured data?
Which statement about unstructured data is true?
What is a characteristic of semi-structured data?
Which type of data typically has a faster growth rate?
Which characteristic of Big Data refers to the high speed of accumulation of data?
What does the 'Volume' characteristic of Big Data refer to?
Which of the following is NOT a characteristic of Big Data?
How is Big Data typically described in terms of the types of data it processes?
What is meant by the 'Value' characteristic of Big Data?
What type of solutions does the Data Technology sector primarily focus on?
The term 'Big Data' generally refers to collections of data that originate from how many sources?
Which of the following roles is most likely involved in the big data ecosystem?
What happens to predictive analytics models when the underlying conditions change?
What distinguishes prescriptive analytics from predictive analytics?
Which type of data is characterized by being stored in a relational database?
Which example is considered structured data?
What type of analytics provides insights into what actions to take and why they should be taken?
Which of the following is an example of machine-generated data?
What is a common application of prescriptive analytics?
What is one of the benefits of processing Big Data?
What kind of data includes social media interactions and user-generated content?
Which of the following accurately describes a dataset?
What is the main goal of data analysis?
How does data analysis benefit operational decisions related to sales?
What could be considered a dataset?
Which of the following is NOT one of the Five Vs of Big Data?
What aspect does data analysis primarily focus on?
Which of the following is an example of Big Data's impact on decision-making?
What is the primary focus of descriptive analytics?
Which type of analytics seeks to answer 'why' something has occurred?
What is the correct order of analytics from least complex to most complex?
Which of the following would be a sample question for diagnostic analytics?
What does prescriptive analytics aim to achieve?
What is a primary activity of predictive analytics?
What type of analysis is suitable for performing drill down and roll-up analysis?
Which of the following best describes data analytics?
Study Notes
DSC650: Data Technology and Future Emergence
- This course, DSC650, examines data technology and its future implications.
- The first lecture (Lecture 1) focuses on a general overview of data technology.
- The learning objective (CLO1) is for students to grasp fundamental concepts and practices in big data technology.
1.1 Overview of Data Technology
- The overview covers data technology evolution.
- It details an introduction to big data.
- The lecture explores the big data ecosystem.
- The foundation of big data technology is also explained.
- Related career outlook is discussed.
Data Technology
- Data technology (Data Tech) encompasses technologies associated with areas such as marketing technology (martech) and advertising technology (adtech).
- Data Tech includes solutions for data management.
- It involves products and services based on data generated by people and machines.
- These technologies are used to manage large datasets, create data management solutions, and collect data from various sources for business insights.
Data Technology Evolution
- The diagram shows the evolution of data technologies: relational databases, traditional DBMSs, object-oriented and object-relational databases, NoSQL (Big Data), digital technologies, and intelligent DBMSs.
- The evolution describes the progression from traditional database systems to more advanced big data solutions.
- The diagram shows links between these technologies, suggesting their interrelationship in modern data handling.
Big Data - An Introduction
- Big Data is a field concerned with the analysis, processing, and storage of large collections of data.
- These datasets frequently originate from many different sources.
- Big Data solutions typically combine multiple unrelated datasets.
- Processing often involves large amounts of unstructured data.
- The processing is frequently time-sensitive and aims to uncover hidden information.
Big Data - Characteristics (5V)
- Volume: Huge amount of data. Large volume signifies big data.
- Velocity: High speed data accumulation, continuous data flow.
- Variety: Data nature. Data is structured, semi-structured, and unstructured. Data sources are diverse.
- Veracity: Inconsistency and uncertainty in the data. It concerns how far the data can be trusted given its variability.
- Value: Extract valuable knowledge from the data. The data needs to be useful/valuable.
1.2 Big Data - An Introduction
- Big Data processing yields significant insights and benefits.
- These benefits include operational optimization, actionable intelligence, identifying new markets, accurate predictions, fault and fraud detection, detailed records, improved decision-making, and scientific discoveries.
Big Data Terminology: Datasets
- Datasets are collections of related data.
- Each data point within a dataset shares the same attributes or properties.
- Examples include tweets (in a flat file), image files (in a directory), database table extracts (in CSV format), and historical weather data (in XML format).
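The idea that every record in a dataset shares the same attributes can be checked mechanically. The following is a minimal sketch, assuming a small CSV extract of a database table; the column names and values are invented for illustration and do not come from the lecture.

```python
import csv
import io

# Hypothetical CSV extract of a database table (values invented for illustration).
CSV_EXTRACT = """customer_id,name,city
1,Aisha,Northtown
2,Ben,Southport
3,Chen,Eastfield
"""


def load_dataset(text):
    """Parse CSV text into a list of dict records, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def shared_attributes(records):
    """Return the set of attribute names if all records share it, else None."""
    if not records:
        return None
    first = set(records[0].keys())
    return first if all(set(r.keys()) == first for r in records) else None


records = load_dataset(CSV_EXTRACT)
print(len(records), sorted(shared_attributes(records)))
```

The same check would apply to the other examples (tweets in a flat file, weather data in XML) once they are parsed into records.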
Big Data Terminology: Data Analysis
- Data analysis is the process of examining data to identify facts, relationships, patterns, insights, and trends.
- The overall objective of data analysis is to support better decision-making.
- An example application of data analysis is analyzing ice cream sales data to determine sales volume related to daily temperature.
- Real-world data analysis helps establish patterns and relationships in the data being examined.
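As a rough illustration of the ice cream example, the sketch below measures how strongly sales move with temperature using the Pearson correlation coefficient. The observations are hypothetical, and correlation is only one of several possible ways to look for such a relationship.

```python
from math import sqrt

# Hypothetical daily observations: (temperature in degrees Celsius, ice creams sold).
observations = [(20, 110), (24, 150), (28, 200), (31, 240), (35, 290)]


def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


r = pearson(observations)
print(f"correlation between temperature and sales: {r:.3f}")
```

A value close to 1 indicates that sales tend to rise with temperature in this sample; it does not by itself show that temperature causes the sales.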
Big Data Terminology: Data Analytics
- Data analytics is a broader term that encompasses data analysis.
- It encompasses the complete lifecycle of data, including collecting, cleansing, organizing, storing, analyzing, and governing data.
- Data analytics describes the scope of comprehensive data management.
Four General Categories of Analytics
- Descriptive: Summarizes past data.
- Diagnostic: Examines the cause and reason behind past events.
- Predictive: Makes estimations regarding future events using past data patterns.
- Prescriptive: Provides recommendations and optimal actions based on predictive analysis.
Data Analytics: Descriptive Analytics
- Examines past events to answer specific questions.
- Summarizes/contextualizes data to produce insights.
- Example questions include sales volume over the past year, support calls by severity/location, or monthly commissions by sales agent.
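A descriptive question such as "support calls by severity/location" amounts to counting and grouping past records. A minimal sketch, assuming a hypothetical call log held in memory:

```python
from collections import Counter

# Hypothetical support-call log: (severity, location) per call.
calls = [
    ("high", "north"), ("low", "north"), ("low", "south"),
    ("medium", "south"), ("low", "north"), ("high", "south"),
]

# Summarize what happened: how many calls per severity and per location.
by_severity = Counter(severity for severity, _ in calls)
by_location = Counter(location for _, location in calls)

print(dict(by_severity))
print(dict(by_location))
```

In practice these summaries would usually come from a report or dashboard over an operational system rather than from hand-written code.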
Descriptive Analytics Tools
- Operational systems (like OLTP, CRM, ERP) are used with descriptive analytics tools.
- Reports and dashboards are created from these systems to visualize data.
Data Analytics: Diagnostic Analytics
- Aims to determine the cause of past events.
- Analyzes reasons behind observed events.
- Example questions include the factors behind reduced Q2 sales compared to Q1, support calls increasing in a particular region, or the reasons for a rise in patient readmissions.
- The analysis uncovers the reasons why the phenomenon occurred.
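The "why were Q2 sales lower than Q1" question can be approached by drilling down from the total into a finer dimension. The sketch below breaks the change down by region; the figures and region names are hypothetical.

```python
# Hypothetical sales totals by quarter and region.
sales = {
    "Q1": {"north": 500, "south": 420, "east": 380},
    "Q2": {"north": 510, "south": 300, "east": 375},
}


def change_by_region(before, after):
    """Difference in sales per region between two periods."""
    return {region: after[region] - before[region] for region in before}


changes = change_by_region(sales["Q1"], sales["Q2"])
total_change = sum(changes.values())
largest_drop = min(changes, key=changes.get)

print(f"total change: {total_change}")
print(f"change by region: {changes}")
print(f"largest drop: {largest_drop}")
```

A drill-down like this only locates where the change occurred. Explaining the cause would still require examining further data for that region.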
Data Analytics: Predictive Analytics
- Predicts future outcomes based on past events.
- Adds meaning to information by describing how pieces of information are related.
- Predictive models implicitly depend on the conditions under which the past events occurred.
- Models need adjustments if the underlying conditions change.
- Example questions include predicting loan defaults, patient survival rates, or whether a customer will purchase a product.
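To make the dependence on past conditions concrete, here is a minimal sketch of a predictive model: an ordinary least-squares line fitted to past data and then refitted after conditions change. The data and variable names are hypothetical, and a straight line stands in for whatever model a real project would use.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical history: monthly ad spend (thousands) against units sold.
spend = [1, 2, 3, 4, 5]
units_old = [12, 14, 16, 18, 20]
old_model = fit_line(spend, units_old)

# Hypothetical data gathered after market conditions changed.
units_new = [7, 8, 9, 10, 11]
new_model = fit_line(spend, units_new)

print(predict(old_model, 6), predict(new_model, 6))
```

The two models give different forecasts for the same input, which is why a model built under old conditions needs to be adjusted or refitted.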
Data Analytics: Prescriptive Analytics
- Builds on predictive results to suggest actions.
- Determines best actions to take.
- Offers insights based on potential scenarios and risk mitigation.
- Examines various results and potential factors.
- Example questions include the best drug among options, the best time to trade a stock, and optimal risk mitigation strategies.
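One simple way to turn predictions into a recommendation is to weigh each candidate action's payoff by the predicted probability of each scenario and pick the action with the highest expected payoff. This is a sketch under that assumption; the scenarios, probabilities, actions, and payoffs are all invented for illustration.

```python
# Hypothetical scenario probabilities, as if produced by a predictive model.
scenario_probability = {"demand_up": 0.5, "demand_flat": 0.25, "demand_down": 0.25}

# Hypothetical payoff of each candidate action under each scenario.
payoffs = {
    "expand_stock": {"demand_up": 100, "demand_flat": 20, "demand_down": -80},
    "hold_stock": {"demand_up": 40, "demand_flat": 30, "demand_down": 0},
    "reduce_stock": {"demand_up": -20, "demand_flat": 10, "demand_down": 40},
}


def expected_payoff(action):
    """Probability-weighted payoff of an action across all scenarios."""
    return sum(p * payoffs[action][s] for s, p in scenario_probability.items())


def worst_case(action):
    """Lowest payoff the action can produce, as a simple risk measure."""
    return min(payoffs[action].values())


best_action = max(payoffs, key=expected_payoff)
safest_action = max(payoffs, key=worst_case)

print(best_action, expected_payoff(best_action))
print(safest_action, worst_case(safest_action))
```

The highest expected payoff and the best worst case point to different actions here, which shows why prescriptive analytics considers risk mitigation alongside the predicted outcome.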
Big Data Usages
- Big data technologies are applied across many industries.
- The chart displays usage rates by industry.
- These usage rates suggest that big data is applied extensively across multiple sectors.
Types of Data: Structured Data
- Conforms to data models or schemas, typically stored in tabular form.
- Used to record relationships between entities.
- Commonly found in relational databases used by enterprise applications such as ERP and CRM systems.
- Examples include banking transactions, invoices, and customer details.
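The sketch below shows structured data in the sense described above: two tables with a fixed schema and a relationship between the entities, held in an in-memory SQLite database. The table names, columns, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema fixes the attributes and types of every row in advance.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE invoices ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customers(id), "
    "amount REAL NOT NULL)"
)

# Hypothetical rows.
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Aisha"), (2, "Ben")])
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [(10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.5)],
)

# The customer-invoice relationship is recorded through customer_id.
totals = conn.execute(
    "SELECT c.name, SUM(i.amount) "
    "FROM customers c JOIN invoices i ON i.customer_id = c.id "
    "GROUP BY c.id ORDER BY c.id"
).fetchall()
conn.close()

print(totals)
```

Because the schema is known up front, the relationship between customers and invoices can be queried directly with a join.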
Types of Data: Unstructured Data
- Does not adhere to data models or schemas.
- Represents the greater part (estimated 80%) of data in a company.
- Has a faster growth rate than structured data.
- Often textual or binary data formats like text files, images, audio, and video.
- The classification depends on data format, not its actual content.
Types of Data: Semi-structured Data
- Contains a degree of structure and consistency but lacks full relational format.
- Commonly stored in hierarchical or graph-based formats.
- Examples include XML and JSON files, EDI files, spreadsheets, and sensor data.
- Semi-structured data is easier to process than unstructured data due to its structural elements.
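The following sketch illustrates why semi-structured data is easier to process than unstructured data while still lacking a fixed schema. It parses a small, hypothetical JSON document of sensor readings in which the records are hierarchical and do not all carry the same fields.

```python
import json

# Hypothetical sensor readings; the second record has extra fields.
raw = """
[
  {"sensor": "t-01", "reading": {"temp_c": 21.5}},
  {"sensor": "t-02", "reading": {"temp_c": 19.0, "humidity": 0.6}, "tags": ["roof"]}
]
"""

records = json.loads(raw)

# Field names are self-describing, so they can be discovered from the data.
all_fields = sorted({key for record in records for key in record})

# Nested values are reachable by key; optional ones need a default.
temps = [record["reading"]["temp_c"] for record in records]
humidity = [record["reading"].get("humidity") for record in records]

print(all_fields)
print(temps, humidity)
```

The keys give enough structure to extract values reliably, but code has to tolerate missing fields, which a relational schema would have ruled out.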
Big Data Ecosystem
- The diagram illustrates the big data ecosystem, from data sources through to consumption.
- Data sources are detailed in the figure; examples include SAP and PeopleSoft.
- Ingestion methods (e.g., MQ Series, Informatica) and storage methods (e.g., EDW, OLAP, HDFS) are illustrated.
- Exploration methods and consumption methods (e.g., custom solutions, parameterised reports, dashboards) are shown.
Big Data Architecture - Technology Foundation
- Displays the layers of a big data architecture, starting from internet feeds and applications.
- Different kinds of databases (structured, unstructured, semi-structured) are used at the operational base.
- The architecture contains features like interfaces, security systems, and redundant physical infrastructure.
Big Data Career Path
- This section displays various big data job titles and their expected salary ranges.
Description
This quiz focuses on the first lecture of DSC650, which provides an overview of data technology. It covers the evolution of data technology, big data introduction, and associated career outlook. Explore the foundational concepts and practices that shape the future of big data technology.