DSC650: Data Technology Overview
37 Questions

Questions and Answers

What percentage of data within an enterprise is estimated to be unstructured?

  • 70%
  • 90%
  • 80% (correct)
  • 50%

Which of the following is an example of semi-structured data?

  • Text file of tweets
  • Video file
  • Audio file
  • JSON file (correct)

Which statement about unstructured data is true?

  • It only contains textual data.
  • It is easily processed compared to structured data.
  • It has a defined schema.
  • It consists of self-contained files that are non-relational. (correct)

    What is a characteristic of semi-structured data?

    It is hierarchical or graph-based. (D)

    Which type of data typically has a faster growth rate?

    Unstructured data (D)

    Which characteristic of Big Data refers to the high speed of accumulation of data?

    Velocity (B)

    What does the 'Volume' characteristic of Big Data refer to?

    The large amount of data (D)

    Which of the following is NOT a characteristic of Big Data?

    Simplicity (D)

    How is Big Data typically described in terms of the types of data it processes?

    Structured, semi-structured, and unstructured data (A)

    What is meant by the 'Value' characteristic of Big Data?

    It must be turned into something valuable for insights (D)

    What type of solutions does the Data Technology sector primarily focus on?

    Data management solutions and services (D)

    The term 'Big Data' generally refers to collections of data that originate from how many sources?

    Multiple, often unrelated sources (A)

    Which of the following roles is most likely involved in the big data ecosystem?

    Data Analyst (D)

    What happens to predictive analytics models when the underlying conditions change?

    They need to be updated to remain effective. (C)

    What distinguishes prescriptive analytics from predictive analytics?

    Prescriptive analytics prescribes actions based on reasons. (A)

    Which type of data is characterized by being stored in a relational database?

    Structured data. (D)

    Which example is considered structured data?

    Banking transactions. (C)

    What type of analytics provides insights into what actions to take and why they should be taken?

    Prescriptive analytics. (D)

    Which of the following is an example of machine-generated data?

    Web logs. (D)

    What is a common application of prescriptive analytics?

    Determining the best selling price for a product. (A)

    What is one of the benefits of processing Big Data?

    Operational optimization (A)

    What kind of data includes social media interactions and user-generated content?

    Unstructured data. (B)

    Which of the following accurately describes a dataset?

    A collection where each member shares the same attributes (B)

    What is the main goal of data analysis?

    To support better decision making (A)

    How does data analysis benefit operational decisions related to sales?

    By linking sales data to trends like temperature (C)

    What could be considered a dataset?

    An extract of rows from a database in CSV format (D)

    Which of the following is NOT one of the Five Vs of Big Data?

    Verification (B)

    What aspect does data analysis primarily focus on?

    Finding facts, relationships, patterns, insights and trends (D)

    Which of the following is an example of Big Data's impact on decision-making?

    Using predictive analytics to optimize ice-cream orders (C)

    What is the primary focus of descriptive analytics?

    To understand past events. (D)

    Which type of analytics seeks to answer 'why' something has occurred?

    Diagnostic analytics (D)

    What is the correct order of analytics from least complex to most complex?

    Descriptive, Diagnostic, Predictive, Prescriptive (D)

    Which of the following would be a sample question for diagnostic analytics?

    Why have sales declined in Q2? (C)

    What does prescriptive analytics aim to achieve?

    To identify the best course of action. (B)

    What is a primary activity of predictive analytics?

    Forecasting future events (C)

    What type of analysis is suitable for performing drill down and roll-up analysis?

    Diagnostic analytics (B)

    Which of the following best describes data analytics?

    The management of the complete data lifecycle. (B)

    Flashcards

    Data Technology (DataTech)

    The use of technology to manage, analyze, and create value from data, encompassing areas like marketing and advertising.

    Big Data

    The collection, processing, and analysis of massive datasets, often diverse and rapidly changing, to uncover valuable insights.

    Data Variety

    The different types and formats of data that make up Big Data, including structured, semi-structured, and unstructured data.

    Data Velocity

    The speed at which data is generated, collected, and processed in Big Data.

    Data Volume

    The sheer size and scale of data in Big Data, often exceeding the capabilities of traditional data management systems.

    Data Veracity

    The accuracy and reliability of data in Big Data, often a challenge due to inconsistencies and uncertainty.

    Data Value

    Describes the potential value that can be extracted from Big Data, such as improved decision-making, new business models, and enhanced efficiency.

    Data Integration

    A combination of related datasets from different sources, analyzed together to reveal hidden patterns and insights.

    Unstructured Data

    Data that does not conform to a predefined data model or schema. It lacks a rigid structure. Examples include text files, images, audio, and video.

    Semi-Structured Data

    Data that has some level of structure but is not relational. It's organized hierarchically or in a graph-like manner. Examples include XML and JSON files.

    Structured Data

    Data that is organized in a structured format, often in rows and columns. It's easily analyzed using traditional methods. Examples include databases.

    Big Data Analytics

    The process of analyzing large datasets to identify patterns, trends, and insights. This helps businesses make better decisions and solve problems.

    Big Data Ecosystem

    The ecosystem encompasses the tools, technologies, and processes used for managing, analyzing, and extracting value from large datasets.

    Data Analysis

    Understanding and interpreting patterns, trends, insights, and relationships within data to make informed decisions.

    Dataset

    A collection of related data points with similar characteristics.

    Operational Optimization

    Optimizing processes, improving efficiency, and enhancing performance based on data insights.

    Identification of New Markets

    Discovering new potential customers, markets, or opportunities based on data analysis.

    Accurate Predictions

    Using data to predict future outcomes, trends, or events.

    Fault and Fraud Detection

    Analyzing data to detect unusual patterns or anomalies that might indicate errors, fraud, or security threats.

    Improved Decision-making

    Making informed, data-backed choices to enhance decision-making quality.

    What is predictive analytics?

    Predictive analytics uses past data to forecast future outcomes.

    What is a limitation of predictive analytics?

    Predictive models rely on existing conditions. If those conditions change, the models need updating.

    What is prescriptive analytics?

    It goes beyond predictions, recommending specific actions to take.

    What makes prescriptive analytics unique?

    Prescriptive analytics considers not only the best action but also the reasons behind it.

    What is structured data?

    This type of data follows a specific structure and format, often stored in tables.

    What's an example of machine-generated data?

    Examples include web logs, sensor data, and machine telemetry.

    What's an example of human-generated data?

    Examples include social media posts, emails, and blog comments.

    Where does structured data typically come from?

    Structured data often comes from business applications like ERP and CRM systems.

    Four general categories of analytics

    These are categorized as Descriptive, Diagnostic, Predictive and Prescriptive.

    Descriptive Analytics

    Descriptive analytics focuses on understanding past events by summarizing and organizing information. It helps answer questions about what happened.

    Descriptive Analytics Tools

    Descriptive analytics tools extract data from operational systems and present it through reports and dashboards to provide insights.

    Diagnostic Analytics

    Diagnostic analytics delves into the reasons behind past events by uncovering the cause and effect relationships. It focuses on understanding 'why' things happened.

    Drill-down and Roll-up Analysis

    Diagnostic analytics can be used to perform drill-down (deeper analysis) and roll-up (summarizing) analysis to understand data patterns.

    Predictive Analytics

    Predictive analytics aims to predict future events by identifying trends and patterns from past data. It helps make predictions about what might happen.

    Prescriptive Analytics

    Prescriptive analytics provides recommendations and actionable strategies based on predictive insights. It aims to advise on the best course of action to take.

    Study Notes

    DSC650: Data Technology and Future Emergence

    • This course, DSC650, examines data technology and its future implications.
    • Lecture 1 gives a general overview of data technology.
    • The learning objective (CLO1) is for students to grasp fundamental concepts and practices in big data technology.

    1.1 Overview of Data Technology

    • The overview covers the evolution of data technology.
    • It introduces big data.
    • It explores the big data ecosystem.
    • It explains the technology foundation of big data.
    • It discusses the related career outlook.

    Data Technology

    • Data technology (Data Tech) encompasses technologies associated with areas like martech and adtech.
    • Data Tech includes solutions for data management.
    • It involves products and services based on data generated by people and machines.
    • These technologies are used to manage large datasets, create data management solutions, and collect data from various sources for business insights.

    Data Technology Evolution

    • The lecture diagram traces the evolution of data technologies: relational databases, traditional DBMSs, object-oriented and object-relational databases, NoSQL (Big Data), digital technologies, and intelligent DBMSs.
    • The evolution describes the progression from traditional database systems to more advanced big data solutions.
    • The diagram shows links between these technologies, suggesting their interrelationship in modern data handling.

    Big Data - An Introduction

    • Big Data is defined as the analysis, processing, and storage of large datasets.
    • These datasets often originate from multiple sources.
    • Big Data work frequently combines multiple unrelated datasets.
    • Processing often involves large amounts of unstructured data.
    • Processing is typically time-sensitive and aims to extract hidden information.

    Big Data - Characteristics (5V)

    • Volume: The huge amount of data. Large volume is the defining trait of big data.
    • Velocity: The high speed at which data accumulates, often as a continuous flow.
    • Variety: The nature of the data, which can be structured, semi-structured, or unstructured and comes from diverse sources.
    • Veracity: The accuracy and reliability of the data, which inconsistencies and uncertainty can undermine.
    • Value: The useful knowledge that can be extracted. Data must be turned into something valuable.

    1.2 Big Data - An Introduction

    • Big Data processing yields significant insights and benefits.
    • These benefits include operational optimization, actionable intelligence, identifying new markets, accurate predictions, fault and fraud detection, detailed records, improved decision-making, and scientific discoveries.

    Big Data Terminology: Datasets

    • Datasets are collections of related data.
    • Each data point within a dataset shares the same attributes or properties.
    • Examples include tweets (in a flat file), image files (in a directory), database table extracts (in CSV format), and historical weather data (in XML format).
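
As a minimal sketch of the "same attributes" idea, the Python snippet below treats a small CSV extract as a dataset and checks that every record exposes the same set of attribute names. The sample rows, the `CSV_EXTRACT` constant, and both helper functions are invented for illustration and do not come from the lecture.

```python
import csv
import io

# Hypothetical CSV extract of a database table (invented sample rows).
CSV_EXTRACT = """customer_id,name,city
1,Aisha,Kuala Lumpur
2,Ben,Penang
3,Chen,Johor Bahru
"""


def load_dataset(text):
    """Parse CSV text into a list of dicts, one dict per record."""
    return list(csv.DictReader(io.StringIO(text)))


def shares_same_attributes(records):
    """Return True if every record has exactly the same attribute names."""
    if not records:
        return True
    expected = set(records[0].keys())
    return all(set(record.keys()) == expected for record in records)


dataset = load_dataset(CSV_EXTRACT)
print(len(dataset), shares_same_attributes(dataset))  # prints: 3 True
```

The same check returns False for a mixed collection such as `[{"a": 1}, {"b": 2}]`, which would not count as a dataset under the definition above.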

    Big Data Terminology: Data Analysis

    • Data analysis is the process of examining data to identify facts, relationships, patterns, insights, and trends.
    • The overall objective of data analysis is to support better decision-making.
    • An example application of data analysis is analyzing ice cream sales data to determine sales volume related to daily temperature.
    • Real-world data analysis helps establish patterns and relationships in the data being examined.
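
The ice cream example can be sketched in a few lines of Python. The figures below are invented sample data, and the Pearson correlation coefficient is only one possible way to quantify the relationship between temperature and sales; the lecture does not name a specific measure.

```python
import math

# Invented sample data: daily temperature (degrees C) and ice cream units sold.
temperatures = [24, 26, 28, 30, 32, 34]
units_sold = [110, 125, 150, 170, 195, 210]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


r = pearson(temperatures, units_sold)
print(f"correlation between temperature and sales: {r:.3f}")
```

A value close to 1 indicates that sales rise with temperature in this sample, which is the kind of pattern that could inform how much ice cream to order.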

    Big Data Terminology: Data Analytics

    • Data analytics is a broader term than data analysis.
    • It encompasses the complete lifecycle of data, including collecting, cleansing, organizing, storing, analyzing, and governing data.
    • In other words, data analytics covers the management of the whole data lifecycle, with analysis as one part of it.

    Four General Categories of Analytics

    • Descriptive: Summarizes past data.
    • Diagnostic: Examines the cause and reason behind past events.
    • Predictive: Makes estimations regarding future events using past data patterns.
    • Prescriptive: Provides recommendations and optimal actions based on predictive analysis.

    Data Analytics: Descriptive Analytics

    • Examines past events to answer specific questions.
    • Summarizes/contextualizes data to produce insights.
    • Example questions include sales volume over the past year, support calls by severity/location, or monthly commissions by sales agent.
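
A descriptive question such as "monthly commissions by sales agent" amounts to grouping and summing past records. The sketch below assumes a hypothetical list of commission records; the field names and values are invented for illustration.

```python
from collections import defaultdict

# Invented commission records; field names are assumptions for this sketch.
commissions = [
    {"month": "2024-01", "agent": "Lee", "amount": 1200.0},
    {"month": "2024-01", "agent": "Siti", "amount": 900.0},
    {"month": "2024-02", "agent": "Lee", "amount": 700.0},
    {"month": "2024-02", "agent": "Siti", "amount": 1100.0},
    {"month": "2024-02", "agent": "Lee", "amount": 300.0},
]


def total_by(records, key):
    """Sum the 'amount' field of the records, grouped by the given field."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["amount"]
    return dict(totals)


by_agent = total_by(commissions, "agent")
by_month = total_by(commissions, "month")
print(by_agent)
print(by_month)
```

The output only summarizes what already happened, which is why descriptive analytics sits at the simplest end of the four categories.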

    Descriptive Analytics Tools

    • Operational systems (like OLTP, CRM, ERP) are used with descriptive analytics tools.
    • Reports and dashboards are created from these systems to visualize data.

    Data Analytics: Diagnostic Analytics

    • Aims to determine the cause of past events.
    • Analyzes reasons behind observed events.
    • Example questions include the factors behind reduced Q2 sales compared to Q1, support calls increasing in a particular region, or the reasons for a rise in patient readmissions.
    • The analysis uncovers the reasons why the phenomenon occurred.
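
The flashcards associate diagnostic analytics with drill-down and roll-up analysis. A minimal sketch of that idea, using invented quarterly sales figures: roll up to quarter level to confirm the decline, then drill down by region to see where it came from. The `roll_up` helper and the data are assumptions for illustration only.

```python
# Invented sales records for two quarters and two regions.
sales = [
    {"quarter": "Q1", "region": "North", "amount": 500},
    {"quarter": "Q1", "region": "South", "amount": 450},
    {"quarter": "Q2", "region": "North", "amount": 510},
    {"quarter": "Q2", "region": "South", "amount": 300},
]


def roll_up(records, *dimensions):
    """Total the 'amount' field, grouped by the given dimension fields."""
    totals = {}
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key] = totals.get(key, 0) + record["amount"]
    return totals


# Roll-up: totals per quarter show that Q2 is lower than Q1.
by_quarter = roll_up(sales, "quarter")

# Drill-down: the same totals split by region show where the drop occurred.
by_quarter_region = roll_up(sales, "quarter", "region")
change_by_region = {
    region: by_quarter_region[("Q2", region)] - by_quarter_region[("Q1", region)]
    for region in ("North", "South")
}
print(by_quarter)
print(change_by_region)
```

In this sample the South region accounts for the whole decline, which narrows down where to look for a cause. Finding the actual reason would still require further investigation.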

    Data Analytics: Predictive Analytics

    • Predicts future outcomes based on past events.
    • Adds meaning to information by helping explain relationships in the data.
    • Predictive models depend on the conditions under which the past events occurred.
    • If those underlying conditions change, the models need to be updated.
    • Example questions include predicting loan defaults, patient survival rates, or whether a customer will purchase a product.
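
As a rough sketch of prediction from past data, the snippet below fits a straight line to invented historical observations with ordinary least squares and uses it to estimate a future value. Linear regression is chosen here only as a simple stand-in; the lecture does not prescribe a particular model, and all names and numbers are hypothetical.

```python
# Invented history: daily temperature (degrees C) and units sold.
past_temps = [20, 25, 30, 35]
past_sales = [100, 150, 200, 250]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new input value."""
    slope, intercept = model
    return slope * x + intercept


model = fit_line(past_temps, past_sales)
forecast = predict(model, 28)
print(f"forecast sales at 28 degrees: {forecast:.0f}")
```

The fitted line only reflects the conditions in the historical data. If those conditions change, for example after a price change, `fit_line` would need to be rerun on newer observations.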

    Data Analytics: Prescriptive Analytics

    • Builds on predictive results to suggest actions.
    • Determines best actions to take.
    • Offers insights based on potential scenarios and risk mitigation.
    • Examines various results and potential factors.
    • Example questions include the best drug among options, the best time to trade a stock, and optimal risk mitigation strategies.
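
The quiz names "determining the best selling price for a product" as a common prescriptive application. A minimal sketch, assuming a hypothetical demand model has already been produced by a predictive step: evaluate each candidate action and recommend the one with the best expected outcome. The demand formula, unit cost, and candidate prices are all invented.

```python
def expected_demand(price):
    """Hypothetical demand model (assumed output of a predictive step)."""
    return max(0.0, 500 - 40 * price)


def expected_profit(price, unit_cost=2.0):
    """Expected profit at a given price under the assumed demand model."""
    return (price - unit_cost) * expected_demand(price)


# Candidate actions to choose between.
candidate_prices = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

# Prescriptive step: pick the action with the highest expected profit.
best_price = max(candidate_prices, key=expected_profit)
print(best_price, expected_profit(best_price))  # prints: 7.0 1100.0
```

Listing the expected profit for every candidate, not just the winner, also gives the reasoning behind the recommendation, which matches the flashcard point that prescriptive analytics explains why an action should be taken.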

    Big Data Usages

    • The lecture chart shows big data usage rates by industry.
    • These rates suggest that big data is applied widely across multiple sectors.

    Types of Data: Structured Data

    • Conforms to data models or schemas, typically stored in tabular form.
    • Used to record relationships between entities.
    • Commonly found in relational databases used by enterprise applications such as ERP and CRM systems.
    • Examples include banking transactions, invoices, and customer details.
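
To make the points about schemas and relationships concrete, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The `account` and `txn` tables, their columns, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory relational database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE account (
        id INTEGER PRIMARY KEY,
        holder TEXT NOT NULL
    );
    CREATE TABLE txn (
        id INTEGER PRIMARY KEY,
        account_id INTEGER NOT NULL REFERENCES account(id),
        amount REAL NOT NULL
    );
    """
)
conn.executemany(
    "INSERT INTO account (id, holder) VALUES (?, ?)",
    [(1, "Aisha"), (2, "Ben")],
)
conn.executemany(
    "INSERT INTO txn (account_id, amount) VALUES (?, ?)",
    [(1, 250.0), (1, -40.0), (2, 90.0)],
)

# The join follows the relationship between the two entities.
balances = conn.execute(
    """
    SELECT a.holder, SUM(t.amount)
    FROM account AS a
    JOIN txn AS t ON t.account_id = a.id
    GROUP BY a.id
    ORDER BY a.id
    """
).fetchall()
print(balances)
conn.close()
```

Every row must fit the declared columns, and the `account_id` column records which account each transaction belongs to. That fixed schema is what makes the data structured.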

    Types of Data: Unstructured Data

    • Does not adhere to data models or schemas.
    • Represents the greater part (estimated 80%) of data in a company.
    • Has a faster growth rate than structured data.
    • Often textual or binary data formats like text files, images, audio, and video.
    • The classification depends on data format, not its actual content.

    Types of Data: Semi-structured Data

    • Contains a degree of structure and consistency but lacks full relational format.
    • Commonly stored in hierarchical or graph-based formats.
    • Examples include XML and JSON files, EDI files, spreadsheets, and sensor data.
    • Semi-structured data is easier to process than unstructured data due to its structural elements.
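
A short sketch of why semi-structured data is easier to process than unstructured data: a JSON document carries its own field names and nesting, so values can be reached by key even when records differ. The sensor document below is invented for illustration.

```python
import json

# Invented sensor document: hierarchical, and records need not share all fields.
RAW = """
{
  "station": "KL-01",
  "readings": [
    {"time": "08:00", "temp_c": 27.5},
    {"time": "09:00", "temp_c": 29.0, "humidity": 0.71}
  ]
}
"""

doc = json.loads(RAW)

# Navigate the hierarchy by key: station -> readings -> individual fields.
temps = [reading["temp_c"] for reading in doc["readings"]]

# Optional fields are allowed; .get() returns None where a field is absent.
humidities = [reading.get("humidity") for reading in doc["readings"]]

print(doc["station"], temps, humidities)
```

Unlike a relational table, nothing forces both readings to carry the same fields, which is the sense in which the structure is only partial.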

    Big Data Ecosystem

    • The lecture diagram illustrates the big data ecosystem.
    • It lists data sources, with examples such as SAP and PeopleSoft.
    • It shows ingestion methods (e.g., MQ Series, Informatica) and storage methods (e.g., EDW, OLAP, HDFS).
    • It also shows exploration methods and consumption methods (e.g., custom solutions, parameterised reports, dashboards).

    Big Data Architecture - Technology Foundation

    • The lecture diagram displays the layers of a big data architecture, starting from internet feeds and applications.
    • Different kinds of databases (structured, unstructured, semi-structured) are used at the operational base.
    • The architecture contains features like interfaces, security systems, and redundant physical infrastructure.

    Big Data Career Path

    • This section displays various big data job titles and their expected salary ranges.

    Description

    This quiz focuses on the first lecture of DSC650, which provides an overview of data technology. It covers the evolution of data technology, big data introduction, and associated career outlook. Explore the foundational concepts and practices that shape the future of big data technology.
