Introduction to Big Data Concepts

Questions and Answers

What is the estimated global volume of data generated by 2025?

  • 250 zettabytes
  • 200 zettabytes
  • 175 zettabytes (correct)
  • 150 zettabytes

Which of the following best describes the data type that accounts for 80% of global data?

  • Unstructured data (correct)
  • Hierarchical data
  • Transactional data
  • Structured data

How is data typically divided and distributed in Hadoop Distributed File System (HDFS)?

  • Into small blocks of 128 MB or 256 MB (correct)
  • In whole files of 1 GB
  • In random unspecified sizes
  • Into chunks of 64 MB

Which of the following is NOT one of the 5 Vs of Big Data?

  • Variability (correct)

What percentage of global data is stored in relational databases?

  • Less than 20% (correct)

What defines a data lake in the context of data storage?

  • A place where raw data is stored without transformation (correct)

Which source generates approximately 500,000 units of data per minute?

  • Tweets (correct)

What is one of the main characteristics of Veracity in the 5 Vs of Big Data?

  • The trustworthiness and authenticity of data (correct)

    Study Notes

    Big Data

    • Data is crucial for decision-making in all business aspects.
    • By 2025, the world is estimated to generate 175 zettabytes of data (1 ZB = 1 trillion gigabytes).
    • In 2010, data production was significantly lower.
    • 2.5 million GB of data is created daily by internet users.
    • About 90% of the world's data was created in recent years.
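
The unit conversion behind the 175 ZB figure can be checked with a few lines of Python. This is a minimal sketch that assumes decimal (SI) units, where 1 GB = 10^9 bytes and 1 ZB = 10^21 bytes; the function name is illustrative only.

```python
# Decimal (SI) storage units, expressed in bytes (assumption: SI, not binary, units).
GB = 10**9
ZB = 10**21

def zettabytes_to_gigabytes(zb: int) -> int:
    """Convert a whole number of zettabytes to gigabytes."""
    return zb * (ZB // GB)

print(f"{zettabytes_to_gigabytes(1):,} GB in 1 ZB")
print(f"{zettabytes_to_gigabytes(175):,} GB in 175 ZB")
```

Under these assumptions, one zettabyte is 10^12 gigabytes, so 175 ZB is 175 trillion GB.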

    The 5 Vs of Big Data

    • Velocity: Batch, near real-time, real-time, and streaming data
    • Variety: Structured, unstructured, and semi-structured data, including different formats (e.g. tables, files, transactions)
    • Volume: Gigabytes, terabytes, and petabytes of data
    • Veracity: Trustworthiness, authenticity, origin, reputation, accountability in the data
    • Value: Statistical analysis, event identification, correlations, and potential insights from data

    Sources of Data

    • Facebook: 500,000 posts per minute
    • Twitter: 500,000 tweets per minute
    • Instagram: 347,222 posts per minute
    • Internet of Things (IoT): 75 million connected devices generating data and sensor readings
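
To get a feel for these per-minute rates, they can be scaled to a full day. The sketch below takes two of the figures listed above and multiplies them by the number of minutes in a day; the variable and function names are illustrative.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes

# Per-minute figures taken from the list above.
per_minute = {"Twitter": 500_000, "Instagram": 347_222}

def per_day(rate_per_minute: int) -> int:
    """Scale a per-minute rate to a per-day total."""
    return rate_per_minute * MINUTES_PER_DAY

for source, rate in per_minute.items():
    print(f"{source}: {per_day(rate):,} items per day")
```

At these rates, 500,000 items per minute is 720 million per day, and 347,222 per minute is just under 500 million per day.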

    Storage of Data

    • Less than 20% of global data is stored in relational databases (still important for organizations such as banks and hospitals, and for customer data).
    • 80% of data is unstructured (text, images, videos) and is stored in big data architectures and NoSQL databases.

    Big Data Storage Technologies (Hadoop Distributed File System - HDFS)

    • Divides data into small blocks (128 MB or 256 MB) and distributes them across multiple servers.
    • Provides redundancy by keeping copies of each block on different servers, so data survives the loss of a server.
    • Ideal for storing large amounts of unstructured or semi-structured data.
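
The block arithmetic described above can be sketched as follows. This is a simplified illustration, not HDFS code: the helper names are hypothetical, and the replication factor of 3 is an assumption (it is the usual HDFS default, but the notes only say "copies of data").

```python
MB = 1024 * 1024  # bytes in one mebibyte

def hdfs_block_count(file_size_bytes: int, block_size_bytes: int = 128 * MB) -> int:
    """Number of blocks needed for a file; the last block may be partly filled."""
    return -(-file_size_bytes // block_size_bytes)  # integer ceiling division

def raw_storage_bytes(file_size_bytes: int, replication: int = 3) -> int:
    """Total bytes stored across the cluster once every block is replicated."""
    return file_size_bytes * replication

one_gib = 1024 * MB
print(hdfs_block_count(one_gib))            # blocks at 128 MB
print(hdfs_block_count(one_gib, 256 * MB))  # blocks at 256 MB
print(raw_storage_bytes(one_gib) // MB)     # MB stored with 3 copies
```

For a 1,024 MB file this gives 8 blocks at 128 MB or 4 blocks at 256 MB, and 3,072 MB of raw storage when each block is kept three times.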

    Data Lakes

    • Centralized repository storing all types of data (structured, semi-structured, and unstructured) in raw format.
    • Used when needing long-term storage and analysis of diverse, raw data.
    • Ideal when the specific analysis type is unknown.
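
The idea of storing raw data first and deciding on the analysis later can be illustrated with a toy example. This sketch uses an in-memory list as a stand-in for the lake's raw storage; `raw_zone`, `ingest`, and `read_amounts` are hypothetical names, and a real data lake would sit on distributed or object storage.

```python
import json

raw_zone = []  # hypothetical stand-in for a data lake's raw storage

def ingest(record: str) -> None:
    """Store the record exactly as received, with no transformation."""
    raw_zone.append(record)

# Mixed inputs: two JSON documents with different fields and one free-text line.
ingest('{"user": "a", "amount": 12.5}')
ingest('{"user": "b", "amount": 7.0, "coupon": "X1"}')
ingest('free-text log line: sensor 4 offline')

def read_amounts(records) -> float:
    """Apply structure only at read time: sum 'amount' where it exists."""
    total = 0.0
    for record in records:
        try:
            doc = json.loads(record)
        except json.JSONDecodeError:
            continue  # not JSON; left untouched for other analyses
        if isinstance(doc, dict) and "amount" in doc:
            total += doc["amount"]
    return total

print(read_amounts(raw_zone))
```

All three records stay in the raw zone unchanged; only the analysis that needs `amount` interprets them, which is why this approach suits cases where the analysis type is not known up front.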

    NoSQL Databases

    • Designed for flexibility, high speed and handling unstructured data.

    Description

    Explore the transformative nature of big data and its crucial role in decision-making across various business functions. Learn about the 5 Vs of Big Data: Velocity, Variety, Volume, Veracity, and Value, and understand the exponential growth of data generated from different sources. Test your knowledge on how big data influences modern analytics.