Introduction to Data Science
36 Questions

Questions and Answers

Which of the following is NOT considered data?

  • Conclusions (correct)
  • Observations
  • Figures
  • Raw facts

    Data can only be obtained from scientific experiments.

    False

    Why is data important in data science?

    Data is important because it serves as the foundation for analysis, leading to insights and informed decisions.

    A ______ typically refers to a visual representation of data.

    figure

    What percentage of users prefer using a mobile app for online shopping according to the survey?

    80%

    Observations are considered raw data and serve as the basis for analysis.

    True

    Give an example of a figure that is used in data analysis.

    A scatter plot showing the relationship between study hours and exam scores.

    Match the types of data with their definitions:

    • Raw facts = Specific pieces of information derived from analysis
    • Figures = Visual representations of data
    • Observations = Specific instances or data points collected
    • Surveys = Method for gathering user preferences or behavioral data

    Which customer had the highest satisfaction rating?

    Customer ID: 23456

    Structured data is easily searchable and analyzable by algorithms.

    True

    What comment did Customer ID 12345 make about their purchase?

    The product arrived earlier than expected. Very satisfied with the service.

    Customer ID 67890 rated their satisfaction as a _____ on a scale of 1 to 5.

    2

    Match the following types of data with their characteristics:

    • Structured Data = Organized in a specific format
    • Unstructured Data = Not organized and difficult to analyze
    • Numerical Data = Represented in numbers
    • Text Data = Composed of words and phrases

    Which of the following is NOT an example of data mentioned?

    Every product sold

    All observations from customers can help identify patterns in customer satisfaction.

    True

    What specific complaint did Customer ID 67890 mention?

    The product quality was not as expected.

    Which of the following is an example of structured data?

    Databases

    Unstructured data has a pre-defined data model.

    False

    What data format uses keys and values and can represent a hierarchical structure?

    JSON

    A _______________ is a plain text file format where values are separated by commas.

    CSV

    Match the following data types with their characteristics:

    • SQL = Structured data storage
    • Hadoop = Distributed storage and processing of large datasets
    • NLP = Extracting insights from text
    • Images = Visual data type

    Which type of data often requires unstructured data analytics programs for processing?

    Text documents

    All structured data can be easily organized into a tabular format.

    True

    Name one type of unstructured data used in data science.

    Text data

    Which of the following is NOT an example of unstructured data?

    Sales revenue data

    Big data is characterized by higher volume, variety, and velocity compared to traditional data.

    True

    What technique is used to analyze unstructured video data?

    Video analysis techniques, such as object detection

    Quantitative data is often measured in ______.

    numbers

    Match the types of unstructured data with their analysis techniques:

    • Audio Data = Speech recognition
    • Video Data = Object detection
    • Social Media Data = Social media analytics
    • Web Data = Web scraping

    Which method is commonly applied to extract information from unstructured web data?

    Web scraping

    Qualitative data is primarily numerical and can be counted.

    False

    List one example of quantitative data.

    Age, weight, revenue, or sales numbers.

    What is the primary purpose of quantitative data?

    To perform numerical calculations and statistical inferences

    Qualitative data is often based on numerical values and can be easily measured.

    False

    Name one method of collecting qualitative data.

    Interviews

    Qualitative data often appears in __________ form.

    narrative

    Study Notes

    Introduction to Data Science

    • Data science relies on data as the foundation for analysis.
    • Data, in the context of data science, encompasses raw facts, figures, observations, and information.
    • Data is collected, stored and analyzed to gain insights, support informed decision-making and improve applications.
    • Diverse sources include sensors, surveys, social media, business transactions and scientific experiments.
    • Data examples: the average customer age in a dataset is 35, product sales increased 20% last quarter or 80% of users prefer a mobile app for online shopping.

    What is Data: Raw Facts

    • Raw facts are specific pieces of information deduced from data analysis.
    • Examples of raw facts: the average age of customers within a dataset is 35 years, product sales have increased by 20% compared to the previous quarter, or 80% of users prefer mobile apps for online shopping.

    What is Data: Figures

    • Figures usually represent visualizations of data, illustrating patterns, trends and relationships within a dataset.
    • Example: a scatter plot used to illustrate the relation between study hours and exam scores.
    • Scatter plots visually show the connection between these two factors.

    What is Data: Observations

    • Observations refer to specific instances or data points gathered from experiments, surveys, studies or other research activities.
    • Observations form the bedrock for data analysis.
    • Example: Observations related to customer satisfaction surveys contain customer ID, date of purchase, satisfaction rating (on a scale of 1-5), and comments.
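As a rough sketch, one survey observation could be modeled as a small record with the four fields listed above. The class name, field names, dates and ratings below are assumptions made for illustration; only the comment texts echo the quiz answers.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SurveyObservation:
    """One row of a customer satisfaction survey (field names are illustrative)."""
    customer_id: str
    purchase_date: date
    rating: int  # satisfaction on a 1-5 scale
    comment: str


# Hypothetical observations: the dates and the first rating are invented.
observations = [
    SurveyObservation("12345", date(2024, 1, 15), 5,
                      "The product arrived earlier than expected."),
    SurveyObservation("67890", date(2024, 1, 17), 2,
                      "The product quality was not as expected."),
]

# Each observation is a single data point; analysis starts by aggregating them.
average_rating = sum(o.rating for o in observations) / len(observations)
low_ratings = [o.customer_id for o in observations if o.rating <= 2]

print(average_rating)  # 3.5
print(low_ratings)     # ['67890']
```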

    Data Sources

    • Data can come from various sources, including computers, mobile devices, cameras, sensors, wearables, social media interactions, saved files, photos and queries.
    • Even simple actions like getting directions generate data.

    Types of Data

    • Data exists in numerous forms: numbers, text, images, audio, video and more.
    • In data science, data types are categorized as structured and unstructured.

    Structured Data

    • Structured data is organized in a defined format that algorithms can process easily.
    • This format facilitates searching and analysis for both humans and machines.
    • Structured data is suitable for quantitative analysis.
    • It's often tabular, employing rows and columns that explicitly define data attributes.

    Structured Data Examples

    • Databases: Information stored in relational databases such as MySQL or PostgreSQL, organized as tables with rows and columns.

    • Spreadsheets: Data arranged in rows and columns in spreadsheets such as Microsoft Excel or Google Sheets.

    • CSV (Comma-Separated Values) Files: Plain text files where values are separated by commas and each line represents a new record.

    • JSON (JavaScript Object Notation) Data: Data presented using key-value pairs organized in a hierarchical structure.

    • XML (eXtensible Markup Language) Data: Data presented using tags in a hierarchical structure suitable for complex and nested relationships.
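To make the three text formats concrete, the sketch below expresses one record in CSV, JSON and XML and reads each with Python's standard library. The record itself (a customer ID and a rating) and the tag and key names are hypothetical choices for this example.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# The same hypothetical record expressed in three structured formats.
csv_text = "customer_id,rating\n67890,2\n"
json_text = '{"customer_id": "67890", "rating": 2}'
xml_text = "<customer><customer_id>67890</customer_id><rating>2</rating></customer>"

# CSV: the header row names the columns; each later line is one record.
csv_record = next(csv.DictReader(io.StringIO(csv_text)))

# JSON: key-value pairs, which can nest to form a hierarchy.
json_record = json.loads(json_text)

# XML: tags nested inside a root element.
root = ET.fromstring(xml_text)
xml_record = {child.tag: child.text for child in root}

print(repr(csv_record["rating"]))   # '2' (CSV values are read as strings)
print(repr(json_record["rating"]))  # 2 (JSON preserves the number type)
print(repr(xml_record["rating"]))   # '2' (XML element text is a string)
```

One practical difference shows up in the output: JSON carries basic types such as numbers, while CSV and XML values arrive as text and need converting before numerical analysis.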

    Unstructured Data

    • Unstructured data lacks a predefined data model or format.
    • It's less easily organized into tables than structured data.
    • It comes in diverse formats like text documents, images, audio files, videos, emails and social media posts.

    Unstructured Data Examples

    • Text Data: Textual information from various sources such as articles, books, emails, social media posts and customer reviews.
    • Image Data: Photographs, diagrams, satellite images and other visual data.
    • Audio Data: Recordings of voice conversations, music, podcasts and other audio sources.
    • Video Data: Moving images recorded by cameras or stored as video files.
    • Social Media Feeds: Data from social media platforms including posts, comments and multimedia content.
    • Web Pages: Content from websites, articles, blogs and forum posts.
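Because unstructured text has no predefined fields, analysis usually begins by imposing some structure on it. The sketch below is a deliberately minimal illustration that splits a review into words and counts them; the review text is hypothetical, and real text analytics involves far more than word counts.

```python
import re
from collections import Counter

# Hypothetical customer review: free text with no predefined fields.
review = "Great product. The product arrived early, and the service was great."

# Impose a minimal structure: lowercase the text and split it into words.
words = re.findall(r"[a-z']+", review.lower())
word_counts = Counter(words)

print(word_counts["product"])  # 2
print(word_counts["great"])    # 2
```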

    Structured vs. Unstructured Data Comparison

    • Structured data is readily displayed in rows and columns of relational databases.
    • Structured data comprises numbers, dates and strings. It represents approximately 20% of enterprise data.
    • Structured data utilizes less storage and offers easier management and protection using legacy solutions.
    • Unstructured data can't be displayed in rows and columns as it lacks a defined format.
    • Unstructured data includes images, audio, video, emails and social media posts. It represents approximately 80% of enterprise data.
    • Unstructured data requires more storage and is more complex to manage and protect using legacy solutions.

    Big Data

    • Big data is larger and more complex than traditional data.
    • It is characterized by its variety (text, images, audio, etc.), velocity (real-time retrieval and computation) and volume (measured in terabytes, petabytes and exabytes). Data at this scale is typically distributed across a network of computers.

    Quantitative vs Qualitative Data

    • Quantitative data includes numerical measurements. It's expressed using numbers.
    • Qualitative data relies on observations and descriptions. It's not easily expressed in numbers.

    Quantitative Data

    • Nature: Numerical data consisting of measurable values.
    • Measurement: Obtained using precise, standardized procedures, with units of measurement included (e.g., kilograms).
    • Analysis: Statistical measures such as mean, median and standard deviation, and visualization methods such as histograms, bar graphs, line graphs and scatter plots.
    • Example: Age (25 years), weight (70kg), revenue ($5,000) or quantity (100 units).
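The statistical measures named above can be computed directly with Python's standard library. The sketch below uses a hypothetical list of seven customer ages, chosen so that the average works out to 35 years, the figure used as an example earlier in these notes.

```python
from statistics import mean, median, stdev

# Hypothetical ages (in years) for seven customers.
ages = [22, 25, 31, 35, 38, 44, 50]

average_age = mean(ages)    # equals 35
middle_age = median(ages)   # equals 35 (the 4th of 7 sorted values)
spread = stdev(ages)        # sample standard deviation, equals 10.0

print(average_age, middle_age, spread)
```

The mean and median describe the center of the data, and the standard deviation describes how widely the ages are spread around that center.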

    Qualitative Data

    • Nature: Descriptive, focusing on qualities and characteristics. Non-numerical, relying on categorical or textual information.
    • Measurement: Subjective, based on insights, opinions, attitudes and often presented in a narrative format. Categorized into groups or classes.
    • Analysis: Narrative analysis, involving the examination of stories, texts and other qualitative data for valuable insights.
    • Example: Marital status (Married), customer feedback ("Satisfied"), or colors (red, blue, green).
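Qualitative values cannot be averaged, but they can be grouped into categories and counted. The sketch below tallies a hypothetical list of customer feedback labels; the labels and their frequencies are invented for illustration.

```python
from collections import Counter

# Hypothetical customer feedback labels (qualitative, categorical data).
feedback = ["Satisfied", "Neutral", "Satisfied", "Dissatisfied", "Satisfied"]

# Group identical labels and count how often each one occurs.
counts = Counter(feedback)
most_common_label, frequency = counts.most_common(1)[0]

print(counts["Satisfied"])  # 3
print(most_common_label)    # Satisfied
```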

    Description

    This quiz covers the foundational concepts of data science, including the role of data, definitions, and specific types of data such as raw facts and figures. Test your understanding of how data is collected, analyzed, and utilized to derive insights and inform decisions in various applications.