Introduction to Data Science
36 Questions

Questions and Answers

Which of the following is NOT considered data?

  • Conclusions (correct)
  • Observations
  • Figures
  • Raw facts

Data can only be obtained from scientific experiments.

False (B)

Why is data important in data science?

Data is important because it serves as the foundation for analysis, leading to insights and informed decisions.

A ______ typically refers to a visual representation of data.

figure

What percentage of users prefer using a mobile app for online shopping according to the survey?

80% (C)

Observations are considered raw data and serve as the basis for analysis.

True (A)

Give an example of a figure that is used in data analysis.

A scatter plot showing the relationship between study hours and exam scores.

Match the types of data with their definitions:

  • Raw facts = Specific pieces of information derived from analysis
  • Figures = Visual representations of data
  • Observations = Specific instances or data points collected
  • Surveys = Method for gathering user preferences or behavioral data

Which customer had the highest satisfaction rating?

Customer ID: 23456 (C)

Structured data is easily searchable and analyzable by algorithms.

True (A)

What comment did Customer ID 12345 make about their purchase?

The product arrived earlier than expected. Very satisfied with the service.

Customer ID 67890 rated their satisfaction as a _____ on a scale of 1 to 5.

2

Match the following types of data with their characteristics:

  • Structured Data = Organized in a specific format
  • Unstructured Data = Not organized and difficult to analyze
  • Numerical Data = Represented in numbers
  • Text Data = Composed of words and phrases

Which of the following is NOT an example of data mentioned?

Every product sold (A)

All observations from customers can help identify patterns in customer satisfaction.

True (A)

What specific complaint did Customer ID 67890 mention?

The product quality was not as expected.

Which of the following is an example of structured data?

Databases (B)

Unstructured data has a pre-defined data model.

False (B)

What data format uses keys and values and can represent a hierarchical structure?

JSON

A _______________ is a plain text file format where values are separated by commas.

CSV

Match the following data types with their characteristics:

  • SQL = Structured data storage
  • Hadoop = Distributed storage and processing of large datasets
  • NLP = Extracting insights from text
  • Images = Visual data type

Which type of data often requires unstructured data analytics programs for processing?

Text documents (C)

All structured data can be easily organized into a tabular format.

True (A)

Name one type of unstructured data used in data science.

Text data

Which of the following is NOT an example of unstructured data?

Sales revenue data (C)

Big data is characterized by higher volume, variety, and velocity compared to traditional data.

True (A)

What technique is used to analyze unstructured video data?

Video analysis techniques such as object detection and motion tracking

Quantitative data is often measured in ______.

numbers

Match the types of unstructured data with their analysis techniques:

  • Audio Data = Speech recognition
  • Video Data = Object detection
  • Social Media Data = Social media analytics
  • Web Data = Web scraping

Which method is commonly applied to extract information from unstructured web data?

Web scraping (A)

Qualitative data is primarily numerical and can be counted.

False (B)

List one example of quantitative data.

Age, weight, revenue, or sales numbers.

What is the primary purpose of quantitative data?

To perform numerical calculations and statistical inferences (A)

Qualitative data is often based on numerical values and can be easily measured.

False (B)

Name one method of collecting qualitative data.

Interviews

Qualitative data often appears in __________ form.

narrative

Flashcards

Data in Data Science

Raw facts, figures, observations, or information used to gain insights, make decisions, and support various applications.

Raw Fact (Data Science)

Specific information derived from analyzing data. Examples include average age or sales increases.

Figure (Data)

Visual representation of data, illustrating patterns, trends, or relationships.

Observation (Data)

Specific data point collected or recorded during research or an experiment.

Data Sources

Diverse places where data comes from, including sensors, surveys, social media, business transactions, and experiments.

Examples of Raw Facts

Specific, numerical results derived from data analysis. For example, average ages or percentage changes in sales.

Data Analysis Purpose

Gaining insights, making decisions, and supporting various applications.

Customer Observation

A record of customer feedback, including details like customer ID, purchase date, satisfaction rating, and comments.

Satisfaction Rating

A numerical score (e.g., 1 to 5) indicating a customer's level of satisfaction with a product or service.

Structured Data

Data organized in a format machines can easily process (e.g., tables, databases).

Observation

A single piece of data about a customer or product, in the context of customer feedback.

Data Science

The study of extracting useful information from data.

Customer ID

A unique identifier for each customer.

Date of Purchase

The date when a customer made a purchase.

Unstructured Data

Data that isn't organized in a specific format (e.g., text, images, audio).

Example of Structured Data: Databases

Data stored in tables with rows and columns. Each column represents an attribute (e.g., name, age), and each row is a record.

Example of Structured Data: Spreadsheets

Data arranged in rows and columns, often in software such as Excel.

Example of Structured Data: CSV

Data in plain text with values separated by commas. Each line is a record.

Example of Structured Data: JSON

Data using key-value pairs that can be nested.

Example of Structured Data: XML

Data structured using tags, which can contain nested information.

Unstructured Data Example: Text Data

Textual data from various sources like social media, emails, and more.

Qualitative Data

Information that cannot be easily counted or measured with numbers. It's often collected through interviews, surveys, or observations and is expressed in words, descriptions, or narratives.

Audio Data

Recordings of sound, voice conversations, music, etc. Requires specialized processing, such as speech recognition, before analysis.

Quantitative Data

Information that can be measured and expressed numerically. It's used to make calculations and statistical inferences.

Descriptive Data

A type of qualitative data that describes qualities or characteristics, often using words or categories.

Video Data

Moving images, scenes from cameras or videos. Analysis involves object detection and motion tracking.

Subjective Data

Qualitative data that is based on personal opinions, perceptions, or judgments. It's not always objective or factual.

Social Media Feeds

Data from social media platforms including posts, comments, and media like images.

Web Pages

Content from websites, blogs, or forums, often needing extraction tools to analyze.

Narrative Analysis

A way to analyze qualitative data by examining stories, texts, or narratives to uncover underlying meanings, themes, or perspectives.

Quantitative Data

Data that can be counted or measured using numbers, usually recorded with units.

Examples of Quantitative Data

Numerical data that includes age, weight, or revenue, etc., usually involving specific units.

Quantitative Data Measurement

Precise measurement of numerical data using standardized methods with units.

Study Notes

Introduction to Data Science

  • Data science relies on data as the foundation for analysis.
  • Data, in the context of data science, encompasses raw facts, figures, observations, and information.
  • Data is collected, stored and analyzed to gain insights, support informed decision-making and enhance various applications.
  • Diverse sources include sensors, surveys, social media, business transactions and scientific experiments.
  • Data examples: the average customer age in a dataset is 35; product sales increased 20% over the previous quarter; 80% of users prefer a mobile app for online shopping.

What is Data: Raw Facts

  • Raw facts are specific pieces of information deduced from data analysis.
  • Examples of raw facts: the average age of customers within a dataset is 35 years old; product sales have increased by 20% compared to the previous quarter; 80% of users prefer mobile apps for online shopping.
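
As a minimal sketch of how figures like those above could be produced, the Python snippet below computes an average age and a quarter-over-quarter sales change. The ages, the sales totals and the `percent_change` helper are made-up illustrations, not data or code from the lesson.

```python
from statistics import mean

# Hypothetical values chosen for illustration only.
customer_ages = [28, 35, 42, 31, 39]
previous_quarter_sales = 50_000
current_quarter_sales = 60_000


def percent_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) * 100 / old


average_age = mean(customer_ages)
sales_change = percent_change(previous_quarter_sales, current_quarter_sales)

print(f"Average customer age: {average_age}")
print(f"Sales change vs. previous quarter: {sales_change:.0f}%")
```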

What is Data: Figures

  • Figures are usually visual representations of data that illustrate patterns, trends and relationships within a dataset.
  • Example: a scatter plot used to illustrate the relationship between study hours and exam scores.
  • Scatter plots visually show the connection between these two factors.
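
The relationship a scatter plot shows can also be summarized numerically. The sketch below, using made-up study-hour and exam-score pairs, builds the list of points a plot would draw and computes Pearson's correlation coefficient by hand. The data and the `pearson_r` helper are assumptions for illustration; a plotting library would be needed to draw the actual figure.

```python
from math import sqrt

# Hypothetical (study_hours, exam_score) data, for illustration only.
study_hours = [1, 2, 3, 4, 5, 6]
exam_scores = [52, 58, 65, 70, 74, 83]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Each (x, y) pair is one point a scatter plot would draw.
points = list(zip(study_hours, exam_scores))
r = pearson_r(study_hours, exam_scores)

print(points)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 corresponds to points that rise together along a roughly straight line.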

What is Data: Observations

  • Observations refer to specific instances or data points gathered from experiments, surveys, studies or other research activities.
  • Observations form the bedrock for data analysis.
  • Example: Observations related to customer satisfaction surveys contain customer ID, date of purchase, satisfaction rating (on a scale of 1-5), and comments.
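
One way to picture a single survey observation is as a small record with the four fields named above. The sketch below is an assumption about how such a record might be modeled in Python; the class name, field names and sample records are hypothetical and are not the lesson's dataset.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CustomerObservation:
    """One survey response (hypothetical structure)."""

    customer_id: int
    purchase_date: date
    satisfaction: int  # rating on a 1-5 scale
    comment: str

    def __post_init__(self):
        # Reject ratings outside the 1-5 scale.
        if not 1 <= self.satisfaction <= 5:
            raise ValueError("satisfaction must be between 1 and 5")


# Made-up records for illustration only.
observations = [
    CustomerObservation(1001, date(2024, 1, 15), 5, "Arrived early."),
    CustomerObservation(1002, date(2024, 1, 20), 2, "Quality was below expectations."),
    CustomerObservation(1003, date(2024, 2, 2), 4, "Works as described."),
]

# Find the observation with the highest satisfaction rating.
highest = max(observations, key=lambda o: o.satisfaction)
print(highest.customer_id, highest.satisfaction)
```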

Data Sources

  • Data can come from various sources, including computers, mobile devices, cameras, sensors, wearables, social media interactions, saved files, photos and queries.
  • Even simple actions like getting directions generate data.

Types of Data

  • Data exists in numerous forms: numbers, text, images, audio, video and more.
  • In data science, data types are categorized as structured and unstructured.

Structured Data

  • Structured data is organized in a defined format that algorithms can process easily.
  • This format facilitates searching and analysis for both humans and machines.
  • Structured data is suitable for quantitative analysis.
  • It's often tabular, employing rows and columns that explicitly define data attributes.
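
To make the rows-and-columns idea concrete, the sketch below creates a small table and queries it. It uses Python's built-in `sqlite3` module with an in-memory database purely as a convenient stand-in for a relational database; the table name, column names and rows are hypothetical.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Columns define the attributes; each inserted tuple is one row (record).
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO customers (id, name, age) VALUES (?, ?, ?)",
    [(1, "Ana", 29), (2, "Ben", 41), (3, "Chloe", 35)],
)

# Because the format is defined, searching is a one-line query.
rows = conn.execute(
    "SELECT name, age FROM customers WHERE age > ? ORDER BY age", (30,)
).fetchall()
conn.close()

print(rows)
```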

Structured Data Examples

  • Databases: Information stored in relational databases, organized as tables with rows and columns. Examples: MySQL, PostgreSQL.

  • Spreadsheets: Data arranged in rows and columns in spreadsheets such as Microsoft Excel or Google Sheets.

  • CSV (Comma-Separated Values) Files: Plain text files where values are separated by commas and each line represents a new record.

  • JSON (JavaScript Object Notation) Data: Data presented using key-value pairs organized in a hierarchical structure.

  • XML (eXtensible Markup Language) Data: Data presented using tags in a hierarchical structure suitable for complex and nested relationships.
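
The three text formats above can carry the same records. The sketch below encodes two hypothetical customers in CSV, JSON and XML and reads each with Python's standard library; the field names and values are assumptions chosen for illustration.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# The same two made-up records in three formats.
csv_text = "id,name,age\n1,Ana,29\n2,Ben,41\n"
json_text = (
    '{"customers": [{"id": 1, "name": "Ana", "age": 29},'
    ' {"id": 2, "name": "Ben", "age": 41}]}'
)
xml_text = (
    "<customers>"
    "<customer id='1'><name>Ana</name><age>29</age></customer>"
    "<customer id='2'><name>Ben</name><age>41</age></customer>"
    "</customers>"
)

# CSV: each line is a record; DictReader maps header names to string values.
csv_names = [row["name"] for row in csv.DictReader(io.StringIO(csv_text))]

# JSON: nested key-value pairs; numbers stay numbers after parsing.
json_names = [c["name"] for c in json.loads(json_text)["customers"]]

# XML: tags nest to form a tree; text is read from child elements.
xml_names = [c.findtext("name") for c in ET.fromstring(xml_text).findall("customer")]

print(csv_names, json_names, xml_names)
```

Note that CSV values arrive as strings, while JSON preserves the distinction between numbers and text.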

Unstructured Data

  • Unstructured data lacks a predefined data model or format.
  • It's less easily organized into tables than structured data.
  • It comes in diverse formats like text documents, images, audio files, videos, emails and social media posts.

Unstructured Data Examples

  • Text Data: Textual information from various sources such as articles, books, emails, social media posts and customer reviews.
  • Image Data: Photographs, diagrams, satellite images and other visual data.
  • Audio Data: Recordings of voice conversations, music, podcasts and other audio sources.
  • Video Data: Moving images captured by cameras or videos.
  • Social Media Feeds: Data from social media platforms including posts, comments and multimedia content.
  • Web Pages: Content from websites, articles, blogs and forum posts.
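
Unstructured text usually has to be given some structure before it can be analyzed. As a rough sketch of that first step, the snippet below turns a few made-up customer reviews into word counts. The reviews, the stop-word list and the `tokenize` helper are hypothetical, and real text analysis typically involves far more than counting words.

```python
import re
from collections import Counter

# Hypothetical customer reviews, for illustration only.
reviews = [
    "Fast delivery and great service.",
    "Product quality was poor, but delivery was fast.",
    "Great quality, great price!",
]

# Small, hand-picked stop-word list (an assumption, not a standard list).
STOPWORDS = {"and", "but", "was", "the", "a"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


# Count every non-stop-word across all reviews.
counts = Counter(
    word
    for review in reviews
    for word in tokenize(review)
    if word not in STOPWORDS
)

print(counts.most_common(3))
```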

Structured vs. Unstructured Data Comparison

  • Structured data is readily displayed in the rows and columns of relational databases.
  • Structured data comprises numbers, dates and strings. It is commonly estimated to make up roughly 20% of enterprise data.
  • Structured data typically uses less storage and is easier to manage and protect with legacy solutions.
  • Unstructured data can't readily be displayed in rows and columns because it lacks a defined format.
  • Unstructured data includes images, audio, video, emails and text documents. It is commonly estimated to make up roughly 80% of enterprise data.
  • Unstructured data typically requires more storage and is more complex to manage and protect with legacy solutions.

Big Data

  • Big data refers to datasets that are larger and more complex than traditional data.
  • It's characterized by variety (text, images, audio, etc.), velocity (real-time retrieval and computation) and volume (measured in terabytes, petabytes and exabytes). Data at this scale is typically distributed across a network of computers.

Quantitative vs Qualitative Data

  • Quantitative data includes numerical measurements. It's expressed using numbers.
  • Qualitative data relies on observations and descriptions. It's not easily expressed in numbers.

Quantitative Data

  • Nature: Numerical data consisting of measurable values.
  • Measurement: Using precise and standardized procedures. Units of measurement are included (e.g., kilograms).
  • Analysis: Statistical measures such as mean, median and standard deviation, and visualizations such as histograms, bar graphs, line graphs and scatter plots.
  • Example: Age (25 years), weight (70 kg), revenue ($5,000) or quantity (100 units).
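
The statistical measures listed above are available in Python's standard `statistics` module. The sketch below applies them to a small set of made-up weights; the values and the `_kg` naming, which keeps the unit visible, are illustrative assumptions.

```python
from statistics import mean, median, stdev

# Hypothetical weights in kilograms, for illustration only.
weights_kg = [62.5, 70.0, 81.2, 68.4, 74.9]

summary = {
    "mean_kg": mean(weights_kg),
    "median_kg": median(weights_kg),
    "stdev_kg": stdev(weights_kg),  # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```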

Qualitative Data

  • Nature: Descriptive, focusing on qualities and characteristics. Non-numerical, relying on categorical or textual information.
  • Measurement: Subjective, based on insights, opinions, attitudes and often presented in a narrative format. Categorized into groups or classes.
  • Analysis: Narrative analysis, involving the examination of stories, texts and other qualitative data for valuable insights.
  • Example: Marital status (Married), customer feedback ("Satisfied"), or colors (red, blue, green).
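
Qualitative values that fall into categories can still be tallied. The sketch below counts made-up feedback labels and computes each label's share. This is simple frequency counting, not the narrative analysis described above, and the labels are hypothetical.

```python
from collections import Counter

# Hypothetical categorical feedback, for illustration only.
feedback = ["Satisfied", "Neutral", "Satisfied", "Dissatisfied", "Satisfied"]

counts = Counter(feedback)
total = len(feedback)

# Share of responses per category, as a fraction of all responses.
shares = {label: count / total for label, count in counts.items()}

most_common_label, _ = counts.most_common(1)[0]
print(most_common_label, shares)
```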

Description

This quiz covers the foundational concepts of data science, including the role of data, definitions, and specific types of data such as raw facts and figures. Test your understanding of how data is collected, analyzed, and utilized to derive insights and inform decisions in various applications.
