Questions and Answers
Which of the following is NOT considered data?
Data can only be obtained from scientific experiments.
False
Why is data important in data science?
Data is important because it serves as the foundation for analysis, leading to insights and informed decisions.
A ______ typically refers to a visual representation of data.
What percentage of users prefer using a mobile app for online shopping according to the survey?
Observations are considered raw data and serve as the basis for analysis.
Give an example of a figure that is used in data analysis.
Match the types of data with their definitions:
Which customer had the highest satisfaction rating?
Structured data is easily searchable and analyzable by algorithms.
What comment did Customer ID 12345 make about their purchase?
Customer ID 67890 rated their satisfaction as a _____ on a scale of 1 to 5.
Match the following types of data with their characteristics:
Which of the following is NOT an example of data mentioned?
All observations from customers can help identify patterns in customer satisfaction.
What specific complaint did Customer ID 67890 mention?
Which of the following is an example of structured data?
Unstructured data has a pre-defined data model.
What data format uses keys and values and can represent a hierarchical structure?
A _______________ is a plain text file format where values are separated by commas.
Match the following data types with their characteristics:
Which type of data often requires unstructured data analytics programs for processing?
All structured data can be easily organized into a tabular format.
Name one type of unstructured data used in data science.
Which of the following is NOT an example of unstructured data?
Big data is characterized by higher volume, variety, and velocity compared to traditional data.
What technique is used to analyze unstructured video data?
Quantitative data is often measured in ______.
Match the types of unstructured data with their analysis techniques:
Which method is commonly applied to extract information from unstructured web data?
Qualitative data is primarily numerical and can be counted.
List one example of quantitative data.
What is the primary purpose of quantitative data?
Qualitative data is often based on numerical values and can be easily measured.
Name one method of collecting qualitative data.
Qualitative data often appears in __________ form.
Study Notes
Introduction to Data Science
- Data science relies on data as the foundation for analysis.
- Data, in the context of data science, encompasses raw facts, figures, observations, and information.
- Data is collected, stored and analyzed to gain insights, support informed decision-making and improve a wide range of applications.
- Diverse sources include sensors, surveys, social media, business transactions and scientific experiments.
- Data examples: the average customer age in a dataset is 35, product sales increased 20% last quarter, and 80% of users prefer a mobile app for online shopping.
What is Data: Raw Facts
- Raw facts are specific pieces of information deduced from data analysis.
- Examples of raw facts: the average age of customers in a dataset is 35 years; product sales have increased by 20% compared with the previous quarter; 80% of users prefer mobile apps for online shopping.
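As a rough illustration of how facts like these are derived, the Python sketch below computes an average age, a quarter-over-quarter sales change and a preference share. Every input value is invented for the example and was chosen only so that the results match the figures quoted above.

```python
from statistics import mean

# Hypothetical inputs, invented for illustration only.
customer_ages = [28, 31, 35, 39, 42]
sales_previous_quarter = 50_000
sales_current_quarter = 60_000
shopping_preferences = ["app", "app", "web", "app", "app"]

# Average age of the customers in the dataset.
average_age = mean(customer_ages)

# Percentage change in sales relative to the previous quarter.
sales_change_pct = (
    (sales_current_quarter - sales_previous_quarter) * 100 / sales_previous_quarter
)

# Share of users who prefer the mobile app, as a percentage.
app_share_pct = shopping_preferences.count("app") * 100 / len(shopping_preferences)

print(f"Average age: {average_age}")
print(f"Sales change: {sales_change_pct:.0f}%")
print(f"Prefer mobile app: {app_share_pct:.0f}%")
```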
What is Data: Figures
- Figures usually represent visualizations of data, illustrating patterns, trends and relationships within a dataset.
- Example: a scatter plot of study hours against exam scores, which shows visually how the two variables relate.
What is Data: Observations
- Observations refer to specific instances or data points gathered from experiments, surveys, studies or other research activities.
- Observations form the bedrock for data analysis.
- Example: Observations related to customer satisfaction surveys contain customer ID, date of purchase, satisfaction rating (on a scale of 1-5), and comments.
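One possible way to represent a survey observation with the four fields listed above is a small Python dataclass. The class name, field names and sample records are assumptions made for this sketch; they do not come from any real survey.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SurveyObservation:
    """One customer satisfaction survey response (hypothetical layout)."""

    customer_id: int
    purchase_date: date
    satisfaction: int  # rating on a scale of 1 to 5
    comment: str

    def __post_init__(self):
        # Reject ratings outside the 1-5 scale described in the notes.
        if not 1 <= self.satisfaction <= 5:
            raise ValueError("satisfaction must be between 1 and 5")


# Invented sample records, for illustration only.
observations = [
    SurveyObservation(1001, date(2024, 1, 15), 5, "Fast delivery"),
    SurveyObservation(1002, date(2024, 1, 16), 2, "Item arrived late"),
]

# Each observation is one data point; analysis works over the whole collection.
highest = max(observations, key=lambda obs: obs.satisfaction)
print(highest.customer_id, highest.comment)
```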
Data Sources
- Data can come from various sources, including computers, mobile devices, cameras, sensors, wearables, social media interactions, saved files, photos and queries.
- Even simple actions like getting directions generate data.
Types of Data
- Data exists in numerous forms: numbers, text, images, audio, video and more.
- In data science, data types are categorized as structured and unstructured.
Structured Data
- Structured data is organized in a defined format that algorithms can process easily.
- This format facilitates searching and analysis for both humans and machines.
- Structured data is suitable for quantitative analysis.
- It's often tabular, employing rows and columns that explicitly define data attributes.
Structured Data Examples
- Databases: Information stored in relational databases, organized as tables with rows and columns. Examples include MySQL and PostgreSQL.
- Spreadsheets: Data arranged in rows and columns in spreadsheets such as Microsoft Excel or Google Sheets.
- CSV (Comma-Separated Values) Files: Plain text files where values are separated by commas and each line represents a new record.
- JSON (JavaScript Object Notation) Data: Data presented using key-value pairs organized in a hierarchical structure.
- XML (eXtensible Markup Language) Data: Data presented using tags in a hierarchical structure suitable for complex and nested relationships.
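To make the three text-based formats concrete, the sketch below reads the same kind of record from CSV, JSON and XML using only Python's standard library. The field names and values are hypothetical and exist only for this example.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data in each format, invented for illustration.
csv_text = "customer_id,rating\n1001,5\n1002,2\n"
json_text = '{"customer": {"id": 1001, "rating": 5}}'
xml_text = "<customer><id>1001</id><rating>5</rating></customer>"

# CSV: each line is a record; the header row supplies the column names.
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

# JSON: key-value pairs that can nest to form a hierarchy.
json_record = json.loads(json_text)

# XML: tags nest to form a hierarchy of elements.
xml_root = ET.fromstring(xml_text)

print(csv_rows[0]["rating"])              # CSV values are read as strings
print(json_record["customer"]["rating"])  # JSON keeps the number type
print(xml_root.find("rating").text)       # XML element text is a string
```

Note that CSV and XML values arrive as text and need explicit conversion before numeric analysis, whereas JSON preserves the distinction between numbers and strings.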
Unstructured Data
- Unstructured data lacks a predefined data model or format.
- It's less easily organized into tables than structured data.
- It comes in diverse formats like text documents, images, audio files, videos, emails and social media posts.
Unstructured Data Examples
- Text Data: Textual information from various sources such as articles, books, emails, social media posts and customer reviews.
- Image Data: Photographs, diagrams, satellite images and other visual data.
- Audio Data: Recordings of voice conversations, music, podcasts and other audio sources.
- Video Data: Moving images, such as camera recordings and video files.
- Social Media Feeds: Data from social media platforms including posts, comments and multimedia content.
- Web Pages: Content from websites, articles, blogs and forum posts.
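Because unstructured text has no fields to query, analysis generally begins by imposing some structure on it. The sketch below is one very simple way to do that, counting word frequencies across free-text reviews. The reviews are made up for the example, and word counting stands in for the far richer techniques used in practice.

```python
import re
from collections import Counter

# Hypothetical free-text customer reviews, invented for illustration.
reviews = [
    "Great product, fast delivery!",
    "Delivery was slow, but the product is great.",
]

# Lowercase each review and split it into words, dropping punctuation.
words = []
for review in reviews:
    words.extend(re.findall(r"[a-z']+", review.lower()))

# The resulting word-to-count mapping is structured and can be tabulated.
word_counts = Counter(words)
print(word_counts.most_common(3))
```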
Structured vs. Unstructured Data Comparison
- Structured data is readily displayed in rows and columns of relational databases.
- Structured data comprises numbers, dates and strings. It represents approximately 20% of enterprise data.
- Structured data utilizes less storage and offers easier management and protection using legacy solutions.
- Unstructured data can't be displayed in rows and columns as it lacks a defined format.
- Unstructured data includes images, audio, video, emails and text documents. It represents approximately 80% of enterprise data.
- Unstructured data requires more storage and is more complex to manage and protect using legacy solutions.
Big Data
- Big data exceeds traditional data in scale and complexity.
- It is characterized by variety (text, images, audio, etc.), velocity (real-time retrieval and computation) and volume (measured in terabytes, petabytes and exabytes). Data at this scale is typically distributed across a network of computers.
Quantitative vs. Qualitative Data
- Quantitative data includes numerical measurements. It's expressed using numbers.
- Qualitative data relies on observations and descriptions. It's not easily expressed in numbers.
Quantitative Data
- Nature: Numerical data consisting of measurable values.
- Measurement: Using precise and standardized procedures. Units of measurement are included (e.g., kilograms).
- Analysis: Statistical methods like mean, median, standard deviation and visualization methods such as histograms, bar graphs, line graphs and scatter plots.
- Example: Age (25 years), weight (70 kg), revenue ($5,000) or quantity (100 units).
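The statistical measures named above can be computed with Python's built-in `statistics` module. This is a minimal sketch; the weights are hypothetical values invented for the example.

```python
from statistics import mean, median, stdev

# Hypothetical weights in kilograms, invented for illustration.
weights_kg = [62.0, 70.0, 70.0, 75.0, 83.0]

average_weight = mean(weights_kg)   # arithmetic mean
middle_weight = median(weights_kg)  # middle value of the sorted data
weight_spread = stdev(weights_kg)   # sample standard deviation

print(f"mean={average_weight:.1f} kg")
print(f"median={middle_weight:.1f} kg")
print(f"stdev={weight_spread:.2f} kg")
```

`stdev` computes the sample standard deviation (dividing by n - 1); `statistics.pstdev` is the population version.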
Qualitative Data
- Nature: Descriptive, focusing on qualities and characteristics. Non-numerical, relying on categorical or textual information.
- Measurement: Subjective, based on insights, opinions and attitudes, and often presented in narrative form. Values can be categorized into groups or classes.
- Analysis: Narrative analysis, involving the examination of stories, texts and other qualitative data for valuable insights.
- Example: Marital status (Married), customer feedback ("Satisfied"), or colors (red, blue, green).
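Since qualitative values can be grouped into classes, one simple way to summarize them is to count how often each category occurs. The sketch below assumes a small set of made-up feedback labels; it complements narrative analysis rather than replacing it.

```python
from collections import Counter

# Hypothetical customer feedback labels, invented for illustration.
feedback = ["Satisfied", "Neutral", "Satisfied", "Dissatisfied", "Satisfied"]

# Count how many responses fall into each category.
category_counts = Counter(feedback)

# The most frequent category and its count.
most_common_label, most_common_count = category_counts.most_common(1)[0]

print(dict(category_counts))
print(most_common_label, most_common_count)
```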
Description
This quiz covers the foundational concepts of data science, including the role of data, definitions, and specific types of data such as raw facts and figures. Test your understanding of how data is collected, analyzed, and utilized to derive insights and inform decisions in various applications.