Artificial Intelligence - Data Literacy (Unit 2) PDF

Summary

This document is a unit on data literacy, part of a larger Artificial Intelligence curriculum. It covers fundamental concepts about data and its use in AI. It also includes definitions, best practices, and examples.

Full Transcript

DELHI PUBLIC SCHOOL BANGALORE EAST
ARTIFICIAL INTELLIGENCE
UNIT 2 - DATA LITERACY

What is data literacy?
Data literacy means knowing how to understand, work with, and talk about data. It is about being able to collect, analyze, and show data in ways that make sense. All data tells a story, but we must be careful before believing the story. Data literacy is essential because it enables individuals to make informed decisions, think critically, solve problems, and innovate.

Data Pyramid
The Data Pyramid is made of the different stages of working with data. Moving up from the bottom:
▪ Data is available in a raw form. Data in this form is not very useful.
▪ Data is processed to give us information about the world.
▪ Information about the world leads to knowledge of how things are happening.
▪ Wisdom allows us to understand why things are happening in a particular way.

What is Data Security and Privacy? How are they related to AI?

Data Privacy
Data privacy, also referred to as information privacy, is concerned with the proper handling of sensitive data, including personal data and other confidential data such as certain financial data and intellectual property data, in order to meet regulatory requirements and to protect the confidentiality and immutability of the data.

The following best practices can help you ensure data privacy:
▪ Understand what data you have collected, how it is handled, and where it is stored.
▪ Collect only the data that is necessary for a project.
▪ Treat user consent during data collection as a matter of utmost importance.

Data Security
Data security is the practice of protecting digital information from unauthorized access, corruption, or theft throughout its entire lifecycle. The main reasons why data security is more important now are:
▪ Cyber-attacks affect everyone.
▪ Rapid technological change will lead to a surge in cyber-attacks.
Data Security vs Data Privacy

The following best practices can help you ensure data privacy:
▪ Knowledge about the data: Understand what data you have, how it is handled, and where it is stored. How the data is collected and acted upon should be clear.
▪ Minimize data collection: Only the data necessary for the project should be collected.
▪ Be transparent with your data: User consent during data collection is of utmost importance. Users should also be informed where and why their data is collected, and should have the option to modify their data or opt out of data collection.

Best Practices for Cyber Security
Cyber security involves protecting computers, servers, mobile devices, electronic systems, networks, and data from harmful attacks.

Do’s
▪ Use strong, unique passwords with a mix of characters for each account.
▪ Activate Two-Factor Authentication (2FA) for added security.
▪ Download software from trusted sources and scan files before opening.
▪ Prioritize websites with "https://" for secure logins.
▪ Keep your browser, OS, and antivirus updated regularly.
▪ Adjust social media privacy settings to limit visibility to close contacts.
▪ Always lock your screen when away.
▪ Connect only with trusted individuals online.
▪ Use secure Wi-Fi networks.
▪ Report online bullying to a trusted adult immediately.

Don’ts
▪ Don't share personal info such as your real name or phone number.
▪ Don't send pictures to strangers or post them on social media.
▪ Don't open emails or attachments from unknown sources.
▪ Don't respond to suspicious requests for personal info such as bank account details.
▪ Don't reveal your passwords or the answers to your security questions.
▪ Don't copy copyrighted software without permission.
▪ Don't cyberbully or use offensive language online.
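The first "Do" above (strong, unique passwords with a mix of characters) can be illustrated with a short Python sketch. This is only one possible approach, not something prescribed by these notes: the function name generate_password and the default length of 16 are assumptions made for the example. It relies on Python's standard secrets module, which is intended for security-sensitive random choices.

```python
import secrets
import string


def generate_password(length: int = 16) -> str:
    """Return a random password mixing lowercase, uppercase, digits and symbols.

    The name and default length are illustrative assumptions.
    """
    classes = [
        string.ascii_lowercase,
        string.ascii_uppercase,
        string.digits,
        string.punctuation,
    ]
    if length < len(classes):
        raise ValueError("length must be at least 4 to include every character type")

    # Guarantee at least one character from each character type.
    chars = [secrets.choice(group) for group in classes]

    # Fill the remaining positions from the combined alphabet.
    alphabet = "".join(classes)
    chars += [secrets.choice(alphabet) for _ in range(length - len(classes))]

    # Shuffle so the guaranteed characters do not sit in fixed positions.
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)


if __name__ == "__main__":
    # Generate a separate password for each account rather than reusing one.
    print(generate_password())
```

In practice a password manager does this job for you; the sketch only shows what "a mix of characters" means in concrete terms.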
Types of Data

Textual Data (Qualitative Data)
▪ It is made up of words and phrases.
▪ It is used for Natural Language Processing (NLP).
▪ Search queries on the internet are an example of textual data.
▪ Example: “Which is a good park nearby?”

Numeric Data (Quantitative Data)
▪ It is made up of numbers.
▪ It is used for Statistical Data.
▪ Any measurements, readings, or values count as numeric data.
▪ Example: Cricket Score, Restaurant Bill

Types of numeric data:
▪ Continuous data is numeric data that can take any value within a range, including fractional values. E.g., height, weight, temperature, voltage.
▪ Discrete data is numeric data that contains only whole numbers and cannot be fractional. E.g., the number of students in a class: it can only be a whole number, not a decimal.

Types of Data used in three domains of AI:

What is data acquisition?
Data acquisition, also known as acquiring data, refers to the procedure of gathering data. This involves searching for datasets suitable for training AI models. The process typically comprises three key steps:

Best Practices for Acquiring Data

Data acquisition from websites

Ethical concerns in data acquisition

Features of Data and Data Preprocessing

Usability of Data
There are three primary factors determining the usability of data:
1. Structure: Defines how data is stored.
2. Cleanliness: Clean data is free from duplicates, missing values, outliers, and other anomalies that may affect its reliability and usefulness for analysis. In this particular example, duplicate values are removed after cleaning the data.
3. Accuracy: Accuracy indicates how well the data matches real-world values, ensuring reliability. Accurate data closely reflects actual values without errors, enhancing the quality and trustworthiness of the dataset. In this particular example, we are comparing data gathered from measuring the length of a small box in centimeters.

Data features
Data features are the characteristics or properties of the data. They describe each piece of information in a dataset.
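The cleanliness factor and the idea of data features described above can be sketched in a few lines of Python. The records, the feature names (name, height_cm, siblings) and the helper clean are all hypothetical, chosen only for illustration. Here height_cm is continuous data and siblings is discrete data.

```python
# Hypothetical dataset: each dictionary is one record,
# and each key ("name", "height_cm", "siblings") is a feature.
records = [
    {"name": "Asha", "height_cm": 151.5, "siblings": 1},
    {"name": "Ravi", "height_cm": 148.0, "siblings": 2},
    {"name": "Asha", "height_cm": 151.5, "siblings": 1},   # duplicate record
    {"name": "Meera", "height_cm": None, "siblings": 0},   # missing value
]


def clean(rows):
    """Return the rows with duplicates and rows containing missing values removed."""
    seen = set()
    cleaned = []
    for row in rows:
        # Skip records that have a missing value in any feature.
        if any(value is None for value in row.values()):
            continue
        # A sorted tuple of (feature, value) pairs identifies identical records.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


cleaned = clean(records)
features = list(cleaned[0].keys())

print(len(records), "records before cleaning,", len(cleaned), "after")
print("Features:", features)
```

This sketch drops incomplete records for simplicity; depending on the project, missing values might instead be filled in or checked against the original source.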
In AI models, we need two types of features: independent and dependent. Independent features are the input to the model: they are the information we provide to make predictions. Dependent features, on the other hand, are the outputs or results of the model: they are what we are trying to predict.

Data Processing
▪ Data processing helps computers understand raw data.
▪ Data processing includes the use of computers to perform different operations on data.

Data Interpretation
▪ It is the process of making sense of data that has been processed.
▪ The interpretation of data helps us answer critical questions using data.

Keywords related to data
▪ Acquire Data: Acquiring data means collecting data from various data sources.
▪ Data Processing: After raw data is collected, it is processed to derive meaningful information from it.
▪ Data Analysis: Data analysis means examining each component of the data in order to draw conclusions.
▪ Data Interpretation: This means being able to explain what these findings or conclusions mean in a given context.
▪ Data Presentation: In this step, you select, organize, and group ideas and evidence in a logical way.

Methods of Data Interpretation
Based on the two types of data, there are two ways to interpret data.

Quantitative Data Interpretation
Quantitative data interpretation is performed on numerical data. It helps us answer questions like “when,” “how many,” and “how often”. For example: (how many) the number of likes on an Instagram post.

Data Collection Methods - Quantitative Data Interpretation
▪ Interviews: Quantitative interviews play a key role in collecting information.
▪ Polls: A poll is a type of survey that asks simple questions to respondents. Polls are usually limited to one question.
▪ Observations: Quantitative data can be collected through observations in a particular time period.
▪ Longitudinal Studies: A type of study conducted over a long time.
▪ Survey: Surveys can be conducted for a large number of people to collect quantitative data.
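As a small illustration of quantitative data interpretation, the Python sketch below answers "how many" and "how often" questions. The like counts and poll answers are made-up values used only for this example; the code uses the standard collections and statistics modules.

```python
from collections import Counter
from statistics import mean

# Hypothetical number of likes on five Instagram posts.
likes = [120, 95, 120, 143, 87]

# Hypothetical answers to a one-question poll.
poll_answers = ["yes", "no", "yes", "yes", "no"]

# "How many" likes were received in total, and on average per post?
total_likes = sum(likes)
average_likes = mean(likes)

# "How often" was each poll answer chosen?
answer_counts = Counter(poll_answers)
most_common_answer, times_chosen = answer_counts.most_common(1)[0]

print("Total likes:", total_likes)            # 565
print("Average likes per post:", average_likes)  # 113
print("Most common answer:", most_common_answer, "chosen", times_chosen, "times")
```

Totals, averages, and counts like these are the kind of summary that is then shown in a table or graph when the data is presented.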
4 Steps to Quantitative Data Analysis
1. Relate measurement scales with variables
2. Connect descriptive statistics with data
3. Decide a measurement scale
4. Represent data in an appropriate format

Qualitative Data Interpretation
Qualitative data tells us about the emotions and feelings of people. Qualitative data interpretation is focused on the insights and motivations of people.

Data Collection Methods - Qualitative Data Interpretation
▪ Record keeping: This method uses existing reliable documents and other similar sources of information as the data source. It is similar to going to a library.
▪ Observation: In this method, the participants (their behavior and emotions) are observed carefully.
▪ Case Studies: In this method, data is collected from case studies.
▪ Focus groups: In this method, data is collected from a group discussion on a relevant topic.
▪ Longitudinal Studies: This data collection method is performed on the same data source repeatedly over an extended period.
▪ One-to-One Interviews: In this method, data is collected using a one-to-one interview.

5 Steps to Qualitative Data Analysis
1. Collect Data
2. Organize
3. Set a code to the Data Collected
4. Analyze your data
5. Reporting

Qualitative v/s Quantitative data analysis

Types of Data Interpretation

1. Textual DI
▪ The data is presented in text form, usually in a paragraph.
▪ It is used when the data is not large and can be easily comprehended by reading.
▪ Textual presentation is not suitable for large data.

2. Tabular DI
▪ Data is represented systematically in the form of rows and columns.
▪ The title of the table (Item of Expenditure) describes the table content.
▪ The column headings (Year; Salary; Fuel and Transport; Bonus; Interest on Loans; Taxes) describe the information contained in the columns.

3. Graphical DI
- Bar Graphs: Data is represented using vertical and horizontal bars.
- Pie Charts: A pie chart is a circular chart divided into sections (think of a cake cut into slices). Each slice represents the portion of the entire pie allocated to a category, and the size of each slice is proportional to the corresponding value.
- Line Graphs: A line graph is created by connecting various data points. It shows the change in quantity over time.

Importance of Data Interpretation

Q1) Multiple choice questions:

1. Cultivating Data Literacy means:
   a) Utilize vocabulary and analytical skills  b) Acquire, develop, and improve data literacy skills  c) Develop skills in statistical methodologies  d) Develop skills in Math

2. Data Privacy and Data Security are often used interchangeably but they are different from each other.
   a) True  b) False

3. The _____________ provides guidance on using data efficiently and with all levels of awareness.
   a) data security framework  b) data literacy framework  c) data privacy framework  d) data acquisition framework

4. _____________ allows us to understand why things are happening in a particular way.
   a) data  b) information  c) knowledge  d) wisdom

5. _____________ is the practice of protecting digital information from unauthorized access, corruption, or theft throughout its entire lifecycle.
   a) data security  b) data literacy  c) data privacy  d) data acquisition

6. _____________ means knowing how to understand, work with, and talk about data.
   a) Data  b) Literacy  c) Data Literacy  d) None of the above

7. Which among these is not a type of data interpretation?
   a) Textual  b) Tabular  c) Graphical  d) Raw data

8. _____________ relates to the manipulation of data to produce meaningful insights.
   a) Data Processing  b) Data Interpretation  c) Data Analysis  d) Data Presentation

Q2) Match the following:

1. “Which is a good park nearby?”      a. Pie Chart
2. Cricket Score                        b. Examine each component of the data in order to draw conclusions.
3. Graphical DI                         c. Select, organize, and group ideas and evidence in a logical way.
4. Data Analysis                        d. Quantitative Data
5. Data Presentation                    e. Qualitative Data

Answers: 1-e, 2-d, 3-a, 4-b, 5-c

Q3) Draw a neat, labelled Data Pyramid diagram.

Q4) Answer the following:

1. Explain the data literacy process framework.
The data literacy framework provides guidance on using data efficiently and with all levels of awareness. The data literacy framework is an iterative process.

2. Mention any 4 do’s for protecting computers, servers, mobile devices, electronic systems, networks, and data from harmful attacks.
▪ Use strong, unique passwords with a mix of characters for each account.
▪ Activate Two-Factor Authentication (2FA) for added security.
▪ Download software from trusted sources and scan files before opening.
▪ Prioritize websites with "https://" for secure logins.
▪ Keep your browser, OS, and antivirus updated regularly.
▪ Adjust social media privacy settings to limit visibility to close contacts.
▪ Always lock your screen when away.
▪ Connect only with trusted individuals online.
▪ Use secure Wi-Fi networks.
▪ Report online bullying to a trusted adult immediately.

3. Explain the types of numeric data.
Numeric data is further classified as:
▪ Continuous data is numeric data that can take any value within a range, including fractional values. E.g., height, weight, temperature, voltage.
▪ Discrete data is numeric data that contains only whole numbers and cannot be fractional. E.g., the number of students in a class: it can only be a whole number, not a decimal.

4. Differentiate between Qualitative and Quantitative data analysis.
