DSA.pdf
CHAPTER 1
Learning Module in IT Inst 3 – DATA SCIENCE ANALYTICS

INTENDED LEARNING OUTCOMES:
At the end of the lesson, the students are expected to:
- Differentiate Data Science and Data Analytics;
- Explain the data analytics process; and
- Demonstrate proficiency in fundamental Excel skills such as data entry, formatting, formula creation, and use of built-in functions.

LESSON:
I. Overview of Data, Data Science Analytics, and Tools

Data Science vs. Data Analytics
These two terms are often used in the same context, but they have different definitions. Both share the same objective: to extract meaningful insights from data to strengthen business decision making. The main difference lies in the tactics each uses to achieve this.

Data analysts examine large datasets to identify trends, develop forecasts, and create data visualizations that tell a compelling story through actionable insights. These insights help stakeholders make informed decisions according to business needs.

Data scientists are tasked with designing and constructing new processes for data modeling using algorithms, predictive analytics, and statistical analysis. Data scientists have the technical skills to arrange unstructured data and build their own methodologies and frameworks.

There are key differences between the two roles, but they share the same goal: to translate data analysis into business intelligence.

Data Analytics is the practice of reviewing raw data and drawing meaningful insights from it to solve business problems.

Types of Data Analytics
- Descriptive analytics answers the question “What has happened in the past and what is happening right now?” by using current and historical data to provide a snapshot of trends and patterns.
- Diagnostic analytics answers the question “Why are these trends and patterns happening?” by focusing on the trend data to discover the factors or reasons for past performance.
- Predictive analytics answers the question “What is likely to happen in the future?” by using machine learning and artificial intelligence (AI) to build predictive and statistical models.
- Prescriptive analytics answers the question “What do we need to do?” through testing and other techniques to recommend specific solutions that will drive a desired outcome.

These types of data analytics are performed using a variety of tools and techniques that vary based on the type of analysis and the objective.

Data Science focuses on building models and designing frameworks that gather and analyze data. Typically, data science includes data mining, statistical methods, and machine learning algorithms.

Unstructured Data
Unstructured data is unorganized and unusable until it is processed. Data scientists are charged with cleaning and processing this data. They rely on classification, categorization, and sentence chunking to make sense of unstructured data.

Statistical Methods
Once the data is collected, there can be many variables to consider. Regression analysis is one statistical method that allows data scientists to explore the relationship between these variables. Correlation analysis is also used for both qualitative and quantitative data.

Machine Learning Algorithms
Data scientists use machine learning algorithms to predict, categorize, and classify data while keeping error as low as possible. There are three main categories of machine learning algorithms:
- Supervised
- Unsupervised
- Reinforcement learning

Again, there are many techniques and models that data scientists use to find the right data. These are simply a few of the most common.

Data Analytics Process
As the data available to companies continues to grow in both amount and complexity, so too does the need for an effective and efficient process for harnessing the value of that data. The data analysis process typically moves through several iterative phases. Let’s take a closer look at each.

1. Identify the business question you’d like to answer. What problem is the company trying to solve? What do you need to measure, and how will you measure it?
2. Collect the raw data sets you’ll need to help you answer the identified question. Data might come from internal sources, like a company’s customer relationship management (CRM) software, or from secondary sources, like government records or social media application programming interfaces (APIs).
3. Clean the data to prepare it for analysis. This often involves purging duplicate and anomalous data, reconciling inconsistencies, standardizing data structure and format, and dealing with white spaces and other syntax errors.
4. Analyze the data. By manipulating the data using various data analysis techniques and tools, you can begin to find trends, correlations, outliers, and variations that tell a story. During this stage, you might use data mining to discover patterns within databases, or data visualization software to transform data into an easy-to-understand graphical format.
5. Interpret the results of your analysis to see how well the data answered your original question. What recommendations can you make based on the data? What are the limitations of your conclusions?

RECOMMENDED LEARNING MATERIAL:
https://www.comptia.org/content/guides/what-is-data-analytics
https://www.comptia.org/content/guides/data-analytics-vs-data-science

ASSESSMENT TASK (20 points):
1. Illustrate the similarities and differences between Data Science and Data Analytics. Explain the illustration briefly.
2. Create an illustration of the Data Analytics process. Discuss your illustration briefly.
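
Steps 3 and 4 of the data analytics process (clean the data, then analyze it) can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the sample records, the meaning of the columns (customer, ad spend, sales), and the helper names `clean` and `pearson` are all hypothetical and are not taken from the module. The cleaning step trims white space, standardizes capitalization, and drops incomplete and duplicate rows; the analysis step computes an average and a correlation coefficient.

```python
import math
from statistics import mean

# Hypothetical raw records: (customer, monthly ad spend, monthly sales).
raw_rows = [
    ("  Acme ", "100", "1100"),
    ("acme", "100", "1100"),   # duplicate once white space and case are fixed
    ("Bolt", "200", "1900"),
    ("Cora ", "300", "3100"),
    ("Dune", "", "2500"),      # missing ad spend: treated as unusable here
    ("Echo", "400", "3900"),
]


def clean(rows):
    """Trim white space, standardize case, drop incomplete and duplicate rows."""
    seen = set()
    cleaned = []
    for name, spend, sales in rows:
        name = name.strip().title()
        spend, sales = spend.strip(), sales.strip()
        if not spend or not sales:
            continue  # skip rows with missing values
        record = (name, float(spend), float(sales))
        if record in seen:
            continue  # skip exact duplicates
        seen.add(record)
        cleaned.append(record)
    return cleaned


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rows = clean(raw_rows)
spend = [r[1] for r in rows]
sales = [r[2] for r in rows]

print("Rows after cleaning:", len(rows))
print("Average sales:", mean(sales))
print("Correlation between spend and sales:", round(pearson(spend, sales), 3))
```

A correlation close to 1 in this made-up data would suggest that sales rise with ad spend, but interpreting that result (step 5) still requires asking what the data cannot show, such as whether the relationship is causal.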