Data Analytics with Excel - Past Student Notes PDF

Summary

These notes cover data analytics with Excel: the collection, transformation, and organization of data, along with Excel features such as data visualization, data handling, and data formatting.

Full Transcript

**UNIT 1: Understanding Data Analytics and Excel, Data Import and Pre-Processing**

A. **Introduction to Data Analytics and Excel Basics:**

- **What is Data Analytics:** Data analytics is the process of analyzing raw data to convert it into an interpretable, usable form that yields actionable insights. It covers **the collection, transformation, and organization of data in order to draw conclusions** and make predictions.
- **What is Excel:** Excel is a **powerful tool for data analysis** that enables users to manipulate, analyze, and visualize large amounts of data quickly and easily.
- **Understanding the Importance of Data Analytics:** 1) Risk management. 2) Innovation and research. 3) Customer insights. 4) Business intelligence. 5) Health and predictive analysis.
- **Introduction to Excel as a Data Analytics Tool:**

1) [Spreadsheet Functionality] - At its core, Excel is a spreadsheet application that allows users to organize data into rows and columns. This tabular format is ideal for storing and managing datasets, making it easy to view, manipulate, and analyze information.
2) [Data Sorting and Filtering] - Excel offers built-in tools for sorting and filtering data. Users can arrange information alphabetically, numerically, or based on custom criteria. Filtering allows users to focus on specific subsets of data, facilitating targeted analysis.
3) [Data Visualization] - Excel supports various chart types, including bar charts, line graphs, pie charts, and more. These visualizations enhance data interpretation and help users communicate insights effectively. Excel's charting capabilities make it a valuable tool for creating visually appealing reports and dashboards.

B. **Data Types, Formats and Basic Excel Functions:**

- **Data Types in Excel:**

1) Numeric data type -- a) integer, b) decimal (numbers with a decimal point).
(Basic Excel functions -- SUM, AVERAGE, MIN, MAX.)
2) Text data type -- text and strings (alphanumeric characters, words, or phrases). (Functions -- CONCATENATE joins text; LEFT, RIGHT, and MID extract characters from the left, right, or middle.)
3) Date and time data type -- represents a specific date and time. (Functions -- TODAY returns the current date and NOW returns the current date and time; DATEDIF calculates the difference between two dates in units such as days, months, or years.)
4) Boolean data type -- represents either TRUE or FALSE. (Functions -- IF, AND, OR, NOT.)
5) Currency data type -- Accounting format. (Function -- SUMIF adds up values that meet a specified condition.)
6) Percentage data type -- Percentage format, which displays a decimal as a percentage.

**C) Data Import Technique:**

- Open Excel.
- Go to the "Data" tab on the ribbon.
- Click "Get Data" or "Get External Data," depending on your Excel version.
- Choose "From Text" or "From CSV," depending on your file format.
- Navigate to and select the text or CSV file you want to import.
- Follow the import wizard, specifying delimiters and data formats as needed.
- Click "Load" to import the data into a new worksheet.

**D) Data Cleaning and Transformation:**

1) Understand the need for data cleaning:
a) [Data Quality] - Raw data collected from various sources may contain errors, missing values, or inconsistencies. Cleaning the data helps ensure its accuracy, completeness, and reliability.
b) [Data Accuracy] - Inaccurate data can lead to incorrect conclusions and decision-making. Cleaning data helps to identify and correct inaccuracies, ensuring that analytical results and insights are based on reliable information.
c) [Duplicate Records] - Duplicates can distort statistical analyses and lead to inaccurate conclusions. Identifying and removing duplicate records is a key aspect of data cleaning.

**E) Handling Missing Values in Excel:**

1. Identifying and understanding missing values:
a. Scan the data first.
Then use conditional formatting to highlight blank cells.
b. Sometimes a missing value is represented by a specific code ("NA", "N/A", "NULL", "-").
c. Flag blank cells with a formula such as =IF(A1="", "Missing", "Not Missing").
2. Methods for handling missing values:
a. Calculate the percentage of missing values - Determine the percentage of missing values for each column in your dataset, using the formula: Percentage missing = (number of missing values / total number of values) * 100

**F) Data Formatting in Excel:**

1) Number formatting --
a) Select the cell or column you want to format.
b) Right-click and choose "Format Cells".
c) Choose the "Number" tab and select the desired number format (e.g. Currency, Date, Percentage).
2) Text formatting --
a) Use the formatting options in the Excel ribbon to change font style, size, and color.
b) Use the "Wrap Text" option to make text wrap within a cell.
3) Cell alignment -- Adjust the horizontal and vertical alignment of cell contents using the alignment options in the ribbon. Merge cells if you want to combine them into a single cell.
4) Table format -- Convert your data range into a table using "Format as Table" in the "Home" tab. This makes it easier to sort, filter, and style your data.

**G) Conditional Formatting in Excel:**

Conditional formatting in Excel allows you to format cells based on certain criteria or conditions. Steps --
1) Select the range - Highlight the cells or range of cells where you want to apply conditional formatting.
2) Go to the "Home" tab.
3) Click "Conditional Formatting" - In the "Home" tab, find the "Conditional Formatting" option. It may be located in the "Styles" group.
4) Choose the rule - Hover over the "Conditional Formatting" option to see various rules. Common rules include Greater Than, Less Than, Between, Equal To, Text that Contains, etc.
5) Set the parameters - Click the rule you want to apply, and a dialog box will appear.
Enter the parameters or values on which the formatting will be based.
6) Click OK.

**H) Advanced Excel Functions and Formulas:**

1) VLOOKUP function: VLOOKUP stands for Vertical Lookup. It searches (vertically) for a specific value in the first column of a range and retrieves an exact or approximate match from another column in the same row. Formula -- =VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup]).

1. Basic statistical functions --
a. Introduction to descriptive statistics: Descriptive statistics is a branch of statistics that deals with the collection, analysis, interpretation, presentation, and organization of data. Its primary goal is to summarize and describe the main features of a dataset, reducing large amounts of data to a meaningful and easily understandable form.
b. Using basic statistical functions in Excel: COUNT(), SUM(), AVERAGE(), MEDIAN(), MIN(), MAX(), STDEV().

2. Frequency distributions and histograms --
a. Understanding frequency distributions and their importance - A frequency distribution is a tabular or graphical representation of data that shows the frequency of various outcomes in a dataset. It organizes data into classes or intervals and displays how many observations fall into each class. This method of summarizing data has several important uses and advantages:
1) Organizing data - Frequency distributions help in organizing large sets of data into a more manageable and understandable form. By grouping data into classes or intervals, patterns and trends become more apparent.
2) Visual representation - Graphical representations of frequency distributions, such as histograms or bar charts, provide a visual snapshot of the data distribution.
3) Calculating descriptive statistics - Frequency distributions are often a precursor to calculating other descriptive statistics, such as measures of central tendency (mean, median, mode) and measures of dispersion (range, variance, standard deviation).

3. Pivot tables and pivot charts - A pivot table is a data processing tool used in spreadsheet programs, such as Microsoft Excel or Google Sheets, to summarize, analyze, explore, and present large amounts of data in a structured and interactive format.
1. Creating and customizing pivot tables and pivot charts for data summarization --
a. Organize your data - Ensure that your data is organized in a tabular format with headers for each column.
b. Select your data.
c. Insert the pivot table - Go to the "Insert" tab on the Excel ribbon, then click "PivotTable".
d. Choose the data range - Confirm that the Table/Range field in the dialog box correctly represents your selected data.
e. Select the location - Choose where you want to place the pivot table.
f. Click OK.
2. Creating a pivot chart -- Same as above.

5. **Advanced chart types and customization.**
a. **Bubble chart -** A bubble chart is a type of scatter plot in Excel that includes three dimensions of data: two numeric axes (X and Y) and the size of the bubbles. Each point on the chart represents a set of values for these three dimensions.
b. **Radar chart -** A radar chart, also known as a spider chart or web chart, is a data visualization tool in Excel that displays multivariate data in the form of a two-dimensional chart. It is particularly useful for comparing multiple variables across different categories.
c. **Treemap chart -**

6. **Data validation and data auditing:**
a. Understanding data validation and its importance - The primary purpose of data validation is to maintain data quality and reliability, preventing errors and inconsistencies that could lead to misinformation or faulty decision-making.
**Importance of data validation -** 1. **Accuracy** 2. **Completeness** 3. **Consistency** 4. **Reliability** 5. **Compliance** 6. **Error Prevention** 7. **Efficiency** 8. **User Confidence**

7) What-if analysis: Goal Seek -
What is Goal Seek - Goal Seek is a feature commonly found in spreadsheet software like Microsoft Excel. It helps you find the input value necessary to achieve a desired result: you set a target for a particular formula cell, and Goal Seek automatically adjusts another input value until the formula reaches that target.
**Applications of Goal Seek --** financial modeling, budgeting and planning, project management, marketing planning, education and research, etc.
**Use of Goal Seek to find input values to achieve a specific value -**

8) What-if analysis: Data Tables and Scenario Manager - "What-If" analysis refers to the process of changing the values in cells to see how those changes affect the outcome of formulas on the worksheet. Two common tools used for What-If analysis in spreadsheet software like Microsoft Excel are Data Tables and Scenarios.

**Data Table:** A Data Table is a grid that shows the possible outcomes of changing one or two input values in a formula. It allows you to try different values for one or two variables and see the corresponding changes in the result.

**How to create a Data Table:**
1. Identify the cell containing the formula you want to analyze.
2. Select the range of possible input values for one or two variables (usually arranged in a column or row).
3. Go to the "Data" tab in Excel, select "What-If Analysis," and choose "Data Table."
4. In the Data Table dialog box, specify the row input cell and/or column input cell, that is, the cell(s) the formula reads its input values from.
5. Excel will automatically generate a table showing the different outcomes based on the input values you provided.

Data Tables are particularly useful for exploring various scenarios and understanding how changes in multiple variables can impact the overall result.
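To make the Goal Seek and Data Table ideas above concrete outside Excel, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `profit` formula, its cost figures, and the helper names `data_table` and `goal_seek` are hypothetical, and the search uses plain bisection, which is not necessarily how Excel's Goal Seek works internally.

```python
from typing import Callable, List, Sequence


def profit(price: float, units: float) -> float:
    """Hypothetical worksheet formula: revenue minus variable and fixed costs."""
    unit_cost = 4.0      # assumed variable cost per unit
    fixed_cost = 1000.0  # assumed fixed cost
    return units * (price - unit_cost) - fixed_cost


def data_table(
    formula: Callable[[float, float], float],
    row_values: Sequence[float],
    col_values: Sequence[float],
) -> List[List[float]]:
    """Two-variable data table: one row per first input, one column per second input."""
    return [[formula(r, c) for c in col_values] for r in row_values]


def goal_seek(
    formula: Callable[[float], float],
    target: float,
    lo: float,
    hi: float,
    tol: float = 1e-9,
    max_iter: int = 200,
) -> float:
    """Find an input in [lo, hi] for which formula(input) is close to target (bisection)."""
    f_lo = formula(lo) - target
    f_hi = formula(hi) - target
    if f_lo * f_hi > 0:
        raise ValueError("target is not bracketed by lo and hi")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = formula(mid) - target
        if abs(f_mid) <= tol or (hi - lo) / 2 <= tol:
            return mid
        if f_lo * f_mid <= 0:
            hi = mid                 # the solution lies in the lower half
        else:
            lo, f_lo = mid, f_mid    # the solution lies in the upper half
    return (lo + hi) / 2


# Data table: profit for each combination of price (rows) and units sold (columns).
prices = [8.0, 9.0, 10.0]
units_sold = [100, 200, 300]
for price, row in zip(prices, data_table(profit, prices, units_sold)):
    print(price, row)

# Goal seek: how many units must be sold at a price of 9.0 to break even (profit = 0)?
break_even_units = goal_seek(lambda u: profit(9.0, u), target=0.0, lo=0.0, hi=10_000.0)
print(round(break_even_units, 6))
```

The data table corresponds to step 5 of the list above (a grid of outcomes), while `goal_seek` runs the calculation in reverse: it fixes the desired result and searches for the input that produces it.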
**Scenario Manager:** Scenario Manager is another What-If analysis tool that allows you to create and manage different scenarios for a set of input values. It's useful when you have multiple variables and want to compare several sets of input values against a base case.

**How to use Scenario Manager:**
1. Identify the cells containing the input values and the cell containing the formula you want to analyze.
2. Go to the "Data" tab in Excel, select "What-If Analysis," and choose "Scenario Manager."
3. In the Scenario Manager dialog box, create a new scenario, give it a name, and specify the changing cells (input values).
4. Enter the desired values for the changing cells in your scenario.
5. Repeat the process to create additional scenarios.
6. You can then switch between scenarios to see the impact on the formula result.
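The Scenario Manager workflow above (named sets of changing cells compared against a base case) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `ScenarioManager` class, the `total_cost` formula, and the input names and values are all hypothetical and are not part of Excel or of these notes.

```python
from typing import Callable, Dict


def total_cost(inputs: Dict[str, float]) -> float:
    """Hypothetical result cell: units * unit_cost + shipping."""
    return inputs["units"] * inputs["unit_cost"] + inputs["shipping"]


class ScenarioManager:
    """Stores named sets of changing cells and evaluates a formula for each one."""

    def __init__(self, base_inputs: Dict[str, float],
                 formula: Callable[[Dict[str, float]], float]) -> None:
        self.base_inputs = dict(base_inputs)
        self.formula = formula
        self.scenarios: Dict[str, Dict[str, float]] = {}

    def add(self, name: str, changing_cells: Dict[str, float]) -> None:
        """Create a scenario; only inputs that exist in the base case may change."""
        unknown = set(changing_cells) - set(self.base_inputs)
        if unknown:
            raise KeyError(f"unknown input(s): {sorted(unknown)}")
        self.scenarios[name] = dict(changing_cells)

    def show(self, name: str) -> float:
        """Apply one scenario on top of the base inputs and return the result."""
        return self.formula({**self.base_inputs, **self.scenarios[name]})

    def summary(self) -> Dict[str, float]:
        """Result for the base case and for every stored scenario."""
        results = {"Base case": self.formula(self.base_inputs)}
        for name in self.scenarios:
            results[name] = self.show(name)
        return results


# Assumed base inputs and two assumed scenarios, for illustration only.
manager = ScenarioManager(
    {"units": 100, "unit_cost": 5.0, "shipping": 50.0}, total_cost
)
manager.add("Best case", {"unit_cost": 4.0})
manager.add("Worst case", {"unit_cost": 6.5, "shipping": 80.0})
for name, result in manager.summary().items():
    print(name, result)
```

Calling `show` plays the role of switching between scenarios in step 6, and `summary` lines the results up next to the base case for comparison.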
