Statistics PDF
Document Details
Uploaded by Deleted User
Summary
This document provides a comprehensive introduction to statistical concepts, including data presentation, sampling methods, and summation notation. It includes illustrative examples, formulas, and explanations for key concepts. The document is structured into chapters and provides examples for various statistical techniques.
Full Transcript
**Chapter 1: Basics of Statistics**

**2. Sampling Methods \| Types and Techniques Explained**

**3. Summation Notation**

**Chapter 2: Data Presentation**

**Data**: The text defines data as a collection of facts, including numbers, words, measurements, observations, or descriptions. It differentiates between qualitative data (descriptive) and quantitative data (numerical). Quantitative data is further categorized into discrete data (countable, whole numbers) and continuous data (measurable, can take any value within a range).

**Frequency Distribution**: It explains the concept of frequency distribution, which shows how often each value occurs in a data set. It demonstrates how to create a frequency distribution table by counting the occurrences of each value and optionally grouping values into intervals for larger datasets.

**Example:** Sam played football on: Saturday Morning, Saturday Afternoon, Thursday Afternoon

*The frequency was 2 on Saturday, 1 on Thursday and 3 for the whole week.*

**Grouped Frequency Distribution**: This section elaborates on creating frequency distributions for data with many different values by grouping them into intervals or classes. It provides a step-by-step guide on determining group size, starting value, and constructing the grouped frequency distribution table.

**Example:** Given: length of leaves (cm) = 9, 16, 13, 7, 8, 4, 18, 10, 17, 18, 9, 12, 5, 9, 9, 16, 1, 8, 17, 1, 10, 5, 9, 11, 15, 6, 14, 9, 1, 12, 5, 16, 4, 16, 8, 15, 14, 17

**Step 1.** Arrange the values in order:

1, 1, 1, 4, 4, 5, 5, 5, 6, 7, 8, 8, 8, 9, 9, 9, 9, 9, 9, 10, 10, 11, 12, 12, 13, 14, 14, 15, 15, 16, 16, 16, 16, 17, 17, 17, 18, 18

Largest no. = 18

Smallest no. = 1

**Step 2.** Compute the **Range**: Largest - Smallest = Range

**18 - 1 = 17**

**Step 3.** Calculate the **Group Size**: Group Size = Range / No. of classes you want

17/5 = 3.4, or 4 (rounded up)

**Step 4.** Pick a **starting value**: the lowest value or 0. E.g. **0**

  ------------- -----------
  Length (cm)   Frequency
  0-3
  4-7
  8-11
  12-15
  16-19
  ------------- -----------

**Note:**

*\- Counting the rows from top to bottom gives the number of classes.*

*\- The group size is counted from the starting number.*

**Derived Frequency Distributions**: It introduces two derived frequency distributions:

**Relative Frequency Distribution**: Shows the proportion (as a percentage) of each class frequency to the total frequency.

**Cumulative Frequency Distribution**: Represents the cumulative sum of frequencies, either in a \"less than\" or \"greater than\" manner.

![IMG\_257](media/image2.jpeg)

**Graphical Representation**: It emphasizes the importance of visualizing data through graphs to understand its characteristics and features. It lists various graph types, including line graphs, bar graphs, histograms, pie charts, pictographs, stem and leaf plots, dot plots, scatter plots, and more.

· **Line Graphs**: Line graphs are ideal for illustrating information that is connected or exhibits change over time. By plotting data points and connecting them with lines, they effectively showcase trends, patterns, and fluctuations in data.

![IMG\_259](media/image4.jpeg)

**Important! Make sure to have:**

\- A title

\- Vertical scale with tick marks and labels

\- Horizontal scale with tick marks and labels

\- Data points connected by lines

· **Bar Graphs**: Bar graphs, or bar charts, utilize bars of varying heights to depict data. They are particularly suitable for comparing values across different categories, offering a clear visual representation of relative sizes or quantities.

· **Histograms**: Histograms resemble bar graphs but are specifically designed for grouping numbers into ranges. The height of each bar corresponds to the frequency of data falling within that range, making histograms valuable for visualizing the distribution of continuous data.
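The bar heights of a histogram are the class frequencies of a grouped frequency distribution, so the leaf-length example from earlier in this chapter can be tabulated in code. The following is a minimal sketch, assuming whole-number data, 5 classes, and a starting value of 0 as in that example; the function name `grouped_frequencies` and the variable names are illustrative and do not come from the original notes.

```python
import math
from itertools import accumulate

# Leaf lengths (cm) from the grouped frequency example.
leaf_lengths_cm = [9, 16, 13, 7, 8, 4, 18, 10, 17, 18, 9, 12, 5, 9, 9, 16, 1, 8, 17,
                   1, 10, 5, 9, 11, 15, 6, 14, 9, 1, 12, 5, 16, 4, 16, 8, 15, 14, 17]


def grouped_frequencies(values, num_classes, start=0):
    """Return (group_size, [(lower, upper, frequency), ...]).

    Assumes whole-number data, so a class of size 4 starting at 0 covers 0-3.
    """
    value_range = max(values) - min(values)             # Step 2: range
    group_size = math.ceil(value_range / num_classes)   # Step 3: round up
    classes = []
    lower = start                                        # Step 4: starting value
    while lower <= max(values):
        upper = lower + group_size - 1
        frequency = sum(1 for v in values if lower <= v <= upper)
        classes.append((lower, upper, frequency))
        lower += group_size
    return group_size, classes


group_size, classes = grouped_frequencies(leaf_lengths_cm, num_classes=5, start=0)
frequencies = [freq for _, _, freq in classes]
total = len(leaf_lengths_cm)

# Derived distributions: relative (percent of total) and "less than" cumulative.
relative_pct = [round(100 * freq / total, 1) for freq in frequencies]
cumulative = list(accumulate(frequencies))

print("Length (cm)  Frequency  Relative %  Cumulative")
for (lower, upper, freq), rel, cum in zip(classes, relative_pct, cumulative):
    print(f"{lower:>2}-{upper:<8}  {freq:>9}  {rel:>10}  {cum:>10}")
```

Rounding the group size up rather than to the nearest whole number guarantees that the chosen number of classes covers the whole range, which is why 3.4 becomes 4 here.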
![IMG\_261](media/image6.jpeg)

· **Pie Charts**: Pie charts employ \"pie slices\" or sectors to illustrate the relative sizes of data components within a whole. They are effective for showcasing proportions and percentages, providing a quick understanding of the contribution of each category to the overall picture.

![IMG\_263](media/image8.jpeg)

**Computation:**

**Step 1.** Add up all the values.

E.g. 4 + 1 + 6 + 4 + 5 = 20

**Step 2.** Divide each value by the total of the values and multiply by 100 to get the percentage.

Value / Total value x 100 = **Percentage**

4/20 = 0.2, and 0.2 x 100 = 20%, and so on...

**Compute the Degrees per Sector** *(pie slice)*

*Formula: Value / Total value x 360° = Sector*

4/20 = 0.2, and 0.2 x 360° = 72°

· **Pictographs**: Pictographs are a visually engaging way to represent data using pictures or symbols. Each image stands for a specific quantity of items, making it easy to understand the relative amounts of different categories. However, pictographs may lack precision when representing smaller quantities or fractions of the image.

· **Stem-and-Leaf Plots**: Stem-and-leaf plots offer a structured way to organize and display data while retaining the original values. They split each data point into a \"stem\" (the leading digit(s)) and a \"leaf\" (the trailing digit), allowing for easy identification of individual values and overall data distribution.

![IMG\_265](media/image10.jpeg)

· **Dot Plots**: Dot plots provide a simple yet effective visual representation of data distribution. Each data point is marked with a dot above its corresponding value on a number line, revealing clusters, gaps, and outliers in the data.

· **Scatter (XY) Plots**: Scatter plots are used to explore the relationship between two numerical variables. By plotting pairs of data points on a graph, they can reveal correlations, trends, and patterns, aiding in understanding the association between the variables. They also support interpolation (estimating values within the data range) and extrapolation (estimating values outside the data range), although caution is advised with extrapolation.

![IMG\_267](media/image12.jpeg)

· **Least Squares Regression**: Least squares regression is a statistical method for finding the line of best fit that minimizes the sum of squared errors between the data points and the line. This line can be used for prediction and understanding the relationship between the variables. The method involves calculating the slope and y-intercept of the line using formulas based on the data points.

**Chapter 3.**

**1. Finding a Central Value**

![](media/image14.png)

**2. How to Calculate the Mean Value**

**3. How to Find the Median Value**

Example: 3, 13, 7, 5, 21, 23, 23, 40, 23, 14, 12, 56, 23, 29

**Step 1.** Put the numbers in order:

3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 40, 56

**Step 2.** Locate the median.

3, 5, 7, 12, 13, 14, **21, 23**, 23, 23, 23, 29, 40, 56

There are **fourteen numbers**, so we don't have just one middle number; we have **a pair of middle numbers**. In this example the middle numbers are 21 and 23.

**Step 3.** Find the value halfway between them: add them together and divide by 2.

**21 + 23 = 44**, then 44 ÷ 2 = **22**

*So the **Median** in this example is **22**.*

**4. How to Calculate the Mode or Modal Value**

![](media/image17.png)

**7. Weighted Mean**

![](media/image19.png)

![](media/image21.png) ![](media/image23.png)

**9. Mean, Median and Mode from Grouped Frequencies**

**12. Quartiles**

**13. Percentiles, Deciles, and Quartiles**

\- To calculate percentiles of height: have the data in height order (sorted by height).

\- To calculate percentiles of age: have the data in age order.

E.g. You got a B!
**A** = 8%, **B** = 30%, **C** = 50%, **D** = 12% (Ranking)

**Step 1.** Add up the shares of every grade below yours, plus half of the share of the grade your score belongs to.

15% (half of B's 30%) + 50% + 12% = **77 percent**

**Step 2.** Draw a conclusion. Since you got a B, you did as well as or **better** than 77% of the class.

![](media/image25.png)

E.g. A. Estimate the number of visitors at the 30^th^ percentile. The 30^th^ percentile occurs when 3,000 arrive.

**14. Outliers**

![](media/image27.png)

**15. Measures of Central Tendency**

![](media/image29.png)

1\. What is the value of the mean of the given set of data?

Your answer:

2\. What is the value of the median of the given set of data?

Your answer:

3\. What is/are the value/s of the mode of the given set of data?

Your answer:

4\. What are the quartiles of the given set of data as represented by Q1, Q2, and Q3?

Your answer:

5\. What are the deciles of the given set of data as represented by D5 and D9?

Your answer:

6\. What are the percentiles of the given set of data as represented by P50 and P90?

Your answer:
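Answers to questions like those above can be checked with Python's standard `statistics` module. The sketch below uses the fourteen values from the median example in Chapter 3 as stand-in data, since that is the only complete data set given for these measures. It assumes the module's default "exclusive" quantile method; textbooks use several slightly different conventions for quartiles, deciles, and percentiles, so hand-computed answers for Q1, Q3, D9, or P90 may differ a little from these.

```python
import statistics

# Stand-in data: the fourteen values from the median example.
data = [3, 13, 7, 5, 21, 23, 23, 40, 23, 14, 12, 56, 23, 29]

mean_value = statistics.mean(data)         # sum of the values / number of values
median_value = statistics.median(data)     # halfway between 21 and 23
mode_values = statistics.multimode(data)   # list, because data can have several modes

# quantiles(data, n=k) returns the k-1 cut points that split the data into
# k equal-probability groups (default method: "exclusive").
q1, q2, q3 = statistics.quantiles(data, n=4)   # quartiles
deciles = statistics.quantiles(data, n=10)     # D1..D9
percentiles = statistics.quantiles(data, n=100)  # P1..P99

d5, d9 = deciles[4], deciles[8]
p50, p90 = percentiles[49], percentiles[89]

print("Mean:", round(mean_value, 2))
print("Median:", median_value)
print("Mode(s):", mode_values)
print("Q1, Q2, Q3:", q1, q2, q3)
print("D5, D9:", d5, d9)
print("P50, P90:", p50, p90)
```

Q2, D5, and P50 all name the same point as the median, and D9 coincides with P90, which gives a quick consistency check on any hand calculation.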