Summary

This document is a lecture on Introduction to Statistics and Data Analysis for the first semester of the academic year 2024-2025. It includes topics such as scientific data, inferential statistics, and data collection methods.

Full Transcript

INTRODUCTION TO STATISTICS AND DATA ANALYSIS
ENGR. LEONEL C. TINGSON, RCHE, RCHT, SO2
FIRST SEMESTER, A.Y. 2024-2025

MATH403: Engineering Data Analysis. Copyright 2024 Leonel C. Tingson. All rights reserved. All trademarks, images and symbols belong to their respective owners. Usage of any image requires prior notification and consent.

SCIENTIFIC DATA
Statistical methods are used to analyze data from a process to gain a better sense of where in the process changes may be made to improve its quality.

INFERENTIAL STATISTICS
Involves using data from a sample to make inferences or predictions about a larger population.

SCIENTIFIC DATA
These statistical methods are designed to contribute to the process of making scientific judgments in the face of uncertainty and variation.

Variability in Scientific Data
If the observed product density in the process were always the same and were always on target, there would be no need for statistical methods.

Population and Sample

Observational and Experimental

Descriptive Statistics

STATISTICAL TERMS
Data are facts, figures and information collected on some characteristics of a population or sample. These can be classified as qualitative or quantitative data.

Ungrouped (or raw) data are data which are not organized in any specific way. They are simply the collection of data as they are gathered.

STATISTICAL TERMS
Constant is a characteristic or property of a population or sample which is common to all members of the group.

Statistic is a measure of a characteristic of a sample.

Grouped data are raw data organized into groups or categories with corresponding frequencies.

Parameter is the descriptive measure of a characteristic of a population.

Variable is a measure, characteristic, or property of a population or sample that may have a number of different values. It differentiates a particular member from the rest of the group.

ROLES OF PROBABILITY
Elements of probability allow us to quantify the strength or “confidence” in our conclusions. In this sense, concepts in probability form a major component that supplements statistical methods and helps us gauge the strength of the statistical inference.

Probability and Statistical Inference
The sample along with inferential statistics allows us to draw conclusions about the population, with inferential statistics making clear use of elements of probability.

Probability and Statistical Inference
Pedagogically, probability is essential in inferential statistics because it quantifies the uncertainty in drawing conclusions about a population from a sample, enabling us to estimate the likelihood of various outcomes and make informed decisions.

Methods of Data Collection
Collecting data is the first step in conducting a statistical inquiry. It refers to data gathering: a systematic method of collecting and measuring data from different sources of information in order to provide answers to relevant questions.

Data can be primary or secondary.

In the field of engineering, the three basic methods of collecting data are the retrospective study, the observational study, and the designed experiment.

Methods of Data Collection: Retrospective Study
A retrospective study uses a population or sample of historical data that has been archived over some period of time.

Methods of Data Collection: Observational Study
The process or population is observed and disturbed as little as possible, and the quantities of interest are recorded.

Methods of Data Collection: Designed Experiment
Designed experiments are very important in engineering design and development and in the improvement of manufacturing processes, where statistical thinking and statistical methods play an important role in planning, conducting, and analyzing the data.

Planning and Conducting Surveys
Surveys depend on the respondents' honesty, motivation, memory, and ability to respond. Sometimes answers may lead to vague data. Surveys can be done through face-to-face interviews or self-administered through the use of questionnaires.

Steps in Designing a Survey
1. Determine the objectives of your survey.
2. Identify the target population sample.
3. Choose an interviewing method.
4. Decide what questions you will ask, in what order, and how to phrase them.
5. Conduct the interview and collect the information.
6. Analyze the results by making graphs and drawing conclusions.

TYPES OF SAMPLING
Sampling strategies in research vary widely across different disciplines and research areas, and from study to study.

Probability Sampling Methods: Simple Random Sampling
Every element in the population has an equal chance of being selected as part of the sample. It is something like picking a name out of a hat.

Probability Sampling Methods: Systematic Sampling
With systematic sampling, the random selection only applies to the first item chosen.
A rule then applies so that every nth item or person after that is picked.

Probability Sampling Methods: Stratified Sampling
Stratified sampling involves random selection within predefined groups. It is a useful method for researchers wanting to determine what aspects of a sample are highly correlated with what is being measured.

Probability Sampling Methods: Cluster Sampling
With cluster sampling, groups rather than individual units of the target population are selected at random for the sample. These might be pre-existing groups, such as people in certain zip codes or students belonging to an academic year.

Non-Probability Sampling Methods: Convenience Sampling
People or elements in a sample are selected on the basis of their accessibility and availability.

Non-Probability Sampling Methods: Quota Sampling
This approach aims to achieve a spread across the target population by specifying who should be recruited for a survey according to certain groups or criteria.

Non-Probability Sampling Methods: Purposive Sampling
Participants for the sample are chosen consciously by researchers based on their knowledge and understanding of the research question at hand or their goals.

Non-Probability Sampling Methods: Snowball or Referral Sampling
People recruited to be part of a sample are asked to invite those they know to take part, who are then asked to invite their friends and family, and so on.

What Type of Sampling Should I Use?
Define your research goals.
Assess the nature of your population.
Consider your constraints.
Determine the reach of your findings.
Get feedback.
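
The four probability sampling methods described earlier can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 numbered units, the two strata, the five "sections" used as clusters, and all function names are hypothetical and do not come from the lecture.

```python
import random


def simple_random_sample(population, k, rng):
    """Every element has an equal chance of selection (names out of a hat)."""
    return rng.sample(population, k)


def systematic_sample(population, step, rng):
    """Pick a random start among the first `step` items, then every step-th item."""
    start = rng.randrange(step)
    return population[start::step]


def stratified_sample(strata, k_per_stratum, rng):
    """Select at random within each predefined group."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Select whole groups at random and keep every unit in the chosen groups."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


rng = random.Random(403)          # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical unit IDs 1..100

# Hypothetical groupings, assumed only for this example.
strata = {"first_year": population[:40], "second_year": population[40:]}
clusters = {f"section_{i}": population[i * 20:(i + 1) * 20] for i in range(5)}

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(strata, 5, rng)
clustered = cluster_sample(clusters, 2, rng)
```

Note the difference between the last two functions: stratified sampling draws some units from every group, while cluster sampling draws every unit from only some groups.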

Purpose of Statistical Analysis
When we conduct a study and measure the dependent variable, we are left with sets of numbers. Those numbers inevitably are not the same. That is, there is variability in the numbers.

Statistical Analysis: Descriptive Statistics
Measures of central tendency such as the mean, median, and mode summarize the performance level of a group of scores, and measures of variability describe the spread of scores among participants.

Measures of Central Tendency: The Mean
Known as the arithmetic average, the mean is the sum of all scores divided by the number of scores.

Measures of Central Tendency: The Median
To find the median, arrange the values of the variable in order, either ascending or descending, and then count down (n + 1) / 2 scores. When n is even, (n + 1) / 2 falls between two scores, and the median is their average.

Measures of Central Tendency: The Mean vs Median
Data: 1.7, 2.2, 3.9, 3.11, 14.7
Mean: 5.12
Median: 3.11 (the middle value once the data are sorted: 1.7, 2.2, 3.11, 3.9, 14.7)
Clearly, the mean is influenced considerably by the presence of the extreme observation, 14.7, whereas the median places emphasis on the true “center” of the data set.

Measures of Central Tendency: The Mean vs Median
It may be of interest to the reader with an engineering background that the sample mean is the centroid of the data in a sample. In a sense, it is the point at which a fulcrum can be placed to balance a system of “weights” which are the locations of the individual data.

Measures of Central Tendency: The Mode
A rarely used measure of central tendency, the mode simply represents the most frequent score in a distribution.

Measures of Variability: The Standard Deviation
Standard deviation is a measure of the dispersion of data points in a dataset, indicating how much the values typically differ from the mean.
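
The measures above can be checked with Python's standard `statistics` module. The sketch below reuses the five values from the mean-vs-median slide; the `scores` list used for the mode is a hypothetical example, since the slides give no data for it. Sorting the five values puts 3.11 in the middle position, so that is the median this code reports. The standard deviation is computed as the square root of the average squared deviation from the mean (the population form); the sample form, which divides by n - 1, is shown alongside for comparison.

```python
import math
import statistics

data = [1.7, 2.2, 3.9, 3.11, 14.7]  # values from the mean-vs-median slide

# Mean: sum of all scores divided by the number of scores (about 5.12 here).
mean = statistics.mean(data)

# Median: sort, then count down (n + 1) / 2 scores (the 3rd of 5 here).
ordered = sorted(data)
position = (len(ordered) + 1) // 2
median_by_counting = ordered[position - 1]
median = statistics.median(data)

# Mode: the most frequent score. `scores` is a made-up list for illustration.
scores = [2, 3, 3, 5, 7]
mode = statistics.mode(scores)

# Population variance: average squared difference from the mean.
population_variance = statistics.pvariance(data)
# Standard deviation: square root of the variance.
population_std = statistics.pstdev(data)

# Sample versions divide by n - 1 instead of n.
sample_variance = statistics.variance(data)
sample_std = statistics.stdev(data)

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"population std={population_std:.2f} sample std={sample_std:.2f}")
```

Replacing 14.7 with a value near the others moves the mean substantially but leaves the median almost unchanged, which is the contrast the slide describes.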

Measures of Variability: The Variance
Variance is a statistical measure that represents the average of the squared differences between each data point and the mean, indicating the spread of data in a dataset.

INTRODUCTION TO DESIGN OF EXPERIMENTS
Planning and Conducting Experiments

Introduction to Design of Experiments
Design of Experiments, or DOE, is a tool to develop an experimentation strategy that maximizes learning using minimum resources.

Introduction to Design of Experiments: PLANNING
Identification of the objectives of conducting the experiment or investigation, and assessment of the time and resources available to achieve those objectives.

Introduction to Design of Experiments: SCREENING
Screening experiments are used to identify the important factors that affect the process under investigation out of the large pool of potential factors.

Introduction to Design of Experiments: OPTIMIZATION
The objectives may be to increase yield, to decrease variability, or to find settings that achieve both at the same time, depending on the product or process under investigation.

Introduction to Design of Experiments: ROBUSTNESS TESTING
It is important to identify sources of variation and take measures to ensure that the product or process is made robust or insensitive to these factors.

Introduction to Design of Experiments: VERIFICATION
This final stage involves validation of the optimum settings by conducting a few follow-up experimental runs. This is to confirm that the process functions as expected and all objectives are achieved.
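
To make the idea of an experimentation strategy concrete, the sketch below lists the runs of a small experiment. The slides do not name a specific design, so a two-level full factorial is assumed here purely for illustration, and the three factors and their levels are hypothetical.

```python
import random
from itertools import product

# Hypothetical factors, each at two levels (assumed for this example only).
factors = {
    "temperature_C": (150, 180),
    "pressure_bar": (1.0, 2.0),
    "catalyst": ("A", "B"),
}

# Full factorial: every combination of factor levels is one experimental run.
design = [dict(zip(factors, levels)) for levels in product(*factors.values())]

# Randomize the run order so that time-related effects are not confused
# with the effects of the factors.
run_order = design[:]
random.Random(403).shuffle(run_order)

for number, run in enumerate(run_order, start=1):
    print(number, run)
```

With three two-level factors this gives 2 × 2 × 2 = 8 runs. The run count doubles with each added factor, which is one reason a screening stage that narrows down the important factors comes before optimization.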
