Math 403 Engineering Data Analysis Lecture PDF


Document Details

Uploaded by PreferableDahlia9184

Engr. Iris R. Tejada

Tags

engineering data analysis statistics data collection

Summary

This lecture covers introductory topics in engineering data analysis, focusing on data collection methods and a variety of statistical concepts. It includes definitions of important terms like population, samples, and variables. The lecture also covers details on planning and conducting surveys and experiments.

Full Transcript

MATH 403: ENGINEERING DATA ANALYSIS
OBTAINING DATA
ENGR. IRIS R. TEJADA

Introduction

Statistics is defined as the science that deals with the collection, organization, presentation, analysis, and interpretation of data in order to be able to draw judgments or conclusions that help in the decision-making process.

Descriptive Statistics deals with the procedures that organize, summarize, and describe quantitative data. It seeks merely to describe data.

Inferential Statistics deals with making a judgment or a conclusion about a population based on the findings from a sample that is taken from the population.

https://www.youtube.com/watch?v=0VDafmUys04

Intended Learning Outcomes

At the end of this module, it is expected that the students will be able to:
1. Demonstrate an understanding of the different methods of obtaining data.
2. Explain the procedures in planning and conducting surveys and experiments.

Statistical Terms

Before proceeding to the discussion of the different methods of obtaining data, let us first define some statistical terms.

Population or Universe refers to the totality of objects, persons, places, or things used in a particular study: all members of a particular group of objects (items) or people (individuals) that are the subjects or respondents of a study.

Sample is any subset of a population, that is, a few members of a population.

Data are facts, figures, and information collected on some characteristics of a population or sample. These can be classified as qualitative or quantitative data.

Ungrouped (or raw) data are data which are not organized in any specific way. They are simply the collection of data as they are gathered.

Grouped data are raw data organized into groups or categories with corresponding frequencies. Organized in this manner, the data are referred to as a frequency distribution.

Parameter is the descriptive measure of a characteristic of a population (the whole).
Statistic is a measure of a characteristic of a sample.

Constant is a characteristic or property of a population or sample which is common to all members of the group.

Variable is any characteristic, number, or quantity that can be measured or counted. A variable may also be called a data item. Age, sex, business income and expenses, country of birth, capital expenditure, class grades, eye colour, and vehicle type are examples of variables.

Methods of Obtaining Data

I. Methods of Data Collection
II. Planning and Conducting Surveys
III. Planning and Conducting Experiments

I. Methods of Data Collection

Collection of data is the first step in conducting a statistical inquiry. It simply refers to data gathering: a systematic method of collecting and measuring data from different sources of information in order to provide answers to relevant questions. This involves acquiring information from published literature, surveys through questionnaires or interviews, experimentation, documents and records, tests or examinations, and other forms of data-gathering instruments.

Investigator: the person who conducts the inquiry.
Enumerator: the one who helps in collecting information.
Respondent: the one from whom information is collected.

Primary data: According to Wessel, “Data collected in the process of investigation are known as primary data.” These are collected for the investigator’s use from the primary source.

Secondary data, on the other hand, are collected by some other organization for its own use, but the investigator also obtains them for his or her use. According to M.M. Blair, “Secondary data are those already in existence for some other purpose than answering the question in hand.”

Exercise: Tell whether each of the following is primary or secondary data: raw data, biographies, dictionary, diary, surveys, photographs, tax records, books, dissertations, reports, experiments, letters, questionnaire, interview, Internet articles, political commentary, journals.

In engineering, there are three basic methods of collecting data:

Retrospective study: uses the population or a sample of the historical data which had been archived over some period of time.

Observational study: the researchers only observe the subjects and do not interfere or try to influence the outcomes.

Designed experiments: in engineering, there are problem areas with no scientific or engineering theory that is directly or completely applicable, so experimentation and observation of the resulting data is the only way to solve them.

An example of an experiment is when scientists give rats a new medicine and see how they react in order to learn about the medicine. Another example is when you try a new coffee shop but you are not sure how the coffee will taste. In both cases, what is learned is the result of experimentation.

II. Planning and Conducting Surveys

Survey is a method of asking respondents some well-constructed questions. It is an efficient way of collecting information and easy to administer, and a wide variety of information can be collected.

Face-to-face interviews: Advantages of face-to-face interviews include fewer misunderstood questions, fewer incomplete responses, higher response rates, and greater control over the environment in which the survey is administered; also, the researcher can collect additional information if any of the respondents’ answers need clarifying. The disadvantages of face-to-face interviews are that they can be expensive and time-consuming and may require a large staff of trained interviewers. In addition, the response can be biased by the appearance or attitude of the interviewer.

Self-administered surveys: These are less expensive than interviews. They can be administered in large numbers and do not require many interviewers, and there is less pressure on respondents.
On the other hand, respondents to a self-administered survey are more likely to stop participating midway through the survey, and they cannot ask for clarification. Response rates are also lower than in personal interviews.

When designing a survey, the following steps are useful:
1. Determine the objectives of your survey: What questions do you want to answer?
2. Identify the target population and sample: Whom will you interview? Who will be the respondents? What sampling method will you use?
3. Choose an interviewing method: face-to-face interview, phone interview, or self-administered paper or internet survey.
4. Decide what questions you will ask, in what order, and how to phrase them.
5. Conduct the interview and collect the information.
6. Analyze the results by making graphs and drawing conclusions.

In choosing the respondents, sampling techniques are necessary.

Sampling

Sampling is the process of selecting units (e.g., people, organizations) from a population of interest. The sample must be representative of the target population. The target population is the entire group a researcher is interested in, that is, the group about which the researcher wishes to draw conclusions.

There are two ways of selecting a sample: probability sampling and non-probability sampling.

Probability sampling

Probability sampling is defined as a sampling technique in which the researcher chooses samples from a larger population using a method based on the theory of probability. For a participant to be considered part of a probability sample, he or she must be selected using random selection. The most critical requirement of probability sampling is that everyone in your population has a known chance of getting selected; in the simplest designs, that chance is also equal for everyone. For example, when one person is drawn at random from a population of 100 people, every person has a 1 in 100 chance of being selected. Probability sampling gives you the best chance to create a sample that is truly representative of the population.

Non-probability sampling

It is also called judgment or subjective sampling. This method is convenient and economical, but the inferences made based on the findings are not so reliable. It is a sampling method in which not all members of the population have an equal chance of participating in the study, unlike probability sampling, where each member of the population has a known chance of being selected. Non-probability sampling is most useful for exploratory studies like a pilot survey (deploying a survey to a smaller sample compared to the pre-determined sample size). Researchers use this method in studies where it is impossible to draw a random probability sample due to time or cost considerations.

Types of non-probability sampling: convenience sampling, purposive sampling, quota sampling.

Convenience Sampling

The researcher uses a device for obtaining information from the respondents that favors the researcher but can introduce bias among the respondents. It means collecting a sample of whichever participants are easiest to reach.

Purposive Sampling

The selection of respondents is predetermined according to the characteristic of interest chosen by the researcher. Randomization is absent in this type of sampling. The participants are selected based on the purpose of the sample, hence the name. Participants are selected according to the needs of the study (hence the alternate name, deliberate sampling); applicants who do not meet the profile are rejected.

Quota Sampling

Proportional: In proportional quota sampling, the major characteristics of the population are represented by sampling a proportional amount of each.

Non-proportional: Non-proportional quota sampling is a bit less restrictive. In this method, a minimum number of sampled units in each category is specified, and the researcher is not concerned with having numbers that match the proportions in the population.

For example, imagine you want to create a council of 20 employees that will meet and recommend possible changes to the employee handbook. Let's say 40% of your employees are in Sales and Marketing, 30% in Customer Service, 20% in IT, and 10% in Finance. You will select 8 people from Sales and Marketing, 6 from Customer Service, 4 from IT, and 2 from Finance. As you can see, each number you pick is proportionate to the overall percentage of people in each category (e.g., 40% of 20 = 8 people).

Department             Share    Council members (of 20)
Sales and Marketing    40%      8
Customer Service       30%      6
IT                     20%      4
Finance                10%      2

The slide also shows a population split of 7,000 male (70%) and 3,000 female (30%).

Types of probability sampling: simple random sampling, stratified sampling, cluster sampling.

Simple Random Sampling

Simple random sampling is the basic sampling technique where a group of subjects (a sample) is selected for study from a larger group (a population). Each individual is chosen entirely by chance, and each member of the population has an equal chance of being included in the sample. Every possible sample of a given size has the same chance of selection; i.e., each member of the population is equally likely to be chosen at any stage in the sampling process.

Stratified Sampling

A stratified sample is obtained by taking samples from each stratum or sub-group of a population. When a sample is to be taken from a population with several strata, the proportion of each stratum in the sample should be the same as in the population.

For example, suppose you have three sub-groups with population sizes of 150, 200, and 250 subjects respectively. To make the sample proportionate, the researcher uses one specific fraction or percentage and applies it to every subgroup. With a fraction of 0.5, the sample sizes for the three groups would be 150 × 0.5 = 75, 200 × 0.5 = 100, and 250 × 0.5 = 125. Here the constant factor is the proportion ratio for each population subset.

Cluster Sampling

Cluster sampling is a sampling technique where the entire population is divided into groups, or clusters, and a random sample of these clusters is selected. All observations in the selected clusters are included in the sample. Cluster sampling is often used to study large populations, particularly those that are widely geographically dispersed. Researchers usually use pre-existing units such as schools or cities as their clusters.

III. Planning and Conducting Experiments

Experiment is a series of tests conducted in a systematic manner to increase the understanding of an existing process or to explore a new product or process.

Design of Experiments, or DOE, is a tool to develop an experimentation strategy that maximizes learning using minimum resources. Design of Experiments is widely and extensively used by engineers and scientists in improving existing processes through maximizing the yield and decreasing the variability, or in developing new products and processes. It is a technique needed to identify the "vital few" factors in the most efficient manner and then direct the process to its best setting to meet the ever-increasing demand for improved quality and increased productivity.

The methodology of DOE ensures that all factors and their interactions are systematically investigated, resulting in reliable and complete information. The five stages of the methodology of DOE are planning, screening, optimization, robustness testing, and verification.

1. Planning

It is important to carefully plan for the course of experimentation before embarking upon the process of testing and data collection.
At this stage, identify the objectives of conducting the experiment or investigation, and assess the time and resources available to achieve those objectives. Individuals from different disciplines related to the product or process should compose the team that will conduct the investigation. Well-planned experiments are easy to execute and analyze using the available statistical software.

2. Screening

Screening experiments are used to identify the important factors that affect the process under investigation out of the large pool of potential factors. The screening process eliminates unimportant factors so that attention is focused on the key factors. Screening experiments are usually efficient designs which require few executions and focus on the vital factors and not on interactions.

3. Optimization

After narrowing down the important factors affecting the process, determine the best setting of these factors to achieve the objectives of the investigation. The objectives may be to increase yield, to decrease variability, or to find settings that achieve both at the same time, depending on the product or process under investigation.

In general, optimization is an act, process, or methodology of making something (such as a design, system, or decision) as fully perfect, functional, or effective as possible; specifically, it refers to the mathematical procedures (such as finding the maximum of a function) involved in this.

4. Robustness Testing

A robust statistic is resistant to errors in the results. Once the optimal settings of the factors have been determined, it is important to make the product or process insensitive to variations resulting from changes in factors that affect the process but are beyond the control of the analyst. Such factors are referred to as noise or uncontrollable factors that are likely to be experienced in the application environment.
It is important to identify such sources of variation and take measures to ensure that the product or process is made robust or insensitive to these factors.

5. Verification

In general, verification is a process in which different types of data are checked for accuracy and inconsistencies after data migration is done. In DOE, this final stage involves validation of the optimum settings by conducting a few follow-up experimental runs. This is to confirm that the process functions as expected and all objectives are achieved.

REFERENCES

Montgomery, Douglas C., et al. Applied Statistics and Probability for Engineers, 7th ed. John Wiley & Sons (Asia) Pte Ltd, 2018.

Panopio, Felix M. Statistics with Probability. Batangas City, Philippines: Feliber Publishing House, 2004.

Rawley, Eve. Planning and Conducting Surveys. https://www.ck12.org/statistics/planning-and-conducting-surveys/lesson/Planning-and-Conducting-Surveys-ALG-I/ Date accessed: July 27, 2020.

Walpole, Ronald E., et al. Probability and Statistics for Engineers and Scientists, 9th ed. Pearson Education Inc., 2016.

Introduction to Design of Experiments. https://www.weibull.com/hotwire/issue84/hottopics84.htm Date accessed: April 15, 2020.

https://mathspace.co/learn/world-of-maths/language-and-use-of-statistics/planning-a-statistical-investigation-i-investigation-18643/investigation-statistical-inquiry-916/

ACTIVITY

As one of the students of the EDA class, you are tasked to conduct a survey to show which extracurricular activities the students from the College of Engineering, Architecture and Fine Arts would like to engage in during the first semester. Follow the steps for designing a survey presented in Section II (slide #12).
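Supplementary sketch for the sampling methods in Section II. The arithmetic in the quota sampling example (a council of 20 split 40/30/20/10) and the stratified sampling example (strata of 150, 200, and 250 with a fraction of 0.5) can be reproduced in a few lines of Python, along with simple random and cluster selection. This is only an illustration under stated assumptions: the function names, the seed value, the numbered population of 100, and the three school clusters are all hypothetical and are not part of the lecture.

```python
import random


def proportional_allocation(total, shares):
    """Split `total` slots across groups in proportion to their shares.

    Assumes each product total * share is (close to) a whole number;
    in general, rounded values may not add up exactly to `total`.
    """
    return {group: round(total * share) for group, share in shares.items()}


def stratum_sample_sizes(stratum_sizes, fraction):
    """Apply one constant sampling fraction to every stratum."""
    return {name: round(size * fraction) for name, size in stratum_sizes.items()}


def simple_random_sample(population, n, rng):
    """Draw n distinct members; every member is equally likely to be chosen."""
    return rng.sample(population, n)


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


# Fixed seed (hypothetical value) so repeated runs pick the same units.
rng = random.Random(403)

# Quota example from the lecture: council of 20 employees.
council = proportional_allocation(
    20,
    {"Sales and Marketing": 0.40, "Customer Service": 0.30, "IT": 0.20, "Finance": 0.10},
)
print(council)
# {'Sales and Marketing': 8, 'Customer Service': 6, 'IT': 4, 'Finance': 2}

# Stratified example from the lecture: constant fraction of 0.5.
sizes = stratum_sample_sizes({"Group 1": 150, "Group 2": 200, "Group 3": 250}, 0.5)
print(sizes)
# {'Group 1': 75, 'Group 2': 100, 'Group 3': 125}

# Simple random sample of 10 from a hypothetical numbered population of 100.
population = list(range(1, 101))
srs = simple_random_sample(population, 10, rng)
print(srs)

# Cluster sample: pick 2 of 3 hypothetical schools and include all their students.
clusters = {
    "School A": ["A1", "A2", "A3"],
    "School B": ["B1", "B2", "B3"],
    "School C": ["C1", "C2", "C3"],
}
picked = cluster_sample(clusters, 2, rng)
print(picked)
```

The allocation functions only compute how many units to take from each group. Who fills those slots is what separates the methods: random selection within each stratum gives a stratified (probability) sample, while filling the slots with whoever is available gives a quota (non-probability) sample.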
