STA 111 Descriptive Statistics PDF

Document Details

Uploaded by UncomplicatedAstronomy

Pan-Atlantic University

Tags

Descriptive Statistics Statistics Data Analysis Basic Concepts

Summary

This document provides an overview of descriptive statistics, covering topics like expected outcomes, defining statistics, areas of statistical study, the role of statistics, and applications in different fields like software engineering. It also explains basic concepts like variables, populations, and samples.

Full Transcript

STA 111 DESCRIPTIVE STATISTICS (Basic Concepts in Statistics)

Expected outcomes

At the end of the lecture, students should be able to: define Statistics as a course of study or body of knowledge; state the role of Statistics in different fields of study, and in Engineering in particular; define some terms used in the study of Statistics; and classify variables.

Definition of Statistics

Statistics, as a course of study or body of knowledge, can be defined as a branch of science that deals with the collection, presentation, analysis and interpretation of data. It is the scientific process for making valid decisions in the face of uncertainty. It is the science of processing data.

Areas of Statistical Study

There are two broad areas of statistical study: Descriptive Statistics and Inferential Statistics.

Descriptive Statistics is concerned with methods for presenting and summarizing sample data. It is also referred to as exploratory data analysis (EDA).

Inferential Statistics is concerned with methods of using the summary information and findings from a sample to draw conclusions about the population. It can lead to confirmatory data analysis (CDA).

Role of Statistics

Statistics places much importance on the analysis of data, which helps incorporate theory into solving problems of uncertainty. These theories inform the methods that help establish scientific foundations for problems and their solutions.

Statistics plays a key role in the state economy by providing summary measures of economic variables, and it acts as a management tool.

In health, energy, environmental studies, government, telecommunication, transportation, and similar fields, statistical findings help to provide the right information needed in policy formation and decision making.

Statistics is increasingly used in risk assessment and dynamics. Statistics is also used in Control Theory, so it is essential for every scientist to master these tools.
Some Roles of Statistics in Software Engineering and Computer Science

Probability and statistics are used throughout engineering to analyze data. Statistical methods are used in developing and implementing data-driven technologies. When data are generated in the software cycle, statistical methods are used to describe, estimate and make predictions.

Statistical methods provide frameworks that help in identifying trends and patterns in data, and these are useful in business decisions. Data science techniques like machine learning and artificial intelligence rely on statistical tools for analysing big data.

Engineers and computer scientists use probability in product and system designs. Combining knowledge of statistics and computer science creates a growing number of cutting-edge opportunities in today's world. Engineers use probability and statistics to assess experimental data and to control and improve processes. It is essential for today’s engineer to master these tools.

Some Basic Concepts in Statistics

Observations, population, sample, and variable.

Population, Sample and Observations

The units on which we measure data, such as persons, cars, animals, or plants, are called observations. The collection of all units is called the population. If we consider a selection of observations, then these observations are called a sample. A sample is always a subset of the population.

Variables

Once we have specified the population of interest for a specific research question, we can think about what is of interest about our observations. A particular feature of these observations can be collected in a statistical variable X. Any information we are interested in may be captured in such a variable. For example, if our observations refer to human beings, X may describe marital status, gender, age, or anything else which may relate to a person. Of course, we can be interested in many different features, each of them collected in a different variable X_i, i = 1, 2, ..., p.
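As an illustrative sketch of the terms above (observations, population, sample, and variables X_i), the following Python snippet builds a small made-up population and draws a sample from it. The records, the variable names, the seed, and the sample size are all assumptions chosen for illustration; none of them come from the lecture.

```python
import random

# Hypothetical population: every dict is one observation (a person).
# "id" only identifies the observation; the other keys are the
# statistical variables X_1, ..., X_p recorded for each person.
population = [
    {"id": 1, "age": 19, "marital_status": "single"},
    {"id": 2, "age": 34, "marital_status": "married"},
    {"id": 3, "age": 27, "marital_status": "single"},
    {"id": 4, "age": 52, "marital_status": "married"},
    {"id": 5, "age": 41, "marital_status": "divorced"},
    {"id": 6, "age": 23, "marital_status": "single"},
]

# The variables are the features collected for each observation.
variables = [name for name in population[0] if name != "id"]

# A sample is a selection of observations from the population.
rng = random.Random(42)  # fixed seed so the draw is repeatable
sample = rng.sample(population, k=3)

# Check that every sampled observation belongs to the population.
population_ids = {obs["id"] for obs in population}
sample_is_subset = all(obs["id"] in population_ids for obs in sample)

print("Variables:", variables)
print("Sample:", sample)
print("Sample is a subset of the population:", sample_is_subset)
```

Drawing without replacement, as `random.Random.sample` does, is only one possible way to select a sample; it is used here because it guarantees that the sample is a subset of the population.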
Each observation takes a particular value for X. If X refers to gender, each observation, i.e. each person, has a particular value x which refers to either “male” or “female”.

Examples

If X refers to gender, possible x-values are contained in S = {male, female}. Each observation is either male or female, and this information is summarized in X.

Let X be the country of origin for a car. Possible values to be taken by an observation (i.e. a car) are S = {Italy, South Korea, Germany, France, India, China, Japan, USA, ...}.

A variable X which refers to age may take any value between 1 and 125. Each person is assigned a value x which represents the age of this person.

Qualitative and Quantitative Variables

Qualitative variables are variables which take values x that cannot be ordered in a logical or natural way. For example, the colour of the eye, the name of a political party, and the type of transport used to travel to work or to school are all qualitative variables. There is no reason to list blue eyes before brown eyes (or vice versa), nor does it make sense to list buses before trains (or vice versa).

Quantitative variables represent measurable quantities. The values which these variables can take can be ordered in a logical and natural way. Examples of quantitative variables are the size of shoes, the price of houses, the number of semesters studied, and the weight of a person.

Discrete and Continuous Variables

Discrete variables are variables which can only take a finite (or, more generally, countable) number of values. All qualitative variables are discrete, such as the colour of the eye or the region of a country. Quantitative variables can also be discrete: the size of shoes or the number of semesters studied would be discrete because the number of values these variables can take is limited.

Variables which can take an infinite number of values are called continuous variables.
Examples are the time it takes to travel to university, the length of an antelope, and the distance between two planets. Sometimes, it is said that continuous variables are variables which are “measured rather than counted”. This is a rather informal definition which helps to understand the difference between discrete and continuous variables.

Scales

The considerations above indicate that different variables contain different amounts of information. A useful classification of these considerations is given by the concept of the scale of a variable.

Nominal scale: The values of a nominal variable cannot be ordered. Examples are the gender of a person (male–female) or the status of an application (pending–not pending).

Ordinal scale: The values of an ordinal variable can be ordered. However, the differences between these values cannot be interpreted in a meaningful way. For example, the possible values of education level (none–primary education–secondary education–university degree) can be ordered meaningfully, but the differences between these values cannot be interpreted. Likewise, the satisfaction with a product (unsatisfied–satisfied–very satisfied) is an ordinal variable because the values this variable can take can be ordered, but the differences between “unsatisfied–satisfied” and “satisfied–very satisfied” cannot be compared in a numerical way.

Continuous scale: The values of a continuous variable can be ordered. Furthermore, the differences between these values can be interpreted in a meaningful way. For instance, the height of a person refers to a continuous variable because the values can be ordered (170 cm, 171 cm, 172 cm, …), and differences between these values can be compared (the difference between 170 and 171 cm is the same as the difference between 171 and 172 cm). Sometimes, the continuous scale is divided further into subscales.

Interval scale: Only differences between values, but not ratios, can be interpreted.
An example for this scale would be temperature (measured in °C): the difference between −2 °C and 4 °C is 6 °C, but the ratio 4/(−2) = −2 has no meaningful interpretation; it does not mean that 4 °C is twice as warm as −2 °C.

Ratio scale: Both differences and ratios can be interpreted. An example is speed: 60 km/h is 40 km/h more than 20 km/h. Moreover, 60 km/h is three times as fast as 20 km/h because the ratio between them is 3.

Absolute scale: The absolute scale is the same as the ratio scale, with the exception that the values are measured in “natural” units. An example is “number of semesters studied”, where no artificial unit such as km/h or °C is needed: the values are simply 1, 2, 3, ...
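The difference between the interval and ratio scales described above can be checked with a little arithmetic. The Python sketch below re-expresses the lecture's temperature and speed values in other units; the choice of kelvin and m/s, and the helper function names, are assumptions added here for illustration and are not part of the lecture. The idea is that a meaningful quantity should not depend on the unit chosen: temperature differences survive the change of unit but temperature ratios do not, whereas speed ratios do.

```python
import math


def celsius_to_kelvin(temp_c: float) -> float:
    """Convert degrees Celsius to kelvin (shifts the zero point)."""
    return temp_c + 273.15


def kmh_to_ms(speed_kmh: float) -> float:
    """Convert km/h to m/s (rescales, but keeps the zero point)."""
    return speed_kmh / 3.6


# Interval scale: temperature in degrees Celsius.
t_low_c, t_high_c = -2.0, 4.0
diff_c = t_high_c - t_low_c                                        # 6 degrees
diff_k = celsius_to_kelvin(t_high_c) - celsius_to_kelvin(t_low_c)  # still 6
ratio_c = t_high_c / t_low_c                                       # -2.0
ratio_k = celsius_to_kelvin(t_high_c) / celsius_to_kelvin(t_low_c) # about 1.02

# Ratio scale: speed.
v_slow_kmh, v_fast_kmh = 20.0, 60.0
ratio_kmh = v_fast_kmh / v_slow_kmh                        # 3.0
ratio_ms = kmh_to_ms(v_fast_kmh) / kmh_to_ms(v_slow_kmh)   # still 3

print("Temperature difference unchanged:", math.isclose(diff_c, diff_k))
print("Temperature ratio unchanged:", math.isclose(ratio_c, ratio_k))
print("Speed ratio unchanged:", math.isclose(ratio_kmh, ratio_ms))
```

The temperature ratio changes because the Celsius zero point is arbitrary, which is exactly why only differences can be interpreted on an interval scale.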
