University of Botswana Statistics Lecture Notes PDF
Summary
This document is a set of lecture notes from the University of Botswana, Department of Statistics, on Elementary Statistics (STA 111). It covers introductory concepts of statistics, including its history, definitions, types (descriptive and inferential), data presentation (sentence, table, graphical), applications, and potential misuse. The document aims to provide a foundation for understanding and applying statistical methods.
University of Botswana
Department of Statistics
STA111: Elementary Statistics

Chapter 1 Introductory Concepts of Statistics

1.1 What is Statistics? - History

Reference: https://www.emathzone.com/tutorials/basic-statistics/history-of-statistics.html

The term statistics originates from the Latin word "status", the Italian word "statista" or the German word "Statistik", all of which refer to the political state or government. In the past, statistics was used mainly by rulers. Its application was very limited, but rulers and kings needed information about the land, agriculture, commerce and populations of their states in order to assess their military potential, wealth, taxation and other aspects of government.

Brief history

The German scholar Gottfried Achenwall coined the term statistics to refer to the "collection, processing and use of data by the state". In the 19th century, the term began to be associated with the collection, organising, analysis and interpretation of data, and with making inferences from sample data about the general population.

1.2 Definition of Statistics

The word statistics has two definitions:
1) Statistics is a collection of numerical information (also called data). In its singular form, the word statistic means a number, numerical figure or value.
2) As a field of study, statistics refers to the science of collecting, organising, presenting, analysing and interpreting data to assist in making more effective decisions.
1.3 Presentation of Numerical Facts

Numerical facts can be presented in sentence form, table form or graphical form.

Examples:

(a) Sentence Form
In 1981 and 1991, the percentages of children aged 5-9 years in Botswana who were still at school were 42.7 and 49.0 percent respectively. This improved significantly to 64.7 percent in 2001.

(b) Table Form
Table 1: The percent of children in Botswana aged 5-9 years still at school by census year

Census Year    Percent still at School
1981           42.7
1991           49.0
2001           64.7

(c) Graphical Representation
Figure: The percent of children in Botswana aged 5-9 years still at school by census year

1.4 Application of Statistics

Statistical data and methods are required in all forms of human endeavour: social sciences, politics, sports, health, accounting, business, management, history, law, life sciences, etc. Knowing statistics is part of acquiring knowledge and achieving success.

1.5 Misuse of Statistics

Despite its strengths, statistics is one of the most misused subjects. Some examples of misuse are given in Darrell Huff's "How to Lie with Statistics". The causes of misuse arise at all stages of the statistical research process, such as:
◦ Inappropriate methods of collecting data (sampling)
◦ Use of unrepresentative subgroups in a study
◦ Inappropriate methods for summarising data
◦ Wrong choice of data analysis and inferential techniques
◦ The bias of the researcher
◦ Selective reporting and wrong (unjustified) conclusions

How to Avoid Misusing Statistics
◦ Learn and understand your elementary course, especially the concepts
◦ Consult a statistician before carrying out surveys and analysis

1.6 Types of Statistics

The field of statistics is divided into two broad categories: descriptive statistics and inferential statistics. Both are important, and each offers different methods or techniques that accomplish different objectives.

1.6.1 Descriptive Statistics

Descriptive statistics summarise data by describing, numerically or graphically, what is observed in the sample. Descriptive statistics tools include:
◦ 1. Graphical displays
◦ 2. Measures of central tendency and variability
   ◦ The mean and standard deviation suit continuous data (like heights or weights)
   ◦ Frequency and percentage are more useful for describing categorical data (like race)
◦ 3. Measures of shape of a distribution of data
◦ 4. Measures of position

1.6.2 Inferential Statistics

Inferential analyses go further: they use the observed data to make general statements about the parent population from which the sample data were collected. These inferences may take the form of:
◦ Answering 'yes/no' questions about the population (hypothesis testing)
◦ Estimating numerical characteristics of the population (estimation)

1.7 Basic Statistical Concepts

Unit: A member of a set of entities or objects being studied. Examples of a unit would be a single person, car, animal, plant, manufactured item or any object that belongs to a larger collection of such entities being studied. A unit is often referred to as an experimental unit, sampling unit or unit of observation. Data can be collected through measurement or observation of one or more characteristics of a unit.

Population: The entire collection of objects or individual units from which information is required.

Sample: A subset of the population.

Variable: A characteristic of a unit that can take more than one value, where each value is either a numerical measure or a category from a classification. Examples: shoe size, height, religion.

Constant: A characteristic that retains the same value from one unit to another.

Observation: The different values of a variable that one observes (or measures) are called observations.
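The descriptive tools in 1.6.1 and the concepts in 1.7 can be tied together in a short Python sketch. This is only an illustration under stated assumptions: the five records are invented, each dictionary stands for one unit, `height_cm` is treated as a continuous variable, `sex` as a categorical variable, and `country` as a constant. The function names are hypothetical helpers, not part of the course material.

```python
import statistics
from collections import Counter

# Hypothetical sample: each dictionary is one unit (a student).
# All values are invented for illustration only.
sample = [
    {"height_cm": 160.0, "sex": "F", "country": "Botswana"},
    {"height_cm": 172.0, "sex": "M", "country": "Botswana"},
    {"height_cm": 168.0, "sex": "F", "country": "Botswana"},
    {"height_cm": 180.0, "sex": "M", "country": "Botswana"},
    {"height_cm": 165.0, "sex": "F", "country": "Botswana"},
]


def describe_continuous(units, variable):
    """Return (mean, sample standard deviation) of a continuous variable."""
    observations = [unit[variable] for unit in units]
    return statistics.mean(observations), statistics.stdev(observations)


def describe_categorical(units, variable):
    """Return {category: (frequency, percentage)} for a categorical variable."""
    counts = Counter(unit[variable] for unit in units)
    total = sum(counts.values())
    return {cat: (n, 100.0 * n / total) for cat, n in counts.items()}


def is_constant(units, characteristic):
    """A characteristic is a constant if every unit has the same value."""
    return len({unit[characteristic] for unit in units}) == 1


mean_height, sd_height = describe_continuous(sample, "height_cm")
sex_table = describe_categorical(sample, "sex")

print(f"Mean height: {mean_height:.1f} cm, standard deviation: {sd_height:.2f} cm")
for category, (frequency, percent) in sex_table.items():
    print(f"sex={category}: frequency={frequency}, percentage={percent:.1f}%")
print("country is a constant:", is_constant(sample, "country"))
```

The mean and standard deviation are reported for the continuous variable, while the categorical variable is summarised by frequency and percentage, which mirrors the distinction drawn in 1.6.1.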
Raw Data: Numerical information in the original form in which it was collected, i.e., unprocessed numerical information.
Example: The votes garnered by candidates from the contesting parties in the Goodhope-Mabule constituency by-election of August 15, 2015, are
1) Umbrella for Democratic Change: 6152 votes
2) Botswana Democratic Party: 4372 votes
3) Botswana Congress Party: 385 votes

Parameter: A numerical value that summarises data for an entire population. That is, a parameter is a characteristic that describes an entire population.

Statistic: A numerical value that summarises data for the sample. That is, a statistic describes a sample.

Parameters and Statistics

Mean: the average of the data; sensitive to outlying data
Median: the middle of the data; not sensitive to outlying data
Mode: the most commonly occurring value
Range: the spread of the data
Interquartile (IQ) range: the spread of the data; commonly used for skewed data
Standard deviation: a single number which measures how much the observations vary around the mean
Data: observations of random variables made on the elements of a population or sample
Symmetrical data: data that follow a normal distribution (mean = median = mode); report the mean, standard deviation and n
Skewed data: data that are not normally distributed (mean ≠ median ≠ mode); report the median and IQ range

Examples of Statistics vs Parameters

1. Sample statistic: Proportion of 3000 randomly selected voters from Gaborone that support political party X
   Population parameter: Proportion of all voters in Gaborone that support political party X
2. Sample statistic: Median income of a random sample of 1000 public service workers in Botswana
   Population parameter: Median income of all public service workers in Botswana
3. Sample statistic: Average weight of a watermelon from Phiri farm in the Kgatleng District
   Population parameter: Average weight of all watermelons in the entire Kgatleng District
4. Sample statistic: Mean completion time of an assignment for a randomly selected group of 3000 UB students
   Population parameter: Mean completion time of an assignment for all UB students

Exercise

It is known that the average age of university students in the year 2002 was 23 years. Information is needed expeditiously on the average age of students in 2003, and a section of students from each faculty is selected. From each selected student, values of the following are recorded: age, sex, marital status. Identify the
i) population
ii) sample
iii) sampling unit
iv) variable(s)
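The summary measures listed under Parameters and Statistics, and the difference between a statistic and a parameter, can be sketched in Python as follows. This is a rough illustration under stated assumptions, not a prescribed method: all weights are invented, the simulated "population" of 5000 watermelon weights is hypothetical, `summarise` is a helper introduced here, and `statistics.quantiles` is used with its default method as one of several conventions for computing quartiles.

```python
import random
import statistics


def summarise(data):
    """Return the summary measures named in the notes for a list of numbers."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # the three quartile cut points
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "range": max(data) - min(data),
        "iq_range": q3 - q1,
        "std_dev": statistics.stdev(data),
    }


# Invented weights (kg). The second list replaces the largest value
# with an outlying one to show which measures react to it.
weights = [6, 7, 7, 8, 9, 10, 11]
weights_with_outlier = [6, 7, 7, 8, 9, 10, 30]

clean = summarise(weights)
with_outlier = summarise(weights_with_outlier)

for name in clean:
    print(f"{name:>8}: {clean[name]:6.2f} -> {with_outlier[name]:6.2f}")

# Statistic vs parameter on a hypothetical, simulated population.
random.seed(1)  # fixed seed so the sketch is repeatable
population = [random.gauss(8.0, 1.5) for _ in range(5000)]
sample = random.sample(population, 50)  # a subset of the population

parameter = statistics.mean(population)  # describes the whole population
statistic = statistics.mean(sample)      # describes only the sample

print(f"Population mean (parameter): {parameter:.2f} kg")
print(f"Sample mean (statistic):     {statistic:.2f} kg")
```

In the first comparison the median and IQ range are unchanged by the outlying value, while the mean, range and standard deviation all increase, which is why the notes suggest reporting the median and IQ range for skewed data. In the second, the sample mean is close to, but generally not equal to, the population mean.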