Introduction to Statistics PDF - UKM
Universiti Kebangsaan Malaysia
Chong Wei Wen, PhD
Summary
These presentation slides provide an introduction to statistics, covering topics such as types of statistics, data types, variables, descriptive statistics, inferential statistics and more. The presentation is geared toward an undergraduate-level audience in the Faculty of Pharmacy at UKM (Universiti Kebangsaan Malaysia).
Full Transcript
INTRODUCTION TO STATISTICS
NFNF3632 Statistics and Research Methodology
CHONG WEI WEN, PhD
FACULTY OF PHARMACY
UNIVERSITI KEBANGSAAN MALAYSIA

OUTLINE
❑ What is Statistics?
❑ Why Study Statistics?
❑ Basic Terms of Statistics
❑ Types of Statistics
❑ Types of Data and Variables
❑ Scales of Measurement

LEARNING OBJECTIVES
After this session, you will be able to:
▪ Define statistics
▪ State the use of statistics
▪ Explain the basic terms of statistics
▪ Identify types of data and levels of measurement

What is Statistics?
1. Statistics refers to numerical facts.
Examples:
▪ Malaysia’s cumulative Covid-19 cases now stand at 4,079,242. The nation’s total cases breached the four million mark on March 21 (4,010,952 cases). (The Star, 25 March 2022)
▪ The smartphone is the most popular device for accessing the Internet (89.3%), while smartphone ownership among Internet users rose from 74.3% in 2014 to 90.7% in 2015. (Internet Users Survey 2016)

What is Statistics?
2. Statistics is the science of collecting, organizing, summarizing, and analyzing data, as well as of making decisions based on such analyses.

Biostatistics
▪ Biostatistics is the application of statistics in medicine and other health-related disciplines.
Example: The Framingham Study, a longitudinal study conducted among residents of Framingham, Massachusetts, to identify the factors that contribute to cardiovascular disease (www.nhlbi.gov/aboutframingham/)

Why Study Statistics?
▪ To collect, calculate, and interpret healthcare data appropriately
▪ To evaluate medical literature
▪ To interpret information on drugs and equipment
▪ To understand epidemiological problems, e.g. to reveal the prevalence and risk factors of a disease
▪ To participate in research projects

Basic Terms
Population Versus Sample
▪ Population/ Target Population
A population consists of all elements (individuals, items, or objects) whose characteristics are being studied.
The population that is being studied is called the target population.
▪ Sample
A portion of the population selected for study

Basic Terms
▪ Sampling
A procedure for selecting an adequate number of elements from a population, with the objective of generalizing the characteristics of the sample to the population sampled
▪ Representative Sample
A sample that represents the characteristics of the population as closely as possible
▪ Random Sample
A sample drawn in such a way that each element of the population has a chance of being selected

Basic Terms
▪ Element/ Member
A specific subject or object included in a sample or population
▪ Variable
A characteristic under study that assumes different values for different elements
▪ Observation/ Measurement
The value of a variable for an element
▪ Data/ Data Set
A collection of observations or measurements on one or more variables

Example
Table 1 Life expectancy (years) for 2011 of selected countries

Country          Life Expectancy (years)
Australia        81.81
Canada           81.38
France           81.19
Morocco          75.90
Poland           76.05
Sri Lanka        75.73
United States    78.37

Source: CIA World Factbook

Basic Terms
▪ Parameter
A numerical value that describes the entire population. A parameter is usually derived from measurements of the elements in the population.
▪ Statistic
A numerical value that describes a sample. A statistic is usually derived from measurements of the individuals in the sample.

Types of Statistics
Statistical methods can be classified into two broad categories:
a. Descriptive Statistics
b. Inferential Statistics

Descriptive Statistics
▪ Descriptive statistics consists of methods for organizing, displaying, and describing data by using tables, graphs, and summary measures.

Inferential Statistics
▪ Inferential statistics consists of methods that use sample data to help make decisions or predictions about a population.
▪ Example: We may want to find the starting salary of a typical college graduate.
To do so, we may select 2000 recent college graduates, find their starting salaries, and make a decision based on this information.

TYPES OF DATA
o Qualitative vs quantitative variables
o Discrete vs continuous variables
o Cross-section vs time series
o Scales of measurement

Variables & Data
▪ Variable
A characteristic under study that assumes different values for different elements
▪ Data/ Data Set
A collection of observations or measurements on one or more variables

Qualitative versus Quantitative Variable

Qualitative or Categorical Variables
Variables that cannot be measured numerically but can be divided into different categories are called qualitative or categorical variables. The data collected on such variables are called qualitative data.
Examples: Gender, marital status, education level, race, type of occupation

Quantitative Variables
A variable that can be measured numerically is called a quantitative variable.
Examples: Height, weight, blood pressure, serum cholesterol, heart rate
Quantitative variables may be classified as either discrete variables or continuous variables.

Discrete Variable
A variable whose values are countable is called a discrete variable. A discrete variable can assume only certain values, with no intermediate values.
Examples: Number of students in a class, number of cars in a parking lot, number of children in a family

Continuous Variable
A variable that can assume any numerical value over a certain interval or intervals is called a continuous variable.
Examples: Time taken to complete an exam, height, weight

Cross-Section versus Time-Series Data
Based on the time over which they are collected, data can be classified as either cross-section or time-series data.
Cross-section data contain information on different elements of a population or sample for the same period of time.
Time-series data contain information on the same element at different points in time.
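The distinction between cross-section and time-series data can be sketched in Python. This is only an illustration: the values are taken from Table 1 (life expectancy, 2011) and Table 1.2 (wildlife collisions) in these slides, while the variable names and the helper functions `highest_element` and `change_over_time` are hypothetical and are not part of the lecture.

```python
# Cross-section data: different elements (countries), same period (2011).
# Values from Table 1, life expectancy in years.
life_expectancy_2011 = {
    "Australia": 81.81,
    "Canada": 81.38,
    "France": 81.19,
    "Morocco": 75.90,
    "Poland": 76.05,
    "Sri Lanka": 75.73,
    "United States": 78.37,
}

# Time-series data: the same element observed at different points in time.
# Values from Table 1.2, collisions between wildlife and civilian aircraft.
collisions_by_year = {1990: 1000, 1995: 2000, 2000: 5000, 2003: 6000}


def highest_element(cross_section):
    """Hypothetical helper: compare elements within one period."""
    return max(cross_section, key=cross_section.get)


def change_over_time(time_series):
    """Hypothetical helper: compare one element across time (last minus first)."""
    first_period = min(time_series)
    last_period = max(time_series)
    return time_series[last_period] - time_series[first_period]


print(highest_element(life_expectancy_2011))  # Australia
print(change_over_time(collisions_by_year))   # 5000
```

In the cross-section dictionary the keys are elements, so the natural questions compare countries with each other. In the time-series dictionary the keys are points in time, so the natural questions concern change over the years.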
Example of Cross-Section Data
Table 1 (life expectancy for 2011 of selected countries) is cross-section data: it covers different countries for the same year.

Example of Time-Series Data
Table 1.2 Number of collisions between wildlife and civilian aircraft

Year    Number of collisions
1990    1000
1995    2000
2000    5000
2003    6000

Scales of Measurement
Measurement is the assignment of values or observations to outcomes.
4 different scales of measurement:
o Nominal
o Ordinal
o Interval
o Ratio

Nominal
Characterized by data that consist of names, labels, or categories only.
Measurements on a nominal scale label and categorize observations, but do not make any quantitative distinctions between observations.
Examples:
Gender: Male/ Female
Employment status: Unemployed/ Employed/ Retired

Ordinal
Data at the ordinal level of measurement can be arranged in some order.
However, differences between data values either cannot be determined or are meaningless.
Examples:
Socioeconomic class: Upper, middle, lower
Ranks, e.g. surveys in which respondents are asked to rate a characteristic on a scale of 1 to 5

Interval
Represents values or observations that can be measured on an evenly distributed scale.
Interval data include units of equal size.
However, data at this level do not have a natural zero starting point.
Example: Temperature measured in Fahrenheit or Celsius

Ratio
Similar to interval data in that the intervals between units are of equal size.
However, there is a natural zero starting point (zero indicates absence of the trait being measured).
Examples: Height, weight

Measurement Levels
Quantitative Data
▪ Ratio Data: differences between measurements, true zero exists
▪ Interval Data: differences between measurements but no true zero
Qualitative Data
▪ Ordinal Data: ordered categories (rankings, order, or scaling)
▪ Nominal Data: categories (no ordering or direction)

Thank you