Statistics Introduction L1 PDF
Summary
This document introduces the fundamental concepts of statistics, including basic definitions and the types of variables (qualitative and quantitative). It also covers public health (community medicine), derived data types such as ratios, proportions, and rates, and the importance of sampling for drawing conclusions about populations.
Full Transcript
STATISTICS: INTRODUCTION AND DEFINITIONS

PUBLIC HEALTH (COMMUNITY MEDICINE)
Public health is the art and science of preventing disease, promoting health, and prolonging life through the organized efforts of society. It deals with the health of the whole population (the community). Public health professionals, with the aid of public health programmes, work to improve the health of the population.

Public health looks at the community (population) itself as a patient. It deals with the community as a social system and with the structure, function, and dysfunction of that system.

SCOPE OF COMMUNITY MEDICINE
Epidemiology of infectious and chronic diseases, statistics, school health, mental health, maternal and child health (MCH), environmental health, rural health, urban health, and occupational health.

CLINICIAN VERSUS COMMUNITY PHYSICIAN (GP, FP)
A clinician's target is to diagnose and treat the individual patient, while a community physician's target is to manage a health problem in a community (i.e. community diagnosis).

Investigations in community medicine follow the sequence: observation, hypothesis formulation, hypothesis testing, data collection and analysis, interpretation of results, and conclusions.

STATISTICS
A field of study concerned with methods and procedures for the collection, organization, classification, and summarization of data (descriptive statistics), and for the analysis of data and the drawing of inferences (analytic statistics) about a body of data when only part of it is observed.

BIOSTATISTICS
Statistics applied to data derived from the biological and medical sciences.

PURPOSES OF STATISTICS
1. Data reduction (summarization), which facilitates interpretation.
2. Analysis: to judge whether the effect of a certain event is real or arises from chance fluctuation due to error in the sample of subjects.
3. Sampling and generalization.

DATA
Data are the raw material of statistics (they may be numbers or not). They comprise observations on one or more variables.
VARIABLE
A characteristic that takes different values in different persons, places, or things.

I. QUANTITATIVE VARIABLE
A variable that can be measured in the usual sense of measurement, such as age, weight, or height. There are two subtypes:
1. Discrete variable
2. Continuous variable

DISCRETE VARIABLE
Characterized by gaps or interruptions in the values it can assume (integer values), such as the number of admissions or the number of decayed teeth. For example, a person cannot have 2.5 decayed teeth; the count is either exactly 2 or exactly 3.

CONTINUOUS VARIABLE
Does not possess the gaps or interruptions characteristic of a discrete variable; it can take any value on an infinite continuum of possible values. Examples are weight, height, and mid-arm circumference.

II. QUALITATIVE VARIABLE
A variable that cannot be measured in the usual sense but can be described. In this case we count the number of individuals falling into each category (the frequency), as with socioeconomic status or diagnostic category.

1. NOMINAL VARIABLE
Uses names, numbers, or other symbols. Each measurement is assigned to one of a limited number of unordered categories and falls into only one category. For example, gender is recorded as either male or female.

2. ORDINAL VARIABLE
Each measurement is assigned to one of a limited number of categories that are ranked in a graded order. Differences among categories are not necessarily equal and are often not measurable. For example, the stage of a cancer may be stage 1, 2, or 3 according to the progression of the disease.

Types of variables: quantitative (discrete, continuous) and qualitative (nominal, ordinal).

DERIVED DATA
We may encounter a number of other types of data in the medical field. These include:
1. Ratios
2. Proportions (e.g. 0.3) and percentages (e.g. 30%)
3. Rates

RATIO
Used to compare two quantities. Usually the numerator is not a component of the denominator.
Commonly used ratios:
Ratio of female to male births
Maternal mortality ratio
Ratio of people with tuberculosis to those without tuberculosis
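As a small illustration of the ratio definition above, the sketch below computes a female-to-male birth ratio. The birth counts are made-up numbers chosen only for the example, and `ratio` is a hypothetical helper name, not something taken from the lecture.

```python
def ratio(numerator_count, denominator_count):
    """Compare two separate quantities; the numerator is not part of the denominator."""
    if denominator_count == 0:
        raise ValueError("denominator must be non-zero")
    return numerator_count / denominator_count


# Hypothetical counts for one year in one district (illustrative only).
female_births = 4800
male_births = 5000

female_to_male = ratio(female_births, male_births)
print(f"Female-to-male birth ratio: {female_to_male:.2f}")  # 0.96
```

The two counts describe separate groups, which is why this is a ratio: neither group is contained in the other.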
PROPORTION AND PERCENTAGE
A specific type of ratio in which the numerator is included in the denominator, usually presented as a percentage.
Calculation of proportion:
Proportion = (Number of infants who are immunized in Dywaniya) / (Total eligible infants for immunization in Dywaniya) = 55,000 / 75,000 = 0.733 = 73.3%

RATE
A special form of proportion that includes a specification of time. It is the measure most commonly used in epidemiology because it most clearly expresses the probability or risk of disease or other events in a defined population over a specified period of time.

POPULATION
Population of entities: the largest collection of entities that have common characteristics and in which we have an interest at a particular time.
Population of variables: the largest collection of values of a random variable in which we have an interest at a particular time.

SAMPLE
A part or subset of the population.
Sample of entities: a subset of a population of entities.
Sample of variables: a subset of a population of variables.

Statistical inference: a conclusion concerning a population of observations made on the basis of the results of studying a sample of these observations.

Sampling error: the difference between a sample measure and its corresponding population measure. Sampling error is not a mistake; it is an error that can be calculated and should be quantified.

Sampling frame: a numbered list of all the units composing the study population, i.e. every member of the population has a unique identification number.

Probability sampling: the results of studying the sample are generalizable to the underlying population from which the sample was drawn.

Non-probability sampling: the results of studying the sample cannot be generalized to the underlying population from which the sample was drawn.

In probability sampling, the sample is drawn from the population in such a way that every member of the population has the same probability (chance) of being included in the sample.
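The proportion calculation and the definition of sampling error given above can be tied together in a short sketch. It uses the immunization figures from the example (55,000 of 75,000 infants) as the population, then draws an equal-probability sample and compares the two proportions. The sample size of 500 and the fixed random seed are assumptions made purely for illustration.

```python
import random

# Figures from the immunization example: 55,000 immunized out of 75,000 eligible infants.
IMMUNIZED = 55_000
ELIGIBLE = 75_000

# Population measure: the proportion immunized in the whole population.
population_proportion = IMMUNIZED / ELIGIBLE
print(f"Population proportion: {population_proportion:.1%}")  # 73.3%

# Represent each infant as 1 (immunized) or 0 (not immunized).
population = [1] * IMMUNIZED + [0] * (ELIGIBLE - IMMUNIZED)

# Assumptions for illustration only: a sample of 500 infants and a fixed seed.
SAMPLE_SIZE = 500
rng = random.Random(1)

# Drawing without replacement gives every infant the same chance of selection.
sample = rng.sample(population, SAMPLE_SIZE)
sample_proportion = sum(sample) / SAMPLE_SIZE

# Sampling error: sample measure minus the corresponding population measure.
sampling_error = sample_proportion - population_proportion
print(f"Sample proportion: {sample_proportion:.1%}")
print(f"Sampling error: {sampling_error:+.3f}")
```

Changing the seed gives a different sample and therefore a different sampling error, which is the sense in which the error arises from chance rather than from a mistake.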
SIMPLE RANDOM SAMPLE
It requires:
Sample frame: a numbered list of all observations (or units) composing the population
Sample fraction: the ratio of the sample size to the total population
The sample is selected by the lottery method, by computer, or by a random number table.

SYSTEMATIC SAMPLE
Units or observations are chosen from the sample frame at a regular interval (every nth unit). To find the interval (the "system"), divide the population size by the required sample size. For example, if the population is composed of 1200 units and we want to select a sample of 100, the interval is 1200/100 = 12, so we choose every 12th unit. The starting point can be chosen at random, for example 2, so the selected units will be 2, 14, 26, …

STRATIFIED SAMPLE
A simple random sample and a systematic sample cannot ensure that the structure of the sample will be similar to that of the underlying population with regard to certain characteristics, such as age or sex. In stratified sampling, the sample frame is divided into strata (groups) according to certain characteristic(s), and then simple random or systematic sampling is applied to each stratum. The number of units taken from each stratum can be determined by:
Equal allocation: the same number from each stratum
Proportional allocation: the number from each stratum is proportional to the size of the stratum; this is possible only if we know the proportion of the whole study population that falls in each stratum

CLUSTER SAMPLE
The selection here is of groups of units rather than individual units. A sample frame of groups of study units (clusters) should be available; a random sample of these clusters is then chosen. Clusters could be schools, clinics, hospitals, villages, or factories.

MULTISTAGE SAMPLE
This procedure is carried out in phases (stages). It can involve more than one of the above sampling methods. It is used for very large populations and when a sample frame is not available for the whole population.

NON-PROBABILITY SAMPLING
Convenience sample: The study units that happen to be present at the time of data collection are included in the sample.
A convenience sample is not representative of the population we want to study.

Quota sample: The composition of the sample with regard to certain characteristics, and the number of units having these characteristics, are decided from the beginning; the only requirement is to find people fitting these quotas.

The results of a non-probability sample are valid for the studied sample but are not generalizable to the underlying population. Non-probability samples are therefore inappropriate if the aim is to generalize findings obtained from the sample to the total study population.

ADVANTAGES OF SAMPLING
Sampling may be the only feasible method of collecting the information
Sampling reduces the demand on resources (finance, personnel, materials)
Results are obtained more quickly
Sampling may lead to better accuracy of the collected data
Precise allowance can be made for sampling errors

DISADVANTAGES OF SAMPLING
There is always sampling error
Sampling may create a feeling of discrimination within the population
Sampling may be inadvisable when every unit in the population is legally required to have a record
For rare events, small samples may not yield sufficient cases for study
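As a worked illustration of the systematic sampling example given earlier (1200 units, a sample of 100, starting point 2), the sketch below lists the selected unit numbers. The function name `systematic_sample` and its parameters are hypothetical, and using integer division for the interval when the population size is not an exact multiple of the sample size is an assumption of this sketch, not something stated in the lecture.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the 1-based unit numbers chosen at a fixed interval from a numbered frame."""
    if sample_size <= 0 or sample_size > population_size:
        raise ValueError("sample_size must be between 1 and population_size")
    # Sampling interval: population size divided by the required sample size
    # (integer division is an assumption for sizes that do not divide evenly).
    interval = population_size // sample_size
    if start is None:
        # Random starting point between 1 and the interval.
        start = random.Random(seed).randint(1, interval)
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the interval")
    # Take every interval-th unit, beginning at the starting point.
    return [start + i * interval for i in range(sample_size)]


units = systematic_sample(1200, 100, start=2)
print(units[:3])   # [2, 14, 26]
print(len(units))  # 100
```

Restricting the starting point to the first interval keeps every selected unit number within the frame.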