Introduction to Biostatistics - Units of Statistical Measurement PDF
Federal University of Technology Akure
Dr Stephen Fagbemi
Summary
This document provides an introduction to biostatistics, focusing on the nature of statistical data and different units of measurement. It details various types of variables and scales, such as nominal, ordinal, interval, and ratio. The document also describes the sources of data used in biostatistical studies, including routine collections and ad-hoc collections.
Full Transcript
Nature of statistical data; Unit of statistical measurement
-----------------------------------------------------------

### Dr Stephen Fagbemi

Learning Objectives
===================

After this class, you should be able to:

- Define statistics and biostatistics.
- Explain the concept of variables and data.
- Define the basic concept of measurement.
- Compare and contrast scales of measurement, with examples and the operations that can be done on each.
- Identify sources of data.
- Describe measurement in public health.
- Solve problems on scales of measurement.

Outline
=======

- Introduction
- Biostatistics & uses
- Concept of variables and data
- Basic concept of measurement; scales of measurement
- Sources of data
- Measurement in Public Health

Introduction
============

- Statistics and Biostatistics:
  - Statistics is the study and use of theory and methods for the analysis of data arising from random processes or phenomena.
  - It is the study of how we make sense of data.
  - It provides the most fundamental tools of the scientific method:
    - Forming hypotheses
    - Designing studies (experimental or observational)
    - Gathering data
    - Summarising data
    - Drawing inferences from the data (e.g., hypothesis testing)
- Statistics is divided into two branches:
  - Mathematical statistics: the study and development of statistical theory and methods in the abstract.
  - Applied statistics: the application of statistical methods to solve real problems involving randomly generated data, and the development of new statistical methodology motivated by those problems.
- Biostatistics & uses
  - It is the branch of applied statistics directed towards applications in the health and biological sciences.
  - It is the science which deals with the development and application of the most appropriate methods for the:
    - Collection of data.
    - Presentation of the collected data.
    - Analysis and interpretation of the results.
    - Making decisions on the basis of such analysis.
  - When different statistical methods are applied to biological, medical and public health data, they constitute the discipline of biostatistics.

Why Biostatistics and not statistics?
=====================================

- Because some statistical methods are more heavily used in the health sciences than elsewhere (e.g., survival analysis, longitudinal data analysis).
- Because examples are drawn from the health sciences, which makes the subject more appealing to those interested in health.
- Variables in statistics
  - Variables are characteristics that take different values in different persons, places and things; that is, the characteristics and properties we wish to observe among members of a group (sample), which differ from one member to another.
- Data
  - All the information regarding all the variables in the study is called data.
  - Observations of random variables made on the elements of a population, or
  - Quantities (numbers) and qualities (attributes) measured or observed that are to be collected and/or analysed.
  - "Data" is plural; "datum" is singular.
  - A collection of data is often called a data set.

Types of variables
------------------

### Qualitative variable

- It is a variable or characteristic which cannot be measured in quantitative form but can only be identified by name or category.
- Has values that are intrinsically non-numeric.
- Generally has either an ordinal or a nominal scale.
- Can be assigned numeric codes (male = 1, female = 0) but is still intrinsically qualitative.
- E.g.,
  - place of birth
  - ethnic group
  - type of drug
  - degree of pain (mild, moderate, severe)
  - stages of breast cancer (I, II, III, or IV)

### Quantitative variable

- A quantitative variable is one that can be measured and expressed numerically.
- Has values that are intrinsically numeric.
- It can be either discrete or continuous.
- A discrete variable:
  - Has a set of possible values that is either finite or countably infinite.
  - There are gaps between its possible values.
  - Often takes integer (whole number) values, but some can take non-integer values.
  - E.g.,
    - Number of children
    - Number of pregnancies
    - Number of episodes of diarrhoea

![](media/image4.jpeg)

- A continuous variable is a measurement on a continuous scale.
  - Has a set of possible values including all values in an interval of the real line.
  - Examples are:
    - Weight
    - Height
    - Blood pressure
    - Age
    - BMI
- Although variables can be broadly divided into categorical (qualitative) and quantitative types, the common practice is to recognise four basic types of data, described as scales of measurement or levels of measurement.

Basic concept of measurement
============================

- Measurement is the process of systematically assigning numbers to objects and their properties, to facilitate the use of mathematics in studying and describing objects and their relationships.
- Measurement is not limited to physical qualities such as height and weight.
- Tests of abstract constructs such as intelligence or scholastic aptitude are commonly used in education and psychology and are also forms of measurement.
- In statistics, the term measurement is used more broadly and is more appropriately termed scales of measurement.
- Scales of measurement refer to the ways in which variables/numbers are defined and categorized.
- Each scale of measurement has certain properties, which in turn determine which statistical analyses are appropriate.
- The four scales of measurement are nominal, ordinal, interval, and ratio.

Scales of measurement
---------------------

### Nominal data

- Represent categories or names.
- There is no implied order to the categories of nominal data.
- In these types of data, individuals are simply placed in the proper category.
- Each item must fit into exactly one category.
- E.g.,
  - Race
  - Eye colour
  - Gender
  - Marital status
  - Religious affiliation
  - Blood group (A, B, AB, O)
- Characteristic question: Is A different from B? Is male different from female?

![](media/image6.jpeg)

### Ordinal Data

- Have an order among the response classifications (categories).
- Intervals between the categories are not necessarily equal.
- For example: strongly agree, agree, no opinion, disagree, strongly disagree.
- In this situation, we only know that the data are ordered.
- E.g.,
  - Stages of disease (breast cancer I, II, III, IV)
  - Levels of pain (mild, moderate, severe)
  - Levels of satisfaction
- Characteristic question: Is A bigger than B? Is moderate bigger than mild?

![](media/image8.jpeg)

### Interval Data

- Measurements are expressed in numbers, but the starting (zero) point is arbitrary and depends largely on the units of measurement.
- In interval data the intervals between values are the same. For example, in the Fahrenheit temperature scale, the difference between 70 degrees and 71 degrees is the same as the difference between 32 and 33 degrees.
- The size of the interval between any two values depends on the unit of measurement.
- But the scale is not a ratio scale; in other words, 40 degrees Fahrenheit is not twice as hot as 20 degrees Fahrenheit.
- E.g.,
  - Temperature
  - Psychiatric diagnostic instrument (QOL brief)
  - SAT score
- Characteristic question: By how many units does A differ from B?

![](media/image10.jpeg)

### Ratio Data

- The data values in ratio data do have meaningful ratios; for example, age is a ratio variable.
- It has a true zero point, along with all the properties of the nominal, ordinal and interval scales.
- The ratio of any two measurements on the scale is physically meaningful.
- Most data analysis techniques that apply to ratio data also apply to interval data.
  Therefore, in most practical respects, these types of data (interval and ratio) are grouped together as metric data.
- E.g.,
  - Height (0 in = 0 cm)
  - Weight (0 lb = 0 kg)
  - Length
  - Distance
- Characteristic question: How many times bigger than B is A?

![](media/image12.jpeg)

Operations that can be done
---------------------------

  **Scale**   **Counting**
  ----------- --------------
  Nominal     \+
  Ordinal     \+
  Interval    \+
  Ratio       \+

![](media/image15.jpeg)

Sources of data
===============

- There are primarily two sources of data:
  I. Routine collections or regular sources
  II. Ad-hoc collections (surveys and experiments)
- Routine collections:
  - These are established systems for the continuous collection of health data.
  - Such systems include:
    - Census
    - Vital registration
    - Hospital records (inpatient and outpatient records)
    - Schools
    - Armed forces
    - Insurance
    - Migration
    - Disease notification systems (DSN)
- Ad-hoc collections:
  - Are special collections, usually occasioned by the inadequacy of the statistics derived from routinely collected data.
  - Inadequacies can result from the fact that the data required are incomplete, inaccurate or non-existent.
  - Usually aimed at a specific, time-limited study or task.
  - Codified according to the goals at hand and the wishes of the investigator.
  - E.g.,
    - Hospital In-Patient Enquiry
    - Social survey
    - Demographic and Health Surveys (DHS)
    - National HIV/AIDS and Reproductive Health Survey (NARHS)
    - Epidemiological surveys
  - Data collection may cover the entire population or part of it.

Other forms of classification of data
-------------------------------------

- **Primary data** are collected through surveys, meetings, focus group discussions (FGD), interviews or other methods that involve contact with the respondents.
- **Secondary data** are existing data that have been collected, or will be collected, for other purposes by other organizations, government agencies or research studies.
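
The four scales of measurement covered in these notes can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not part of the lecture: the names `Scale`, `QUESTIONS`, `answerable_questions` and `fahrenheit_to_celsius` are hypothetical, and the temperatures and weights are made-up values. It encodes the characteristic question of each scale, then checks numerically that Fahrenheit temperature (interval) keeps differences comparable but not ratios, while weight (ratio) keeps ratios under a change of units.

```python
import math
from enum import IntEnum


class Scale(IntEnum):
    """Scales of measurement, ordered from least to most informative."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Characteristic question for each scale, as listed in the notes.
QUESTIONS = {
    Scale.NOMINAL: "Is A different from B?",
    Scale.ORDINAL: "Is A bigger than B?",
    Scale.INTERVAL: "By how many units does A differ from B?",
    Scale.RATIO: "How many times bigger than B is A?",
}


def answerable_questions(scale):
    """Return the questions a scale can answer.

    Each scale has the properties of the scales below it, so it can
    answer their characteristic questions as well as its own.
    """
    return [q for s, q in QUESTIONS.items() if s <= scale]


def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9


# Interval scale: equal differences stay equal after changing units.
diff_a = fahrenheit_to_celsius(71) - fahrenheit_to_celsius(70)
diff_b = fahrenheit_to_celsius(33) - fahrenheit_to_celsius(32)

# ...but ratios do not survive the change of units (illustrative values).
ratio_f = 40 / 20
ratio_c = fahrenheit_to_celsius(40) / fahrenheit_to_celsius(20)

# Ratio scale: weight has a true zero, so the ratio is unit-independent.
LB_TO_KG = 0.45359237
ratio_lb = 10 / 5
ratio_kg = (10 * LB_TO_KG) / (5 * LB_TO_KG)

if __name__ == "__main__":
    print(answerable_questions(Scale.ORDINAL))
    print("equal differences preserved:", math.isclose(diff_a, diff_b))
    print("temperature ratio in F vs C:", ratio_f, round(ratio_c, 3))
    print("weight ratio in lb vs kg:", ratio_lb, round(ratio_kg, 3))
```

The temperature ratio changes (and even becomes negative) when the unit changes, which is why "twice as hot" is not a meaningful statement on an interval scale, whereas "twice as heavy" holds in both pounds and kilograms.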