Statistics PDF
Document Details
Uploaded by TriumphalSerpent
Our Lady of Fatima University
Summary
This document provides an overview of statistics, including its branches like mathematical statistics, applied statistics, and biostatistics. It discusses various types of data and levels of measurement. The document also examines data processing, including stages like data collection, preparation, input, processing, and storage. It explains the importance of data editing to ensure data quality.
Full Transcript
STATISTICS
• Branch of mathematics that examines and investigates ways to process and analyze the data gathered (Sirug, 2015)
• Provides a procedure for data collection, presentation, organization, and interpretation that yields meaningful ideas useful to business decision-makers (Sirug, 2015)
• Science whereby inferences are made about a specific random phenomenon based on relatively limited sample material (Rosner, 2012)

Bio
o Life
Statistics
o Science dealing with the collection, organization, analysis, and interpretation of numerical data
Biostatistics
o Data gathered about life and the world
o A special branch of statistics that deals with quantitative and qualitative aspects of vital phenomena
o Application of statistical methods to the life sciences, such as biology, medicine, and public health

2 Main Areas:
Mathematical Statistics
• Concerns the development of new methods of statistical inference and requires detailed knowledge of abstract mathematics for its implementation
• Covers the theory part of the subject
Applied Statistics
• Involves applying the methods of mathematical statistics to specific subject areas
• Includes the design of experiments and surveys for data collection
Biostatistics
o Branch of applied statistics that applies statistical methods to medical and biological problems

Uses of Biostatistics
• Epidemiology: distribution and determinants of health
• Demography: size, structure, composition, and distribution of the human population
• Health Economics: whether the health care system is functioning well, and health-affecting behaviors
• Genetics: heredity
• Genomics: genes and their functions

PSA (Philippine Statistics Authority)
• The central statistical authority of the Philippine government. It collects, compiles, analyzes, and publishes statistical information on the economic, social, demographic, political, and general affairs of the people of the Philippines, and enforces the civil registration functions in the country

Division of Statistics:
Descriptive Statistics
• The totality of methods and treatments employed in collecting, describing, and analyzing numerical data
• Its purpose is to tell something about a particular group of observations
• Deals with a complete set of data
• Methods of summarizing and presenting data
• Computation of measures of central tendency and variability
• Dispersion and location
• Tabulation and graphical presentation
• Facilitates understanding, analysis, and interpretation of data
• Major Advantage: permits the researcher to describe the information in many scores with just a few indices

Inferential Statistics
• The logical process from sample analysis to a generalization or conclusion about a population
• Also called statistical inference or inductive statistics
• Techniques of Inferential Statistics:
  o Estimation - a process by which a statistic computed from a random sample is used to approximate the corresponding parameter in the population
    Point Estimate - a single (exact) value
    Interval Estimate - a range of values
  o Hypothesis Testing - the set of procedures that culminates in the rejection or non-rejection of the null hypothesis, based on the probability of occurrence of the sample if the null hypothesis were true

Relationship between population and sample:
Population
• Consists of all the members of the group about which one wants to draw a conclusion
• Also known as the universe
• Parameter - a numerical index describing a characteristic of a population
Sample
• A portion, or part, of the population of interest selected for analysis
• Subset of the population
• Statistic - a numerical index describing a characteristic of a sample (sample formulas subtract 1 from the sample size, i.e., use n - 1, to neutralize bias)

Sources of Data:
Primary Data
• Data that come from an original source and are intended to answer specific research questions; they can be gathered by interview, mail-in questionnaire, survey, or experimentation
• Obtained firsthand by the investigator
Secondary Data
• Data taken from previously recorded data, such as information in research conducted, industry financial statements, business periodicals, and government reports
• Can also be obtained electronically (via internet websites, compact disc)
Tertiary Data
• Data that come from primary and secondary data

Types of Data
Constant
• A characteristic of objects, people, or events that does not vary
• The value of the characteristic remains the same from person to person, time to time, or place to place
• Standard
• Ex. The temperature at which water boils (100 °C) is a constant
Variable
• A characteristic of objects, people, or events that can take on different values
• It can vary in quantity (weight of people) or quality (hair color of people)
• A phenomenon whose values cannot be predicted with certainty
Qualitative Variable
o Variable that is conceptualized and analyzed as distinct categories, with no continuum implied
o Also termed categorical variable; observations are put in the same or different classes, each class being considered as possessing some common characteristic that is not shared by those in other classes
o Non-quantifiable and used only as a label
o Examples: eye color, gender, occupation, religious preference
Quantitative Variable
o Variable that is conceptualized and analyzed along an implied continuum
o It differs in amount or degree
o Also termed numerical variable; variates that yield frequencies when counted give rise to discrete variables, and those that are measured yield metric or continuous variables
o Examples: height, weight, math aptitude, salary

Classification of Variables
Mathematical Classification
Continuous (quantitative) Variable
o A variable that can assume any of an infinite number of values and can be associated with points on a continuous line interval
o Includes fractions and decimals
o Examples: height, weight, volume
Discrete (quantitative) Variable
o A variable that consists of either a finite number of values or a countable number of values
o Can assume only integral (whole-number) values
o Examples: gender, courses, Olympic games
Experimental Classification
Independent Variable
o Variable controlled by the experimenter or researcher and expected to affect the behavior of the subjects
o Also called the explanatory variable
Dependent Variable
o Some measure of the behavior of subjects, expected to be influenced by the independent variable
o Also called the outcome variable

4 Levels (scale) of Measurement
Nominal
• Mutually exclusive and exhaustive
• Used to differentiate classes or categories purely for classification or identification purposes
• The weakest form of measurement, because no attempt can be made to account for differences within a particular category or to specify any ordering or direction across the various categories
• Nominal data are discrete variables
• Mutually exclusive is a property of a set of categories such that an individual or object is included in only one category
• Exhaustive is a property of a set of categories such that an individual or object must appear in a category
• The scale is used as a label only; always qualitative
Ordinal
• Used in ranking
• A somewhat stronger form of measurement, because an observed value classified into one category is said to possess more of a scaled property than an observed value classified into another category
• Nevertheless, no attempt is made to account for differences between the classified values within a particular category
• Ordinal scaling is still a weak form of measurement, because no meaningful numerical statements can be made about differences between the categories
• That is, the ordering implies only which category is "greater" or "lesser", not how much "greater" or "lesser"
• Ordinal data are discrete variables
Interval
• Used to classify, order, and differentiate between classes or categories in terms of degrees of difference
• Interval data are either discrete or continuous variables
• Similar to ordinal, except that the exact distances between all adjacent categories are equal; the zero point, however, is arbitrary or meaningless
• Does not have a true-zero starting point; always quantitative
Ratio
• Differs from interval measurement in only one aspect: it has a true zero point (complete absence of the attribute being measured)
• With an absolute zero point, it can be said that one observation is "twice as fast" or "half as long" as another
• Ratio data are either discrete or continuous variables
Examples:
• Hospital bed capacity - Quantitative, discrete, ratio
• Educational attainment - Qualitative, ordinal
• Mid-upper arm circumference - Quantitative, continuous, ratio
• Forced expiratory volume - Quantitative, continuous, ratio
• Region of residence - Qualitative, nominal

DATA PROCESSING
• Systematic procedure to ensure that the information/data gathered are complete, consistent, and suitable for analysis

Stages of Data Processing
Data Collection
• Collecting data is the first step in data processing
• The data sources must be trustworthy so that the data collected are of the highest possible quality
Data Preparation
• Once the data are collected, they enter the data preparation stage
• Often referred to as "pre-processing"; this is where raw data are cleaned up and organized for the following stage of data processing
• The purpose of this stage is to eliminate incomplete or incorrect data and begin to create high-quality data
Data Input
• The clean data are then entered
• Data input is the first stage in which raw data begin to take the form of usable information
Data Processing
• During this stage, the data inputted are processed for interpretation
Data Output/Interpretation
• In this stage, the data are finally usable by non-data scientists
• The output is translated, readable, and often in the form of graphs, videos, images, or plain text
Data Storage and Report Writing
• The final stage of data processing; the data are stored for future use

Data Coding
• Conversion of verbal/written information into numbers that can be more easily encoded, counted, and tabulated
Field Code
o Actual value or information given by the respondent
Bracket Code
o Recorded as a range of values rather than actual values
Factual Code
o Codes are assigned to a list of categories of a given variable
Pattern Code
o Applicable for questions with multiple responses

Coding Manual
• A document that contains a record of all codes assigned to the responses to all questions in the data collection forms
• Minimum information that must be included in a coding manual:
  o Variable name
  o Variable description
  o Coding instructions

Data Encoding
• Entering the data/responses in a spreadsheet or similar software (MS Excel, MS Access, Epi Info)

Data Editing
• Inspection and correction of any errors or inconsistencies in the information collected
• Done during data collection, during encoding, and before data analysis
Field Editing
o Reviewing the accomplished data collection forms
o Decoding of abbreviations or special symbols
o Making callbacks/messages for verification/clarification of incomplete answers
Central Editing
o Checking inconsistencies and incorrect entries after receiving the questionnaire from the field
o Checking of encoded data

Importance of Data Editing
• Make corrections as early as possible
• Reduce non-response or incomplete answers
• Eliminate inconsistencies and incorrect information
• Make the entries clear, legible, and comprehensive
• Prepare data for analysis

What to check in Editing Data
• Check for duplicate entries
• Check whether the total count for each variable matches the sample size
• For qualitative data, check whether the categories are consistent with what is specified in the coding manual
• For quantitative data, check whether the minimum and maximum are logical given the possible values of the variable
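
The four editing checks listed above can be sketched in Python. This is only a minimal illustration, not a prescribed procedure: the record layout, the variable names (id, sex, age), the coding manual entries, and the allowed age range are all hypothetical assumptions made for the example.

```python
from collections import Counter

# Hypothetical coding manual entries (illustrative assumptions only):
# allowed codes for a qualitative variable and a plausible range for a
# quantitative variable.
SEX_CODES = {1: "Male", 2: "Female"}
AGE_RANGE = (0, 120)


def edit_checks(records, sample_size):
    """Run the four editing checks and return a dict of problems found."""
    problems = {}

    # 1. Duplicate entries: respondent IDs that appear more than once.
    id_counts = Counter(r["id"] for r in records)
    problems["duplicate_ids"] = sorted(i for i, n in id_counts.items() if n > 1)

    # 2. Totals: the number of non-missing values of each variable
    #    should equal the sample size.
    problems["count_mismatch"] = {}
    for var in ("sex", "age"):
        n = sum(1 for r in records if r.get(var) is not None)
        if n != sample_size:
            problems["count_mismatch"][var] = n

    # 3. Qualitative data: every code must appear in the coding manual.
    problems["invalid_sex_codes"] = sorted(
        {r["sex"] for r in records
         if r.get("sex") is not None and r["sex"] not in SEX_CODES}
    )

    # 4. Quantitative data: minimum and maximum must be logically possible.
    ages = [r["age"] for r in records if r.get("age") is not None]
    low, high = AGE_RANGE
    problems["age_out_of_range"] = bool(ages) and (min(ages) < low or max(ages) > high)

    return problems


# Made-up encoded records, used only to exercise the checks.
records = [
    {"id": 1, "sex": 1, "age": 34},
    {"id": 2, "sex": 2, "age": 29},
    {"id": 2, "sex": 2, "age": 29},    # duplicate entry
    {"id": 3, "sex": 9, "age": 430},   # code not in manual; impossible age
    {"id": 4, "sex": 1, "age": None},  # missing age
]
report = edit_checks(records, sample_size=5)
print(report)
```

Each key in the returned dictionary corresponds to one bullet in the checklist, so an empty list, an empty dictionary, or False means that check found nothing to correct. In practice the same checks are often run with the filter and summary tools of the encoding software itself.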