Biostatistics Lecture Notes PDF
Daniel, W., et al
Summary
This document provides an introduction to biostatistics, covering fundamental concepts such as data, variables, and measurement scales. It also briefly touches on sampling methods and statistical inference. This information is helpful for those in the health sciences field.
Full Transcript
BIO1109 Biostatistics

0.1. Some Basic Concepts

Data. The raw material of statistics is data. For our purposes we may define data as numbers. The two kinds of numbers that we use in statistics are numbers that result from taking a measurement, in the usual sense of the term, and those that result from the process of counting.

Statistics. Statistics is a field of study concerned with (1) the collection, organization, summarization, and analysis of data; and (2) the drawing of inferences about a body of data when only a part of the data is observed.

Daniel, W., et al. (2013). Biostatistics: A Foundation for Analysis in the Health Sciences (10th ed.).

Sources of Data. Data are usually available from one or more of the following sources:
1. Routinely kept records
2. Surveys
3. Experiments
4. External sources

Biostatistics. When the data analyzed are derived from the biological sciences and medicine, we use the term biostatistics to distinguish this application of statistical tools and concepts.

Variable. If, as we observe a characteristic, we find that it takes on different values in different persons, places, or things, we label the characteristic a variable.

Quantitative Variables. A quantitative variable is one that can be measured in the usual sense. Measurements made on quantitative variables convey information regarding amount.

Qualitative Variables. Many characteristics can only be categorized, as when a person is designated as belonging to an ethnic group, or a person, place, or object is said to possess or not to possess some characteristic of interest. Measurements made on qualitative variables convey information regarding attribute.

Random Variable. Whenever we determine the height, weight, or age of an individual, the result is frequently referred to as a value of the respective variable. When the values obtained arise as a result of chance factors, so that they cannot be exactly predicted in advance, the variable is called a random variable.

Discrete Random Variable. A discrete random variable is characterized by gaps or interruptions in the values that it can assume. These gaps or interruptions indicate the absence of values between particular values that the variable can assume.

Continuous Random Variable. A continuous random variable does not possess the gaps or interruptions characteristic of a discrete random variable. A continuous random variable can assume any value within a specified relevant interval of values assumed by the variable.

Population. A population of entities is the largest collection of entities in which we have an interest at a particular time. A population of values is the largest collection of values of a random variable in which we have an interest at a particular time.

Sample. A sample may be defined simply as a part of a population.

0.2. Measurement and Measurement Scales

Measurement. This may be defined as the assignment of numbers to objects or events according to a set of rules. The various measurement scales result from the fact that measurement may be carried out under different sets of rules.
Measurement scales are categorized as:
1. Nominal scale
2. Ordinal scale
3. Interval scale
4. Ratio scale

Nominal Scale. The lowest measurement scale is the nominal scale. As the name implies, it consists of "naming" observations or classifying them into various mutually exclusive and collectively exhaustive categories.

Ordinal Scale. Whenever observations are not only different from category to category but can be ranked according to some criterion, they are said to be measured on an ordinal scale.

Interval Scale. The interval scale is a more sophisticated scale than the nominal or ordinal in that with this scale not only is it possible to order measurements, but also the distance between any two measurements is known.

Ratio Scale. The highest level of measurement is the ratio scale. This scale is characterized by the fact that equality of ratios as well as equality of intervals may be determined. Fundamental to the ratio scale is a true zero point. The measurement of such familiar traits as height, weight, and length makes use of the ratio scale.

0.3. Sampling and Statistical Inference

Statistical Inference. Statistical inference is the procedure by which we reach a conclusion about a population on the basis of the information contained in a sample that has been drawn from that population.

Sampling. Sampling is the act, process, or technique of selecting a representative part of a population for the purpose of determining the characteristics of the whole population.
Sampling is subdivided into two groups: (1) probability sampling and (2) non-probability sampling.

Probability Sampling Methods. Probability sampling methods include simple random, systematic, stratified random, and clustered sampling methods.

Source: Methods of sampling from a population. Retrieved from https://www.healthknowledge.org.uk/public-health-textbook/research-methods/1a-epidemiology/methods-of-sampling-population

1. Simple random sampling. If a sample of size n is drawn from a population of size N in such a way that every possible sample of size n has the same chance of being selected, the sample is called a simple random sample. The process of drawing a sample that satisfies this definition is called simple random sampling.

2. Systematic sampling. Individuals are selected at regular intervals from the sampling frame. The intervals are chosen to ensure an adequate sample size. If you need a sample of size n from a population of size x, you should select every (x/n)th individual for the sample. For example, a sample of 100 from a population of 1,000 takes every 10th individual.
3. Stratified random sampling. In this method, the population is first divided into subgroups (or strata) whose members share a similar characteristic. It is used when we might reasonably expect the measurement of interest to vary between the different subgroups, and we want to ensure representation from all the subgroups.

4. Clustered sampling. In a clustered sample, subgroups of the population are used as the sampling unit, rather than individuals. The population is divided into subgroups, known as clusters, which are randomly selected to be included in the study.

Non-probability Sampling Methods. Non-probability sampling methods include convenience, quota, judgement (or purposive), and snowball sampling methods.

1. Convenience sampling. Convenience sampling is perhaps the easiest method of sampling, because participants are selected based on availability and willingness to take part.
Useful results can be obtained, but the results are prone to significant bias, because those who volunteer to take part may be different from those who choose not to (volunteer bias), and the sample may not be representative of other characteristics, such as age or sex.

2. Quota sampling. This method of sampling is often used by market researchers. Interviewers are given a quota of subjects of a specified type to attempt to recruit.

3. Judgement (or purposive) sampling. Also known as selective, or subjective, sampling, this technique relies on the judgement of the researcher when choosing who to ask to participate. Researchers may thus implicitly choose a "representative" sample to suit their needs, or specifically approach individuals with certain characteristics. This approach is often used by the media when canvassing the public for opinions and in qualitative research.

4. Snowball sampling. This method is commonly used in social sciences when investigating hard-to-reach groups. Existing subjects are asked to nominate further subjects known to them, so the sample increases in size like a rolling snowball.

Bias in Sampling. There are five important potential sources of bias that should be considered when selecting a sample, irrespective of the method used. Sampling bias may be introduced when:
1. Any pre-agreed sampling rules are deviated from
2. People in hard-to-reach groups are omitted
3. Selected individuals are replaced with others, for example if they are difficult to contact
4. There are low response rates
5. An out-of-date list is used as the sample frame (for example, if it excludes people who have recently moved to an area)

0.4. The Scientific Method and the Design of Experiments

Scientific Method. The scientific method is a process by which scientific information is collected, analyzed, and reported in order to produce unbiased and replicable results in an effort to provide an accurate representation of observable phenomena.

Step 1. Making an Observation. An observation is made of a phenomenon or a group of phenomena. This observation leads to the formulation of questions or uncertainties that can be answered in a scientifically rigorous way.

Step 2. Formulating a Hypothesis. In the second step of the scientific method a hypothesis is formulated to explain the observation and to make quantitative predictions of new observations. Often hypotheses are generated as a result of extensive background research and literature reviews. The objective is to produce hypotheses that are scientifically sound. Hypotheses may be stated as either research hypotheses or statistical hypotheses.

Step 3. Designing an Experiment. The third step of the scientific method involves designing an experiment that will yield the data necessary to validly test an appropriate statistical hypothesis. This step of the scientific method, like that of data analysis, requires the expertise of a statistician. Improperly designed experiments are the leading cause of invalid results and unjustified conclusions. Further, most studies that are challenged by experts are challenged on the basis of the appropriateness or inappropriateness of the study's research design.
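
Illustration (not part of the original slides). Because the choice of sampling method is part of designing a study, the four probability sampling methods of Section 0.3 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a prescribed procedure: the function names, the sampling frame of 1,000 numbered individuals, the stratum and cluster labels, and the fixed seed are all hypothetical, and the systematic sampler assumes the sample size is no larger than the population.

```python
import random


def simple_random_sample(frame, n, rng):
    """Draw n units so that every possible sample of size n is equally likely."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Select every k-th unit after a random start, where k = N // n."""
    k = len(frame) // n        # sampling interval (x/n in the notes)
    start = rng.randrange(k)   # random starting point inside the first interval
    return [frame[start + i * k] for i in range(n)]


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of equal size from every stratum."""
    return {name: rng.sample(units, n_per_stratum)
            for name, units in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


# Hypothetical sampling frame: individuals numbered 1 to 1000.
frame = list(range(1, 1001))
rng = random.Random(2013)  # fixed seed only so the sketch is repeatable

srs = simple_random_sample(frame, 100, rng)
systematic = systematic_sample(frame, 100, rng)  # interval k = 1000 // 100 = 10
stratified = stratified_sample(
    {"stratum_a": frame[:600], "stratum_b": frame[600:]}, 50, rng)
clustered = cluster_sample(
    {"cluster_1": frame[:250], "cluster_2": frame[250:500],
     "cluster_3": frame[500:750], "cluster_4": frame[750:]}, 2, rng)
```

In the stratified sketch every stratum contributes the same number of units, which guarantees representation from each subgroup; allocating in proportion to stratum size is an equally common choice. In the clustered sketch the random draw is over clusters, so individuals enter the sample only through their cluster.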