Dent 1007 Biostatistics Lecture Notes PDF

Bahçeşehir University

Dr. Ayşe Sena Kabaş Sarp

Tags

biostatistics frequency distributions graphs data analysis

Summary

These lecture notes cover biostatistics, focusing on frequency distributions, graphs (bar graphs, pie charts, histograms, stem-and-leaf plots, line graphs), and clinical trials. The notes are from Bahçeşehir University's Dentistry Faculty.

Full Transcript

DENT 1007 BIOSTATISTICS
Dr. Ayşe Sena Sarp
Bahçeşehir University Dentistry Faculty
Semester 1, Week 3

FREQUENCY DISTRIBUTIONS

When population data are available, there are no uncertainties regarding the characteristics of the population; all statistical questions concerning the population are directly answered by observation or calculation. However, the data at hand typically represent a sample of measurements taken from a population of interest: sample data, not population data. The first step in summarizing data is to organize them in some meaningful fashion.

FREQUENCY TABLES

The most convenient and commonly used method is a frequency distribution, in which raw data are organized in table form by class and frequency.

For nominal and ordinal data, a frequency distribution consists of categories and the number of observations that correspond to each category.

Example 1: Table F1 displays a set of nominal data on prosthodontic services provided at a large dental clinic during the period 1991–1998: the number of gold crowns and metal-ceramic crowns provided.

Example 2: A survey was taken to assess job satisfaction in dental hygiene. Table F2 presents a set of ordinal data of 179 responses to one of the questions in the survey questionnaire: “If you were to increase appointment length, could you provide better quality care for your patients?” There are five choices for the individual’s response:

i. Strongly disagree
ii. Disagree
iii. Neutral
iv. Agree
v. Strongly agree

Since there are five choices, a typical frequency distribution would have five categories, as shown in Table F2. A frequency distribution for ordinal data need not keep all of the categories. Sometimes researchers prefer to combine two adjacent categories.
For example, combining “strongly disagree” with “disagree” and “agree” with “strongly agree” gives three categories: disagree (67 individuals), neutral (49 individuals), and agree (63 individuals).

Example 3: A study was conducted to evaluate the effect of disinfection of dentinal tubules by intracanal laser irradiation using an in vitro model. It has been speculated that a possible cause of root canal failure is the persistence of bacteria that have colonized dentinal tubules. To reduce this risk and time-consuming endodontic therapy, new equipment and materials are constantly being introduced.

The data represent the count of bacterial (Enterococcus faecalis) colonies found in the samples after they had been treated by the neodymium:yttrium-aluminum-garnet (Nd:YAG) laser. Table F3 is a simple display of the raw data. Rearranging the data in ascending order, as in Table F4, enables us to learn more about the count of the bacterial colonies. It is easy to see from Table F4 that the smallest count is 120 and the largest count is 588. Several counts are tied; for example, five samples have the same count of 304 bacterial colonies.

To present raw data, discrete or continuous, in the form of a frequency distribution, we must divide the range of the measurements in the data into a number of non-overlapping intervals (or classes). The intervals need not have the same width, but typically they are constructed to have equal width.
This will make it easier to make comparisons among different classes. If one class has a larger width, we may get a distorted view of the data. When summarizing the data, having too many intervals is not much of an improvement over the raw data. If we have too few intervals, a great deal of information will be lost.

How many intervals should we have?

Some authors suggest that there should be 10–20 intervals. A set of data containing a small number of measurements should have only a few intervals, whereas a set of data containing thousands of measurements over a wide range of values may need more. The number of observations in the data and the range of values influence how many intervals there should be and how wide they should be.

In general, the number of intervals should be approximately equal to the square root of the number of observations. Let n denote the total number of measurements or data points.

Number of intervals = √n

For the bacterial colony data in Table F4, n = 90 and √90 ≅ 9.49 (the symbol “≅” means approximately equal), so we will need about 9 or 10 intervals to construct a frequency distribution.

Once the number of intervals has been selected, the interval width can be determined by dividing the range by the number of intervals.

Width of the interval = Range of data ÷ Number of intervals

Constructing a frequency distribution:

1. Select the number of non-overlapping intervals.
2. Select a starting point for the lowest class limit. This can be the smallest value in the data or any convenient number less than the smallest observed value.
3. Determine the upper and lower limits for each interval.
4. Count the number of observations in the data that fall within each interval.
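The four construction steps above can be sketched in Python. This is a minimal illustration, not the procedure used to build the lecture's tables: the function name `build_frequency_table` is hypothetical, the sample counts are invented (only the smallest value 120 and the largest value 588 come from the notes), and rounding the width up to a whole number is an assumption made so that the classes cover the full range.

```python
import math

def build_frequency_table(data, start=None):
    """Return (lower, upper, count) rows for equal-width classes."""
    n = len(data)
    # Step 1: number of intervals, approximately the square root of n.
    k = max(1, round(math.sqrt(n)))
    # Step 2: starting point = smallest value, or a convenient smaller number.
    lower = min(data) if start is None else start
    if lower > min(data):
        raise ValueError("start must not exceed the smallest observation")
    # Width = range / number of intervals, rounded up (assumption) so the
    # k classes together cover every observation.
    width = math.ceil((max(data) - lower) / k) or 1
    # Steps 3 and 4: class i covers [lower + i*width, lower + (i+1)*width);
    # the last class also includes its upper limit.
    counts = [0] * k
    for x in data:
        idx = min(int((x - lower) // width), k - 1)
        counts[idx] += 1
    return [(lower + i * width, lower + (i + 1) * width, counts[i])
            for i in range(k)]

# Hypothetical colony counts, for illustration only.
colony_counts = [120, 135, 160, 178, 210, 225, 240, 262,
                 290, 304, 304, 330, 365, 410, 470, 588]

table = build_frequency_table(colony_counts)
for low, high, count in table:
    print(f"{low:>4} to {high:<4} {count}")
```

With 16 observations the sketch uses 4 classes of width 117. Passing a convenient `start` such as 100 shifts the class limits without changing the method.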
For the bacterial colony data in Table F4:

Number of intervals = √n = √90 ≅ 9.49 (about 10)
Width of the interval = Range of data ÷ Number of intervals = (588 - 120) ÷ 9.49 ≅ 50

Table F5: Frequency table for bacterial colony data.

Although the intervals should have equal width, one exception occurs when a distribution is open-ended, with no specific beginning or ending values. Age-related data are a typical case.

Example: Table F5, restorative patients by age.

• The frequency distribution for age is open-ended for the first and last classes.
• The frequency distribution is an effective organization of data, but certain information is inevitably lost.

In summary, a frequency distribution:

• is a meaningful, intelligible way to organize data;
• enables the reader to make comparisons among classes;
• enables the reader to have a crude impression of the shape of the distribution.

Relative Frequency

A relative frequency distribution shows the proportion of the total number of measurements associated with each interval. A proportion is obtained by dividing the absolute frequency for a particular interval by the total number of measurements.

Example: In the relative frequency distribution for the bacterial colony data, an interval containing 19 of the 90 measurements has a relative frequency of 19 ÷ 90 ≅ 0.21, or (19 ÷ 90) × 100% ≅ 21.1%.

Relative frequencies are useful for comparing different sets of data containing an unequal number of observations.
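As a small sketch of the relative frequency calculation described above: the helper name `relative_frequencies` and the list of class counts are hypothetical. Only the total of 90 measurements and the class with 19 observations are taken from the notes; the other counts are made up so that they sum to 90.

```python
def relative_frequencies(counts):
    """Divide each absolute frequency by the total number of measurements."""
    total = sum(counts)
    return [c / total for c in counts]

# Hypothetical absolute frequencies for nine classes (they sum to 90).
counts = [3, 10, 19, 22, 15, 9, 6, 4, 2]
rel = relative_frequencies(counts)

for c, r in zip(counts, rel):
    print(f"{c:>3}  {r:.3f}  {100 * r:.1f}%")
```

Because each value is a proportion of its own total, two tables built from different numbers of observations can be compared class by class.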
The accompanying table displays the absolute, relative, and cumulative relative frequencies for the bacterial colony data. The cumulative relative frequency for an interval is the proportion of the total number of measurements that have a value less than the upper limit of the interval. It is computed by adding all the previous relative frequencies and the relative frequency for the specified interval. The cumulative relative frequency is also useful for comparing different sets of data with an unequal number of observations.

GRAPHS

Although a frequency distribution is an effective way to organize and present data, graphs can convey the same information more directly. Graphs help us visualize data. Graphs make data look “alive.”

Bar Graphs

In a bar graph, the categories into which observations are tallied appear on the abscissa (X-axis) and the corresponding frequencies on the ordinate (Y-axis). The height of a vertical bar represents the number of observations that fall into a category (or a class).

Example: A bar graph showing how an estimated 120,000 deaths each year from hospital errors compare with the top five leading causes of accidental death in the United States.

Example: A survey was conducted to find out how many cases of seizures have occurred in dental offices (figure: seizure incidents in dental offices). Since the number of respondents is not the same for the specialty areas in dentistry, the height of the vertical bars should represent percentages rather than the raw number of seizures, as shown in the figure.
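The point about unequal numbers of respondents can be illustrated with a short Python sketch. All figures below are invented for illustration and are not the survey's results; the specialty names, the function name `percent_by_group`, and the reading of "percentage" as seizures per 100 respondents are assumptions.

```python
def percent_by_group(events, respondents):
    """Convert raw counts to percentages of each group's respondents."""
    return {group: 100 * events[group] / respondents[group] for group in events}

# Hypothetical survey figures, for illustration only.
respondents = {"General dentistry": 400, "Oral surgery": 50, "Pediatric dentistry": 80}
seizures = {"General dentistry": 40, "Oral surgery": 15, "Pediatric dentistry": 12}

pct = percent_by_group(seizures, respondents)
for group, value in pct.items():
    print(f"{group:<20} {seizures[group]:>3} cases  {value:.1f}%")
```

In this made-up example the largest group has the tallest bar when raw counts are plotted, but a different group has the tallest bar once heights are expressed as percentages, which is why the choice of scale matters.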
Pie Charts

Categorical data are often presented graphically as a pie chart, which is simply a circle divided into pie-shaped pieces that are proportional in size to the corresponding frequencies or percentages. To construct a pie chart, the frequency for each category is converted into a percentage. A complete circle corresponds to 360 degrees, so the central angles of the pieces are obtained by multiplying the percentages by 3.6. For example, a category holding 25% of the observations is drawn with a central angle of 25 × 3.6 = 90 degrees. The variable for a pie chart can be on a nominal or ordinal measurement scale.

Line Graph

A line graph is used to illustrate the relationship between two variables. Each point A(x, y) on the graph represents a pair of values, one on the X-axis and the other on the Y-axis. For each value on the X-axis there is a unique corresponding observation on the Y-axis. Once the points are plotted on the XY plane, the adjacent points are connected by straight lines. It is fairly common with line graphs that the scale along the X-axis represents time. This allows us to trace and compare the changes in quantity along the Y-axis over any specified time period.

Example: A line graph that represents the data on the number of lifetime births per Japanese woman for each decade between 1930 and 2000. The rate has been declining steadily except for a break between 1960 and 1970. Japan experienced a precipitous drop in the birth rate between 1950 and 1960. The number of lifetime births per woman in Japan in 2000 is less than one-third of that in 1930. The line graph tells us that since 1980, the birth rate in Japan has fallen below the replacement level of 1.7–1.8 births per woman. If the current birth rate stays the same, the Japanese population will continue to shrink.

We can have two or more groups of data with respect to a given variable displayed in the same line graph.
Example: A database of specific prosthodontic treatments provided at LLa University School of Dentistry during the period 1991–1998. One of the prosthodontic treatments of interest was fixed partial dentures (FPD), subclassified by number of units involved and by gold or metal-ceramic constituent materials. In the line graph, the top line is for the metal-ceramic fixed partial dentures and the bottom line is for the gold fixed partial dentures.

Histograms

A histogram is similar in appearance and construction to a bar graph, except that it is used with interval or ratio variables. That is, a histogram is used for quantitative variables rather than qualitative variables. The values of the variable are grouped into intervals of equal width. As in a bar graph, rectangles are drawn above each interval, and the height of the rectangle represents the number of observations in the interval.

A histogram is a bar graph that represents a frequency distribution. The width of each bar represents the interval and the height represents the corresponding frequency. There are no spaces between the bars.

Compare Bar Graphs and Histograms

• Histograms are used to show distributions of variables, whereas bar charts are used to compare variables.
• Histograms plot quantitative data with ranges of the data grouped into intervals, while bar charts plot categorical data.
• There are no spaces between the bars of a histogram, since there are no gaps between the intervals. There are spaces between the bars of a bar chart.

Example: Histogram of the systolic blood pressure data of n = 112 patients, with 11 class intervals.

The following are a few general comments about histograms:

• Histograms serve as a quick and easy check of the shape of a distribution of the data.
• The construction of the graphs is subjective.
• The shape of the histograms depends on the width and the number of class intervals.
• Histograms could be misleading.
• Histograms display grouped data. Individual measurements are not shown in the graphs.
• Histograms can adequately handle data sets that are widely dispersed.

Stem and Leaf Plots

The stem and leaf plot is a method of organizing data that uses part of the data as the “stem” and part of the data as the “leaves” to form groups. In stem and leaf plots, measurements are grouped in such a way that individual observed values are retained while the shape of the distribution is shown. The stem and leaf plot consists of a series of numbers in a column, called the stem, with the remaining trailing digits in the rows, called the leaves. The stem is the major part of the observed values.

Example: A stem and leaf plot of the systolic blood pressure (mmHg) of 112 patients. The first column shows the frequency for each stem, that is, the number of leaves in that row. In the stem and leaf plot, the first one or two digits form the stems and the last digits of the observed values constitute the leaves. In a sense, a stem and leaf plot is an extension of a histogram. An important advantage of a stem and leaf plot over a histogram is that the plot provides all the information contained in a histogram while preserving the value of the individual observations.

The steps for constructing a stem and leaf plot can be summarized as follows:

1. Separate each value of the measurement into a stem component and a leaf component. The stem component consists of the number formed by all but the rightmost digit of the value. For example, the stem of the value 76.8 is 76 and the leaf is 8. For the value 45.6, the stem is 45 and the leaf is 6.
2. Write the smallest stem in the data set at the top of the plot, the second smallest stem below the first stem, and so on. The largest stem is placed at the bottom of the plot.
Alternatively, the largest stem can be placed at the top of the plot and the smallest stem at the bottom.
3. For each measurement in the data, find the corresponding stem and write the leaf to the right of the vertical line or period. It is convenient, although not necessary, to write the leaves in ascending order, that is, the smallest first and the largest at the end of the row.
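The stem and leaf construction steps listed in the Stem and Leaf Plots section can be sketched in Python. This is a minimal illustration under stated assumptions: the function name `stem_and_leaf` is hypothetical, the values are assumed to be non-negative whole numbers (for example, blood pressure readings in mmHg), and the readings below are invented rather than taken from the 112-patient data set.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem to its leaves, smallest stem first, leaves ascending."""
    rows = defaultdict(list)
    for v in values:
        # Step 1: stem = all but the rightmost digit, leaf = rightmost digit.
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    # Step 2: smallest stem at the top. Step 3: leaves written in ascending order.
    return {stem: sorted(rows[stem]) for stem in sorted(rows)}

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
readings = [118, 96, 112, 140, 125, 98, 104, 112, 131, 120, 110, 126]

plot = stem_and_leaf(readings)
for stem, leaves in plot.items():
    # First column: frequency for the stem, then the stem, then its leaves.
    print(f"{len(leaves):>2} {stem:>3} | {''.join(str(leaf) for leaf in leaves)}")
```

For data recorded to one decimal place, such as 76.8, the values could be multiplied by 10 and rounded to whole numbers before calling the function, so that the stem is 76 and the leaf is 8.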
