Biostatistics PDF
Ryleigh, Sydney, and Ella
Summary
This document provides a study guide on biostatistics, covering key concepts like statistics, probability, confidence intervals, and various types of bias. It also touches upon important aspects like the uses of biostatistics in healthcare, including screening processes and the assessment of risks. Topics such as the cycle of biostatistical analysis and data collection are also identified.
PHC4101 BIO Study Guide
Created by: Ryleigh, Sydney, and Ella

4.1 Ch. 7: “Statistics: Making Sense of Uncertainty”

Statistics - a scientific discipline or method; a way of gathering and analyzing data to extract information, seek causation, and calculate probabilities
Probability - describes the variety and frequency of past outcomes under similar conditions as a way of predicting what should happen in the future
p-value - expresses the probability that the observed result could have occurred by chance alone
  ○ Typically, a p-value < 0.05 is used as the criterion for a result to be considered statistically significant
Confidence interval (CI) - a range of values within which the true value probably falls
  ○ Ex) With a 95% CI of 85 to 100 for IQ scores, you can be 95% confident that the true mean IQ score in a given population is between 85 and 100
Law of Small Probabilities - even an event with a very low probability can occasionally happen
Power - the probability of finding an effect if there actually is one
  ○ Researchers often prefer a large number of subjects because a large study has more power than a small one
False-negative: to find no effect when there actually is one
False-positive: to find an effect that is not real

Screening - checking for a disease when there are no symptoms
Allows physicians to detect diseases early, when they are most treatable
Ex) breast cancer, controlling HIV/AIDS
Tests can be:
  ○ Highly sensitive: yield few false negatives
    Typically more desirable
  ○ Highly specific: yield few false positives

Bias - the influence of irrelevant factors / confounding variables on a result or conclusion
Some types of bias:
  ○ Lead time: increased survival time after an earlier diagnosis is counted as an indicator of success, even if the patient does not actually live longer
  ○ Overdiagnosis: diagnosis of a “disease” that will never cause symptoms or death during a patient’s lifetime

Rates - relate raw data to the size of the population being considered
Helpful for making comparisons and identifying trends
Common rates used as indicators of health:
  ○ Death rate
  ○ Birth rate
  ○ Infant mortality rate
  ○ Maternal mortality rate
Crude rate: the actual rate, with no adjustment
Adjusted rate: useful in comparing rates from different groups by controlling for some important variable
  ○ ex) adjusting for age, sex, race, ethnicity

Other calculated statistics
Life expectancy at birth
Years of potential life lost (YPLL)
Quality-adjusted life years (QALY)

Risk Assessment and Risk Perception
Risk assessment: identifies events and exposures that may be harmful and estimates the probability of their occurrence as well as the extent of harm they may cause
  ○ Often done based on historical data
  ○ Based on assumptions when there is little pre-existing data
Risk perception: a person’s judgment of the likelihood and consequences of a potential risk
  ○ Risks are classified on two scales: dread and knowability
    More dreaded or unknown = less acceptable to the population

Evaluation methods
Cost-benefit analysis: weighs the estimated cost of implementing a new policy against the estimated benefit
Cost-effectiveness analysis: compares the efficiency of different methods in obtaining the same results

4.1 Lecture Video:

Definition of Biostatistics:
  ○ A subsection of statistics that deals with health science data collection, organization, presentation, analysis, and interpretation
  ○ Biostatisticians work with models, algorithms, and specific techniques to help understand problems and make inferences about certain populations

What Kinds of Questions Can Biostatistics Answer?
  ○ The results of a clinical trial
  ○ The effectiveness of public health interventions
  ○ The relationship between health and environmental factors
  ○ Health care quality in relation to certain social determinants
  ○ How genetics and genomics relate to certain diseases

Where do Biostatisticians Work?
  ○ Pharmaceutical industry
  ○ Biotechnology industry
  ○ The government
    Ex: NIH, CDC, FDA
  ○ Academia
  ○ Hospitals

Figure 1: The Cycle of Biostatistical Analysis
Source: Fischer, Jonathan. (n.d.) A Brief Introduction to Biostatistics [Online Lecture]. https://mediasite.video.ufl.edu/Mediasite/Play/ac6ccec8d898462aa550d941b05c52101d

What is a Biostatistical Analysis?
  ○ A repeating cycle with five steps:
    1. Take a sample from a larger population
    2. Produce data from the sample after running a study
    3. Analyze the data
       Ex: graphs, histograms, scatterplots
    4. Obtain a probability value through mathematical calculations
    5. Make an inference about the larger population

Tools Used During a Biostatistical Analysis:
  ○ Computing software
    ex: Excel or R
  ○ Classical statistics knowledge (ex: from a high school statistics class)
  ○ A study design
  ○ Domain knowledge

What are Lurking Variables?
  ○ Variables that could affect the data and, without proper consideration, could lead to incorrect conclusions
  ○ An example of this is:
    Simpson’s Paradox: if a lurking variable is not accounted for, the conclusion can be reversed

4.2 Chapter 8: “The Role of Data in Public Health”:

Health statistics - health data collected and analyzed by public health workers to monitor the health of a community; they play a vital role in public health’s assessment function
Health statistics can be used to:
  ○ Identify special risk groups
  ○ Detect new health threats
  ○ Plan public health programs and evaluate their success
  ○ Prepare government budgets
Health statistics collected by federal, state, and local governments
  ○ Serve as the raw material for research on epidemiology, environmental health, social and behavioral factors in health, and the medical care system

Federal Level
National Center for Health Statistics (NCHS) - the primary agency that collects, analyzes, and reports data on the health of Americans
Part of the Centers for Disease Control and Prevention (CDC).
The NCHS gathers its data through two primary methods:
States periodically send the NCHS data compiled from local records, which include vital statistics such as nearly all births and deaths.
The NCHS also conducts surveys of representative population samples, focusing on:
  ○ Health status
  ○ Lifestyle
  ○ Health-related behaviors
  ○ Illness onset and diagnosis
  ○ Disability
  ○ Medical care usage
Some surveys are conducted on a state-by-state basis and provide useful data for state and local communities
Other federal agencies share their data with the NCHS for broader use

Birth and Death Data:
Births and deaths are the most basic, reliable, and complete data collected.
Almost every birth and death in the U.S. is recorded on a birth certificate or death certificate.
  ○ These certificates are filed by the attending physician, midwife, or other attendant with the local registrar.
State health departments collect and transmit these reports to the NCHS periodically.

Birth Certificates:
Information provided by the mother includes family details:
  ○ Names
  ○ Addresses
  ○ Ages
  ○ Race/ethnicity
  ○ Education
Medical information (e.g., prenatal care, birth weight, complications, and risk factors) is provided by the hospital or birth attendant.
Some states now include the mother's tobacco use on birth certificates.

Death Certificates:
Might not be completely accurate, depending on the informant’s knowledge of the deceased, especially for elderly individuals.
  ○ Cause of death might be uncertain without autopsies
  ○ Social stigma can lead to misreporting of causes of death
    Ex: AIDS or suicide

Vital Statistics:
Marriages and divorces
  ○ Marriages and divorces are universally reported but are less significant for public health.
Spontaneous fetal deaths
  ○ Spontaneous fetal death reporting is incomplete, especially for early-term losses.
Abortions
  ○ Induced abortions are often underreported, with some states excluding the woman’s name for confidentiality.

Infant mortality is a key public health issue because it is an indicator of a population's health and well-being.
The NCHS operates a system linking birth and death records for infants who died before their first birthday, aiding research on factors related to infant death.
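
As a rough illustration of the record linkage described above, the sketch below matches birth and death records on a shared record number, keeps only deaths before the first birthday, and computes an infant mortality rate per 1,000 live births. The record layout, the `record_id` field, and every value are made up for illustration; this is not the actual NCHS file format or procedure, and it follows a single group of births rather than using a calendar year's deaths and births.

```python
from datetime import date

# Hypothetical, made-up records; real linked files contain far more detail.
births = [
    {"record_id": 1, "born": date(2020, 1, 5), "birth_weight_g": 3400},
    {"record_id": 2, "born": date(2020, 2, 11), "birth_weight_g": 1900},
    {"record_id": 3, "born": date(2020, 3, 20), "birth_weight_g": 3100},
    {"record_id": 4, "born": date(2020, 4, 2), "birth_weight_g": 2950},
]
deaths = [
    {"record_id": 2, "died": date(2020, 5, 1)},
    {"record_id": 4, "died": date(2022, 6, 15)},  # after the first birthday
]


def first_birthday(born):
    """Return the date of the first birthday (Feb 29 is mapped to Mar 1)."""
    try:
        return born.replace(year=born.year + 1)
    except ValueError:
        return date(born.year + 1, 3, 1)


def link_infant_deaths(births, deaths):
    """Join births to deaths by record_id, keeping deaths before age one."""
    deaths_by_id = {d["record_id"]: d for d in deaths}
    linked = []
    for birth in births:
        death = deaths_by_id.get(birth["record_id"])
        if death is not None and death["died"] < first_birthday(birth["born"]):
            linked.append({**birth, **death})
    return linked


def infant_mortality_rate(n_infant_deaths, n_live_births):
    """Infant deaths per 1,000 live births."""
    return 1000 * n_infant_deaths / n_live_births


linked = link_infant_deaths(births, deaths)
rate = infant_mortality_rate(len(linked), len(births))
print(f"{len(linked)} infant death(s) among {len(births)} births: {rate:.1f} per 1,000")
```

Because each linked record carries the birth certificate fields (here, birth weight) alongside the death date, a researcher could go on to compare infant deaths across birth-weight groups or other risk factors.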
The US Census

Vital statistics data must be converted into rates for meaningful use in public health.
Key components for rate calculations:
Numerator: Vital statistics (e.g., number of births, deaths).
Denominator: Size of the population being referred to.
Types of rates:
Age-adjusted/age-specific rates
  ○ Require knowing the population in each age group.
Sex-specific/race-specific rates
  ○ Require knowing the number of males, females, and individuals of different races in each age and sex group.

The U.S. Census Bureau (Department of Commerce)
Provides population counts and characteristics.
An accurate population count is essential for precise health statistics.

The U.S. Census - provides data on the geographic distribution of the population, as well as its sex, age, ethnic, social, and economic characteristics (education, housing, health insurance status).
The Census Bureau tracks population trends between decennial censuses using polls, surveys, birth/death records, immigration data, and school statistics.

The U.S. Constitution - mandates a population count every 10 years to determine representation in the House of Representatives.

Uses of U.S. Census Data:
Vital for the functioning of the nation’s social, political, economic, and industrial systems.
Essential for public health practices.

2020 U.S. Census Innovations:
First census to allow responses via the Internet and telephone.
  ○ Includes all U.S. residents, citizens and non-citizens alike, and counts special populations (e.g., troops overseas, students, prisoners).
Data collection began in January 2020 in Alaska, with most households contacted by March 2020.
  ○ Non-responding households received in-person visits starting in April 2020.

Political and Financial Impact:
Census data affect political representation in Congress and the distribution of federal funds to states and communities.
Interest groups closely monitor data collection methods to ensure fair representation and resource allocation.
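
Returning to the rate calculations described at the start of this section, the sketch below shows why census population counts by age group matter. It computes a crude death rate, age-specific rates, and an age-adjusted rate using direct standardization, which is one common adjustment method and is assumed here rather than taken from the chapter. The two communities, the age groups, the counts, and the standard population are all made up for illustration.

```python
# Hypothetical deaths (numerator) and population (denominator) by age group.
community_a = {
    "0-39": {"deaths": 10, "population": 10_000},
    "40-64": {"deaths": 40, "population": 8_000},
    "65+": {"deaths": 100, "population": 2_000},
}
community_b = {
    "0-39": {"deaths": 2, "population": 2_000},
    "40-64": {"deaths": 40, "population": 8_000},
    "65+": {"deaths": 500, "population": 10_000},
}
# Hypothetical standard population used to put both communities on one scale.
standard_population = {"0-39": 50_000, "40-64": 30_000, "65+": 20_000}


def crude_rate(groups, per=1000):
    """Total deaths divided by total population, per `per` people."""
    deaths = sum(g["deaths"] for g in groups.values())
    population = sum(g["population"] for g in groups.values())
    return per * deaths / population


def age_specific_rates(groups, per=1000):
    """Death rate within each age group, per `per` people."""
    return {age: per * g["deaths"] / g["population"] for age, g in groups.items()}


def age_adjusted_rate(groups, standard, per=1000):
    """Direct standardization: apply each age-specific rate to the standard
    population, then divide the expected deaths by the standard total."""
    expected_deaths = sum(
        g["deaths"] / g["population"] * standard[age] for age, g in groups.items()
    )
    return per * expected_deaths / sum(standard.values())


for name, groups in (("A", community_a), ("B", community_b)):
    crude = crude_rate(groups)
    adjusted = age_adjusted_rate(groups, standard_population)
    print(f"Community {name}: crude {crude:.1f}, age-adjusted {adjusted:.1f} per 1,000")
```

In this made-up data the two communities have identical age-specific rates, so their crude rates differ only because community B has an older population; once both are adjusted to the same standard population, the rates match.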
2020 Census Citizenship Question Controversy:
The Trump administration proposed adding a citizenship question, absent since 1950, stating that it was needed to enforce voting rights protections for minorities.
Courts rejected the explanation as "contrived," citing concerns that the real intent was to suppress responses in immigrant communities.
Cities and states with large immigrant populations feared an undercount, which would reduce political influence and federal funding.
After legal battles, the citizenship question was excluded from the 2020 census.

New Household Categories:
The 2020 census included categories for same-sex and opposite-sex relationships.
  ○ Allowed more detailed public health statistics on same-sex couples and families.

Census Errors:
Some people, including the homeless, undocumented immigrants, and fugitives, may be missed.
Wealthy individuals with multiple homes may be counted twice.
In 2010, an estimated 10 million people were missed, and 36,000 were counted twice.
Errors in census data can lead to inaccuracies in health statistics, particularly in birth and death rates for undercounted groups.

Changes in Census Data Collection:
Since 2010, only basic data (name, age, sex, race, ethnicity, relationships) are collected from everyone using the "short form."

American Community Survey (ACS) - collects more detailed population information
The ACS, introduced in 2005, is conducted annually and sent to around 3 million households.
It gathers detailed data on:
  ○ Education, housing, employment, transportation, and more, which are useful for planning services like healthcare and schools.

Opposition to the ACS:
In 2012 and 2015, Republicans in Congress sought to eliminate the ACS, arguing it was too intrusive.
Business groups and organizations, like the U.S. Chamber of Commerce, defended the ACS for providing critical data for government and business planning.
The ACS remains in use despite repeated opposition.

NCHS Surveys and Other Sources of Health Data

1. National Health Interview Survey (NHIS)
Purpose: To assess the overall health of the U.S. population, track disease prevalence, and monitor trends in risk factors.
Survey Method:
  ○ Approximately 35,000 households are interviewed each year.
  ○ Topics covered include illnesses, injuries, chronic conditions, impairments, access to healthcare, and utilization of medical resources.
  ○ This broad, household-based approach enables the NCHS to gauge population-wide health trends.

2. National Health and Nutrition Examination Survey (NHANES)
Purpose: To provide detailed and accurate information on the health and nutritional status of Americans through direct physical examinations and laboratory testing.
Survey Method:
  ○ Doctors and nurses are deployed in mobile vans to conduct physical and dental exams, as well as laboratory tests.
  ○ Each year, about 5,000 individuals from 15 counties are selected for participation.
  ○ The survey captures data on chronic conditions such as:
    Cardiovascular disease
    Diabetes
    Kidney disease
    Respiratory disease
    Osteoporosis
    Hearing loss
  ○ Additionally, NHANES tracks risk factors for these conditions, including smoking, alcohol use, sexual practices, physical activity, weight, and dietary intake.

Example of NHIS and NHANES Data
  ○ In 2019, NHIS and NHANES data were used to study:
    Whole grain consumption in American diets.
    The number of teenagers attempting weight loss.
    How Americans were trying to reduce prescription drug costs.

Key Focus Areas of the National Health Care Survey:

1. Healthcare Facility Operations:
  ○ The survey tracks the operations of:
    Doctors' offices
    Community health centers
    Hospitals
    Nursing homes
    Hospice organizations
    Other healthcare facilities
  ○ The data collected provide a snapshot of healthcare provision, patient flow, and resource utilization across these diverse settings.

2. Electronic Health Records (EHRs):
  ○ Adoption Rates: In 2017, 86% of office-based physicians used EHRs.
  ○ Interoperability Challenges: Only about 42% of physicians electronically shared records with other facilities, showing a significant gap in achieving full interoperability across the healthcare system.
  ○ This points to the need to improve data exchange between facilities to enhance patient care coordination and safety.

3. Hospital Emergency Room Overcrowding:
  ○ The survey highlighted differences in emergency room (ER) wait times between facilities of varying volumes:
    Lower-volume facilities: Average wait time of 24 minutes.
    Higher-volume facilities: Average wait time of 49 minutes.
  ○ Longer ER wait times can lead to delayed treatment for heart attacks and other critical conditions

Behavioral Risk Factor Surveillance System (BRFSS) - a survey that obtains information on health-related behaviors, conducted by the federal government in collaboration with the states
Asks questions about:
Health risk factors
  ○ High blood pressure
  ○ High blood cholesterol
  ○ Diabetes
  ○ Weight
Health-related behaviors
  ○ Diet and physical activity
  ○ Cigarette smoking
  ○ Alcohol use
  ○ Seat belt use
  ○ Drinking and driving
Whether people get preventive medical care
  ○ Mammograms
  ○ Pap smears
  ○ Colon cancer screening
  ○ Immunizations

BRFSS vs NHANES
The BRFSS gathers health data similar to that of the NHANES, but the BRFSS surveys more people
Scale: It is the largest continuously conducted health survey in the world, with over 400,000 interviews conducted annually.
State-Level Analysis: BRFSS data allow for the analysis of health factors at the state level, providing localized insights into health trends.
Self-Reported Data: BRFSS relies on self-reported data, which may be less reliable than NHANES data, which come from direct measurements.
Example of Discrepancy:
  ○ In 2011, BRFSS self-reports indicated about 28% of adults were obese.
  ○ However, NHANES, using direct measurements, found an obesity rate of 35%.
  ○ This discrepancy highlights the tendency of overweight individuals to underreport their weight.

NCHS Health Surveys:
The National Center for Health Statistics (NCHS) conducts a wide variety of health-related surveys to capture specific aspects of public health and well-being. Some key surveys include:
National Youth Fitness Survey: Assesses the physical fitness of children and adolescents.
National Survey of Family Growth: Collects data on family life, marriage, fertility, and reproductive health.
National Immunization Survey: Gathers information on vaccination rates across the U.S. population.

Collaborative Surveys:
NCHS collaborates with other federal agencies to expand the scope of its data collection:
National Asthma Survey: Conducted with the CDC’s National Center for Environmental Health, it tracks asthma prevalence and management.
National Infant Feeding Practices Study: In collaboration with the Food and Drug Administration (FDA), this survey examines infant nutrition and feeding patterns.
National Health Interview Survey on Disability: Conducted in partnership with the Social Security Administration and other agencies, this survey collects data on the prevalence and impact of disabilities.

Health-Related Data Collection by Other Government Agencies:
Several other U.S. governmental agencies play vital roles in health surveillance based on their area of focus:
Environmental Protection Agency (EPA): Monitors health hazards in the environment, including air pollutants and toxic chemical releases.
National Cancer Institute (NCI): Runs the Surveillance, Epidemiology, and End-Results (SEER) program to track long-term cancer incidence and mortality trends.
Centers for Medicare and Medicaid Services (CMS): Maintains billing records for the Medicare program, which are valuable for research on healthcare utilization and outcomes.
Food and Drug Administration (FDA): Collects data on adverse drug reactions post-approval and can recommend product recalls for pharmaceutical safety.
Consumer Product Safety Commission (CPSC): Conducts surveillance for product-related injuries to ensure consumer safety.

Surveillance systems - form the basis of effective public health practice as well as the planning and evaluation efforts that are increasingly being used in public health programming.
Data-driven surveillance systems are essential for early detection of health threats, such as communicable diseases, enabling timely responses to protect public health.
Public health relies on data collected at federal, state, and local levels, and more data lead to more effective practices.

Importance for Program Planning and Evaluation:
Planning, setting goals, and evaluating programs require extensive data analysis.
Data uncover less visible health problems, guiding targeted interventions and improving public health outcomes.

Assessment as a Core Function:
The Institute of Medicine identifies assessment as a core function of public health, stressing the importance of continuous data review for proactive interventions.

Example: Unintended Pregnancies and Low-Birth-Weight Babies:
Local data analysis (e.g., of birth certificates) can reveal issues like unintended pregnancies or low-birth-weight babies, prompting public health officials to consider school-based birth control education and services.

Public health informatics - has vastly improved the accessibility of public health information to public health workers and the general public.

Confidentiality of Information Collected by Governments
Any information collected from individuals by governments is considered confidential and cannot be disclosed without the individual's consent.
Data Handling:
  ○ Information is typically entered into a massive database.
  ○ Personal identifiers (names, addresses) are usually removed to protect individual privacy.
For Research Purposes:
  ○ An identifying number may remain attached to the data.
  ○ This number helps researchers match information from one database to another (e.g., birth and death records).
  ○ This technique is vital in understanding public health trends, such as the factors contributing to infant mortality.

Potential Privacy Breach:
There is concern about internal breaches where employees within an agency or those who know an employee could access and misuse confidential information to harm individuals.
Agencies handling confidential data impose strict rules for data access.
Researchers must:
  ○ Explain and justify their need for the data.
  ○ Promise to safeguard confidentiality.
Institutional Review Boards (IRB) or Data Protection Committees:
  ○ Review the researcher’s claims.
  ○ Decide whether to grant access based on community standards and data protection rules.

HIV Confidentiality Concerns:
Fear of discrimination led to worries that individuals would avoid testing without privacy protections.
Special Handling of HIV Data:
  ○ Anonymous testing is allowed.
  ○ Modified reporting to state health departments and CDC to maintain anonymity.
Shift in Policy: With new HIV treatments, reporting is now handled like other communicable diseases, with fewer exceptions to confidentiality.

4.1 Lecture Video:

Definition of Public Health Surveillance:
  ○ The continuous and standardized collection, analysis, and interpretation of health data, with a prompt presentation to those responsible for public health prevention and control
    Stems from a French word meaning “to watch over”
  ○ The goal is to provide surveillance information to public health personnel, physicians, government officials, and the public
  ○ Turns data into action

What is Surveillance Used For?
  ○ Detecting epidemics
  ○ Estimating the range of health issues
  ○ Characterizing diseases
  ○ Identifying patients
  ○ Monitoring environmental agents
  ○ Assessing program effectiveness

What are Stringent Definitions?
  ○ Universally agreed-upon definitions of terms
    Necessary to effectively use surveillance data
    Allows biostatisticians to compare their work with other research

The Surveillance Process:
1. Data collection
  ○ Ex: The U.S. Census, FLHealthCharts, and birth certificates
2. Data analysis
3. Data interpretation
4. Data dissemination
  ○ Ex: Medical journals, press releases, and social media
5. Call to action
  ○ Ex: National Health and Nutrition Examination Survey (NHANES)

Figure 2: The Surveillance Process
Source: Shapiro, Jerne. (n.d.) Public Health Data [Online Lecture]. https://mediasite.video.ufl.edu/Mediasite/Play/691d9bce2d59428baa5a3ddf9574b4821d
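
To show how the first steps of the surveillance process might turn data into action, the sketch below walks through collection, analysis, interpretation, and dissemination for a toy epidemic-detection task. The weekly counts are made up, and the alert rule (baseline mean plus two standard deviations) is a simple assumption chosen for illustration, not a method taken from the lecture.

```python
from statistics import mean, stdev

# Step 1, data collection: hypothetical weekly case counts for one county.
baseline_weeks = [12, 15, 11, 14, 13, 12, 16, 14]
recent_weeks = {"week 9": 15, "week 10": 27, "week 11": 13}


def alert_threshold(baseline, n_sd=2.0):
    """Step 2, data analysis: baseline mean plus n_sd sample standard deviations."""
    return mean(baseline) + n_sd * stdev(baseline)


def flag_unusual_weeks(recent, threshold):
    """Step 3, data interpretation: weeks whose count exceeds the threshold."""
    return [week for week, count in recent.items() if count > threshold]


threshold = alert_threshold(baseline_weeks)
flagged = flag_unusual_weeks(recent_weeks, threshold)

# Step 4, data dissemination: report the flagged weeks to those who can act.
for week in flagged:
    print(f"{week}: {recent_weeks[week]} cases exceeds the threshold of {threshold:.1f}")
```

A stringent case definition matters here: the counts are only comparable from week to week if every reporter applies the same definition of a case.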