PREP 4b Intro to Applied Stats - Defining the Data PDF

**[Core Principles in Mental Health Research]**

**[PREP: 4b Intro to Applied Stats -- Defining the data]**

[Learning Objectives:]

- You will be able to describe the difference between populations and samples and comment on statistical inference
- You will be able to comment on the different types of data and variables that we can create using the data from our samples

[Note on Terminology & Approach:]

- We focus here on frequentist statistics
- The main alternative approach is Bayesian statistics
- Different disciplines use different names for the same types of variable:

  **Epidemiology & medical statistics**   **Psychology**         **Social science**
  --------------------------------------- ---------------------- ----------------------------------
  Exposure variable                       Independent variable   Explanatory / predictor variable
  Outcome variable                        Dependent variable     Response variable

[Defining Exposure & Outcome:]

- The outcome variable is the variable that is often the focus of our attention, whose variation or occurrence we are seeking to investigate and understand
- e.g. depression; eating disorders; psychosis; bipolar disorder
- We are often interested in identifying risk factors or exposures that may influence the occurrence or severity of the outcome
- The purpose of a statistical analysis is often to quantify the magnitude of the association between one or more exposure variables and the outcome variable

[Population & Samples:]

- In research studies, we collect data on a sample from a much larger group called the population
- The sample is of interest not in its own right but for what it tells us about the population
- Statistics allows us to use the sample to make inferences about the population from which it was derived
- Because of chance, different samples from the population will give different results, and this must be considered when using a sample to make inferences about the population
- The concept of sampling variation is at the heart of frequentist statistics and will be explained in the interpreting statistics lecture of the core module

[Specify the Target Population:]

- In any research study, it is important to carefully and precisely specify the target population
- Care should also be taken to ensure that the sample represents the target population
- The researcher may take a random sample of university students to test her hypothesis
- What are the potential problems with her approach?

[Sampling From the Target Population:]

- If students differ from other young people in any way that affects their experiences of loneliness or depression (exposure and outcome), the sample and the finding may not represent the population
- In that case, the finding will not be generalizable and will apply only to the population of UK university students

[Types of data 1:]

- The raw data from a research study consist of observations made on individuals
- The number of individuals is called the sample size
- Any aspect of an individual that is measured (for example, their depressive symptoms, exposure to loneliness, age, gender or highest educational qualification) is called a variable
- A first step in choosing how best to display and analyse data is to classify the variables into their different types
- This is important because the choice of statistical test depends on the nature of the outcome (i.e. how the outcome variable is classified)
- The main division is between numerical (quantitative) variables, categorical (qualitative) variables and rates

[Outcomes]

[1. Numerical Variables:]

- A numerical variable is either continuous or discrete
- A continuous measurement can take on any number within the possible / plausible range, e.g. BMI (26.42, 28.35)
- A discrete variable can only take on certain scores (whole numbers), such as the number of depressive episodes in 10 years (0, 2, 3, 4, 12)
- Note: Often, variables that are technically discrete are described as continuous, and "continuous" is often used to mean numerical

[Categorical Variables:]

- A categorical variable assigns people to one of two or more qualitatively distinct categories (e.g. 1, 2, 3)
- A binary variable is a categorical variable with only two categories, e.g. clinical diagnoses (diagnosed with schizophrenia or not, 0 or 1)
- An ordered categorical variable assigns people to ordered categories, e.g. socioeconomic status: low / middle / high
- Nominal categorical variables assign people to categories with no underlying order, e.g. eye colour

[2. Rates:]

- Rates of disease are measured in longitudinal studies and are the fundamental measure of the frequency of occurrence of events (such as illness or death) over time
- e.g. 30-year mortality rates among adults with depression
- e.g. the rate of occurrence of psychosis in the Swedish population
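
The point made under Population & Samples, that different samples from the same population give different results purely by chance, can be illustrated with a small simulation. The sketch below is not part of the original lecture: the population of 10,000 people, the 20% prevalence of the outcome, the sample size of 200 and the function name `draw_sample_proportions` are all invented for illustration.

```python
import random


def draw_sample_proportions(population, sample_size, n_samples, seed=0):
    """Draw repeated simple random samples and return, for each sample,
    the proportion of people who have the outcome."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    proportions = []
    for _ in range(n_samples):
        # Simple random sample without replacement from the population
        sample = rng.sample(population, sample_size)
        proportions.append(sum(sample) / sample_size)
    return proportions


# Hypothetical population: 1 = has the outcome, 0 = does not.
# The 20% prevalence is an assumed figure, not real data.
population = [1] * 2000 + [0] * 8000
true_proportion = sum(population) / len(population)

proportions = draw_sample_proportions(population, sample_size=200, n_samples=5)

print(f"Population proportion: {true_proportion:.3f}")
for i, p in enumerate(proportions, start=1):
    print(f"Sample {i} proportion: {p:.3f}")
```

Each sample proportion should land near the population value without matching it exactly. That spread between samples is the sampling variation that has to be taken into account when a single sample is used to make inferences about the population.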
