Political Science Final Study Guide PDF
Document Details
Uploaded by RealisticCalcium
Summary
This document appears to be a study guide for a political science course, covering various research methods and concepts like alternative hypothesis, variables, randomized experiments, and different types of sampling. The guide also touches on how to generate, test, and validate research findings, including the critical role of operational definitions in quantifying concepts.
Full Transcript
**Alternative Hypothesis** Should point to spuriousness - what is going on when your hypothesis isn't correct - what else is causing both the IV and the DV - ruled out in qualitative research through hypothesis testing with the comparative method - does not support the theory

**Antecedent Variable** Something that comes before the independent variable and leads to the dependent variable (you cannot understand the IV and DV without the AV) - for example, an antecedent variable that could help explain (or partially explain) the relationship between age and annual income is *education level*, since it tends to correlate with both age and income - an independent variable

**Arrow Diagram** A pictorial representation of a researcher's explanatory scheme connecting variables or alternative variables - shows how variables connect with arrows - hypothesized causal relationship

**Bar Chart** A graphical display of the data in a frequency or percentage distribution - height = the proportion or percentage of observations in each category - suited to variables with a relatively small number of values and a large number of cases

**Case Study** A comprehensive and in-depth study of a single case or several cases - a nonexperimental design in which the investigator has little control over events - comprehensive and in depth, typically **qualitative**.
**Categorical Measure** Nominal- or ordinal-level measure

**Causes of Effects Approach** Starts with an outcome and works backwards to the causes - identifying the causes of the outcome: "what caused this particular result to happen?" - qualitative research is better at this

**Central Tendency** The most frequent, middle, or central value in a frequency distribution - mean, median, and mode

**Classical Randomized Experiment** An experiment with random assignment of subjects to experimental and control groups and a pretest and posttest for both groups - treatment and control - the stimulus is what matters: differences show up in the measurement of the dependent variable - no differences other than the stimulus between the two groups (covariation, time order, elimination of possible alternatives)

**Closed-Ended Questions** A question with response alternatives provided - the respondent is given a set of predefined answer choices to select from, limiting the response to the options provided; examples include "Did you enjoy the movie? (Yes/No)", "What is your favorite color? (Red, Blue, Green)", or "On a scale of 1-5, how satisfied are you with the service?"

**Cluster Sample** A probability sample that is used when no list of elements exists - the sampling frame initially consists of clusters of elements - a sample of the sample that still conforms to the requirements of a probability sample - you do not need a complete sampling frame

**Confidence Interval** The range of values into which a population parameter is likely to fall for a given level of confidence - an estimate of the range in which the population mean will likely be found - the mean of your estimate plus and minus the variation in that estimate - because the true population mean is unknown, this range describes the possible values the mean could take - higher confidence levels correspond to wider confidence intervals, and lower confidence levels to narrower ones

**Confidence Level** How much confidence we have in the confidence interval - 90 percent or higher; a z-score of 1.96 corresponds to 95% confidence - the degree of belief or probability that an estimated range of values includes (or doesn't include) the true population parameter

**Constant** A concept or variable whose values do not vary

**Construct Validity** Validity demonstrated for a measure by showing that it is related to the measure of another concept - correlated with other measures of the same thing

**Content Analysis** A systematic procedure by which records are transformed into quantitative data - systematically analyze the content of the text

**Content Validity** Determining the full domain or meaning of a particular concept and then making sure that all components of the meaning are included in the measure - captures all aspects of a concept - does the scope of the measure capture all of the things we think are part of the concept?
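To make the Confidence Interval and Confidence Level entries concrete, here is a minimal sketch of "the mean plus and minus the variation in that estimate". It assumes a z-based interval (1.96 for 95%, as noted above); the 2.576 multiplier for 99%, the function name, and the scores are illustrative assumptions, not part of the original notes.

```python
import math
import statistics


def mean_confidence_interval(values, z=1.96):
    """Return (lower, upper) for a z-based confidence interval around the mean."""
    n = len(values)
    mean = statistics.mean(values)
    # Standard error: sample standard deviation divided by sqrt(n).
    standard_error = statistics.stdev(values) / math.sqrt(n)
    margin = z * standard_error
    return mean - margin, mean + margin


# Hypothetical survey scores, invented for illustration only.
scores = [52, 48, 61, 55, 47, 58, 50, 53, 49, 57]

low95, high95 = mean_confidence_interval(scores, z=1.96)   # 95% confidence
low99, high99 = mean_confidence_interval(scores, z=2.576)  # 99% confidence

print(f"95% interval: {low95:.2f} to {high95:.2f}")
print(f"99% interval: {low99:.2f} to {high99:.2f}")
```

The 99% interval comes out wider than the 95% interval, which matches the note that higher confidence levels correspond to wider intervals.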
**Control Group** A group of subjects that does not receive the experimental treatment or test stimulus

**Convenience Sample** Nonprobability sample in which the selection of elements is determined by the researcher's convenience - choose cases that are easy to access; requires few resources but can lead to a biased sample - used in both large-N and small-N research (a snowball sample is a particular kind of convenience sample in which some respondents recommend other people you could talk to)

**Correlation** A statement that the values or states of one thing systematically vary with the values or states of another - an association between two variables

**Cross-Sectional Design** Research design in which measurements of independent and dependent variables are taken at the same time; naturally occurring differences in the independent variable are used to create quasi-experimental and quasi-control groups; extraneous factors are controlled for by statistical means - the researcher does not manipulate the IV, the assignment of subjects to treatment or control, or the conditions under which the IV is experienced

**Cumulative Knowledge** New substantive findings and research techniques are built upon those in previous studies

**Data Matrix (Data Set)** An array of rows and columns that stores the values of a set of variables for all the cases in a data set - each row holds the specific values pertaining to one case - each column contains the values of a single variable for all the cases - column headings list variable names

**Dependent Variable** The phenomenon thought to be influenced, affected, or caused by another variable - the phenomenon that we want to explain (the effect), or y on the graph - the second variable in the hypothesis format used in class

**Descriptive Statistic** A number that (because of its definition and formula) describes certain characteristics or properties of a batch of numbers

**Deviant Case** A case that exhibits all the factors thought to lead to a particular outcome but in which the outcome does not occur - the case does not conform to the expectations of the theory; most informative when looking to develop or refine a new theory

**Dichotomous Variable** A variable with only two categories - nominal, ordinal, or ratio level - there are exactly two values that the variable can take (0 and 1)

**Direct Observation** Actual observation of behavior by the researcher, in the field or the lab - the behavior itself

**Dispersion** Distribution of data values around the most frequent, middle, or central value - differences among the units of the variable - variability among cases - exact amount of variation in a variable

**Disproportionate Sample** A stratified sample in which elements sharing a characteristic are underrepresented or overrepresented in the sample

**Document Analysis** The use of audio, visual, or written materials as a source of data

**Double-Barreled Questions** A question that is really two questions in one - "How would you rate the quality of our products and our customer service?"

**Ecological Fallacy** Deducing a false relationship between the attributes or behavior of individuals based on observing that relationship for groups to which the individuals belong - a relationship at the group/aggregate level cannot be used to describe individual-level processes - people act differently individually than in groups

**Effects of Causes Approach** An approach to causal questions that starts with a potential cause and works forward to measure its impact on the outcome - emphasis is on measuring the size of the effect that a cause has on an outcome - quantitative research is better at this - "what are the likely results if we implement this particular action?"
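The Descriptive Statistic and Dispersion entries above can be illustrated with a short sketch using Python's standard `statistics` module. The income figures are invented for illustration, and the choice of range and sample standard deviation as the dispersion measures is an assumption about what the course emphasizes.

```python
import statistics

# Hypothetical annual incomes in thousands of dollars; 250 is an outlier.
incomes = [30, 32, 35, 35, 38, 40, 250]

# Measures of the middle of the distribution.
mean_income = statistics.mean(incomes)      # pulled upward by the outlier
median_income = statistics.median(incomes)  # middle value of the sorted data
mode_income = statistics.mode(incomes)      # most frequent value

# Measures of dispersion around that middle.
value_range = max(incomes) - min(incomes)   # highest minus lowest value
std_dev = statistics.stdev(incomes)         # sample standard deviation

print(f"mean={mean_income:.2f} median={median_income} mode={mode_income}")
print(f"range={value_range} standard deviation={std_dev:.2f}")
```

With one extreme value in the data, the mean sits well above the median, which is one reason to report dispersion alongside any single summary number.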
**Electronic Databases** A collection of information of any type that is stored on an electronic medium and can be accessed and examined by certain computer programs - helped political science research move faster - we use them through the BYU library

**Elite Interviewing** Interviewing individuals who possess specialized knowledge about a political phenomenon - information the researcher does not have, or firsthand information - be prepared, present yourself well, record the interviews, do not do too many interviews

**Empirical Research** Research based on actual "objective" observation of phenomena

**Empiricism** Relying on observation to verify propositions - through objective observation, experimentation, and logical reasoning - providing tangible evidence - distinguishes scientific from mythical knowledge (falsifiable, nonnormative, transmissible) - statements supported by empirical facts are more likely to be true

**Episodic Records** Materials that are not part of a systematic and ongoing record-keeping effort - produced and preserved in a more casual, personal, and accidental manner, like personal diaries and correspondence, or even brochures and pamphlets

**Ethnography** A type of field study in which the researcher is deeply immersed in the place and lives of the people being studied - firsthand observation - cultural interpretation through personal observation of everyday life

**Experiment** Research using a design in which the researcher controls exposure to the test factor or independent variable, the assignment of subjects to groups, and the measurement of the responses

**Experimental Effect** Effect, usually measured numerically, of the experimental variable on the dependent variable

**Explanatory Knowledge** Signifying that a conclusion can be derived from a set of general propositions and specific initial considerations, providing a systematic, empirically verified understanding of why a phenomenon occurs as it does - a characteristic of scientific knowledge
**External Validity** The ability to generalize from one set of research findings to other situations - how well can we take what we've learned and apply it to other cases - generalizability

**Face Validity** When a measure appears to accurately measure the concept it is supposed to measure - may only be asserted rather than empirically demonstrated, because face validity is a matter of judgment - the measure seems to capture the concept (just by looking at it) - face value

**Falsifiability** A property of a statement or hypothesis such that it can (in principle at least) be rejected in the face of contravening evidence - admits logical counterexamples for which you could find physical evidence (non-falsifiable statements can still be true and important) - conspiracy theories are NOT falsifiable - in principle there is a way to prove it wrong

**Field Experiment** Experimental designs applied in a natural setting - daily life - people participate in the experiment as part of their normal daily lives (usually after consenting to be research subjects) - a downside is that they are usually costly or not logistically feasible to conduct (174)

**Frequency Distribution** The number of observations per category of a variable (213 table)

**General Knowledge** A characteristic of scientific knowledge is that it is applicable to many rather than just a few cases

**Histogram** A type of bar graph in which the height and area of the bars are proportional to the frequencies in each category of a categorical variable or in intervals of a continuous variable - nominal/ordinal: proportion of cases in each class - continuous (age, income): divide the variable into intervals, count the cases per interval, and then draw bars for the proportion in each interval - (shows the spread of values and average magnitude)

**Hypothesis** A tentative, unconfirmed statement that in principle can be verified - a testable statement of the relationship between the independent and dependent variables based on a theory - a hypothesis is an observable implication of a theory - Good hypothesis: empirical, general (applicable to multiple cases), plausible (a good theory helps), specifies a relationship (independent/dependent variable, and the relationship is directional, + or -), testable (data can be gathered; not tautological), consistent with the data the researcher intends to use - a hypothesis is more specific than a theory because it uses variables, not concepts - specific to a group or context (unit of analysis) - FORMULA - In a comparison of (units of analysis), those having (one value on the independent variable) will be more likely to have (one value on the dependent variable) than will those having (a different value on the independent variable)

**Hypothesis Generating Case Study** A type of case study that attempts to develop from one or more cases some general theoretical propositions that can be tested in future research - observe something interesting happening in the world and use it to make generalizations about the world

**Hypothesis Testing Case Studies** Testing hypothesized empirical relationships - investigations of causal mechanisms; an application of case study methods that contributes to case study research - you go in with a hypothesis, find case studies that match it, and use causal logic

**Idiographic Case Study** Attempts to describe, explain, or interpret a singular historical episode with no intention of generalizing beyond the case - one event and not generalizable - an in-depth analysis of a small number of cases - Inductive (lack a theoretical perspective and are just focused on describing) - Theory Guided (explicitly structured by a well-developed conceptual framework that focuses on some theoretically specified aspects of reality and neglects others - not testing a hypothesis, purely analytical)

**Independent Variable** The phenomenon thought to influence, affect, or cause some other phenomenon - a factor that we think does the explaining (the cause), x on a graph - comes first in the hypothesis formula

**Indirect Observation** Observation of physical traces of behavior - still firsthand observation

**Informed Consent** Procedures that inform potential research subjects about the proposed research in which they are being asked to participate; the principle that researchers must obtain the freely given consent of human subjects before they participate in the research project

**Institutional Review Board** Panel to which researchers must submit descriptions of proposed research involving human subjects for the purpose of ethics review

**Internal Validity** The ability to show that manipulation or variation of the independent variable actually causes the dependent variable to change - thinking about the cases in the study = confidence in our causal inference that X causes Y; is there another reason? - any form of bias hinders this - the IV ACTUALLY is causing the change in the DV

**Interval-Level Measure** Includes the properties of the nominal level (characteristics are different) and the ordinal level (characteristics can be put in a meaningful order), but unlike nominal and ordinal measures, the intervals between the categories or values assigned to the observations do have meaning - the precise number you use has value - distance between numbers has meaning BUT a value of 'zero' does not mean the absence of the thing measured - temperature (Fahrenheit), temperature (Celsius), pH, SAT score (200-800), credit score (300-850).
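The Frequency Distribution entry above (the number of observations per category) and the bar-style display described under Histogram can be sketched in a few lines. The party-identification responses are invented for illustration, and the text bars made of `#` characters are only a stand-in for a real chart.

```python
from collections import Counter

# Hypothetical responses to a nominal-level survey item (party identification).
party_ids = [
    "Democrat", "Republican", "Independent", "Democrat",
    "Republican", "Democrat", "Independent", "Democrat",
]

# Frequency distribution: number of observations per category.
counts = Counter(party_ids)
total = len(party_ids)

# Percentage distribution: each category's share of all observations.
percentages = {category: 100 * n / total for category, n in counts.items()}

# Crude text bar chart: bar length equals the category's frequency.
for category, n in counts.most_common():
    print(f"{category:<12} {n:>2} {percentages[category]:5.1f}% {'#' * n}")
```

For a continuous variable such as age or income, the same counting idea applies after the values are first grouped into intervals.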
**Intervening Variable** A variable that comes between an independent and dependent variable in an explanatory scheme - your causal logic might help you come up with an intervening variable (X to Y to Z) - something that comes between the independent and dependent variable (the IV leads to something in between before leading to the DV) - does not itself cause an alternative hypothesis, but can influence the formulation of one by introducing a potential mediating mechanism between the independent and dependent variables, essentially explaining how the independent variable might affect the dependent variable through the intervening variable

**Interview Data** Data collected from responses to questions posed by the researcher to the respondent - surveying or face to face - highly structured or less structured discussion - the respondent knows they are being recorded

**Large-N Studies** Quantitative research designs in which the researcher examines many cases of a phenomenon - there are lots of units of analysis - the more data we can collect, the less the results are distorted - a mean from a tiny data set may not reflect the real world, but a large number of data points reduces bias and tells us what's actually going on in the world

**Leading Questions** A question that encourages the respondent to choose a particular response - "You enjoyed the party, didn't you?" implies the person had a good time and pushes them toward a positive response

**Least Likely Case (Hard Case)** A case in which it is expected that a theory is least likely to apply - the case least likely to confirm the hypothesis; most informative when it confirms the hypothesis in spite of the difficulty of the case

**Level of Measurement** Refers to the type of information that we think our measurements contain and the mathematical properties they possess - determines the type of comparisons that can be made across a number of observations on the same variable - important because it dictates what statistical procedures we can use with that variable - (ratio, interval, ordinal, nominal, dichotomous)

**Linear Regression** We are mostly testing to see whether the slope of the line, b, that relates two variables is significantly different from 0 - the most common method of statistical analysis in political science - Key statistic: regression coefficient, denoted by b or β̂ (beta hat) - Existence: is b different from 0? Is p less than .05? - Direction: the positive or negative sign on the coefficient indicates the direction of the relationship - Strength: we use a value called R-squared, but you don't need to worry about it for this class - Size: the value of b is the slope of the line; bigger values indicate a steeper slope or a bigger relationship - we are looking at the correlation between two variables - plot the values on a scatterplot - can calculate the correlation coefficient (relationship + or -) - one explanatory variable

**Literature Review** A systematic examination and interpretation of the literature for the purpose of informing further work on a topic - for you: situate yourself within the context of existing knowledge - for the reader or audience: orient someone into the academic conversation - continue the conversation; do not kill the conversation (with minute details, a play-by-play review, or ignoring others) - analysis is most important, not summary

**Mean** The sum of the values of a variable divided by the number of values - (average) best for interval/ratio-level variables - do not use for nominal-level variables

**Measurement Bias** A type of measurement error that results in systematically over- or underestimating the value of a concept - (validity) - the way the unit is measured leads to an inaccurate result; reduces the internal validity of the finding

**Median** The category or value above and below which one-half of the observations lie - use for ordinal or interval variables, especially if there are outliers - do not use for nominal-level variables

**Method of Agreement** A comparative strategy wherein the researcher selects cases that share the same outcome and identifies those conditions or causal factors that the cases also have in common - cases are different from one another and substantially similar only on the IV and DV the researcher is interested in studying - inverse of an experiment - are there other cases that are the same on the IV but not the DV? - the cases are VERY different but similar only in the IV and DV - usually used for generating hypotheses and theoretical propositions that can be tested using more cases

**Method of Difference** A comparative strategy wherein the researcher selects cases in which the outcomes differ, compares the cases looking for the single factor that the cases do not have in common, and concludes that this factor is causal - cases are similar to each other and differ substantially only on the value of the independent variable that the researcher is interested in studying - meant to mimic an experiment - the cases are extremely similar to one another; the ONLY difference is the IV - usually hypothesis-generating

**Mode** The category with the greatest frequency of observations - best for nominal variables - not very useful for interval/ratio-level variables

**Most Likely Case (Easy Case)** A case in which theory predicts an outcome is most likely to occur - most likely to confirm the hypothesis; most informative when all the conditions (IV) are in place but the outcome is not what is expected, which therefore disproves the hypothesis

**Natural Experiment** A study in which there is random assignment or "as-if" random assignment of units to experimental and control groups, but the researcher does not control the randomization process or the manipulation of the treatment factors - high levels of internal validity - but with no control over the manipulation or application of the treatment variable, it is possible that factors in the environment influenced the dependent variable (lessening internal validity) - high degree of external validity because they take place in the real world

**Negative Relationship** A relationship in which high values of one variable are associated with low values of another variable, or vice versa - more IV = less DV, or less IV = more DV (opposite signs)

**Nominal-Level Measure** Indicates that the values assigned to a variable represent only different categories or classifications for that variable - there are two or more discrete values for the measure - each value represents a separate category - the categories do not have any inherent logical order - gender, race, eye color, blood type, hair color, nationality, marital status, religion, political party affiliation, and shirt size

**Non-Participant Observation** Observation of activities, behaviors, or events in which the researcher does not participate - usually not long lasting

**Non-Probability Sample** A sample for which each element in the total population has an unknown probability of being selected - a sample in which some units, for any reason, were more likely to end up in the study than other units (aka: convenience sample)

**Normal Distribution** A distribution defined by a mathematical formula whose graph has a symmetrical bell shape, in which the mean, mode, and median coincide and in which a fixed proportion of observations lies between the mean and any distance from the mean measured in terms of the standard deviation - bell shaped, roughly symmetric; when observations are concentrated around the center like this, we say that they are normally distributed

**Normative Knowledge/Questions** Knowledge that is evaluative, value-laden, and concerned with prescribing what ought to be - political theory engages in "normative" political science, meaning it is concerned with value judgments about what is best - we want to stay away from this

**Null Hypothesis** A statement that a population parameter equals a single or specific value; the hypothesis that there is no relationship between two variables in the target population - a statement that the difference between two populations is zero - our default assumption, which we want to disprove (usually no relationship between the IV and DV) - we ONLY test the null hypothesis (the options are to reject or fail to reject it; rejecting provides evidence that supports our research hypothesis) - minimum standard for rejecting the null hypothesis: the 0.05 level of significance - we can reject the null hypothesis if the p-value is lower than .05

**Observational Study** A nonexperimental research design in which the researcher simply observes differences in the dependent variable for naturally occurring treatment and control groups - does not manipulate experimental variables or randomly assign subjects to treatment, but just observes causal sequences and covariations - no control over the environment - quantitative

**Open-Ended Questions** A question with no response alternatives provided for the respondent - "What do you think about...", "How did you feel when...", "Why do you think that?", "What are your thoughts on...", "Can you describe your experience with..."

**Operational Definitions** The rules by which a concept is measured and scores are assigned - operational definitions are used to standardize data collection and ensure that measurements are accurate and precise; they are necessary because different people may have different interpretations of the same concept

**Operationalization** The process of assigning numerals or scores to variables to represent the values of a concept - involves identifying indicators that can be used to identify when a variable is present and to measure its magnitude - how exactly are you going to observe or measure your independent and dependent variables? - be incredibly specific (if other people picked up your description, they would all come to the same conclusions you do about the value of the variable for every observation), NOT "you know it when you see it" - Operationalization should be: Valid - measuring what you want it to measure / Reliable - results don't vary; everyone will always get the same results

**Ordinal-Level Measure** Indicates that the values assigned to a variable can be compared in terms of having more or less of a particular attribute - there are two or more discrete values for the measure - each value represents a separate category - the categories have a logical order (a higher category means something, but the number itself is not meaningful) - "education level" (high school, bachelor's, master's), "satisfaction rating" (very dissatisfied, dissatisfied, neutral, satisfied, very satisfied), "clothing size" (small, medium, large), and "Likert scale" responses

**Outlier** A value that is far greater or smaller than the other values of the recorded variable

**Parsimony** The principle that among explanations or theories with equal degrees of confirmation, the simplest (the one based on the fewest assumptions and explanatory factors) is to be preferred - Ockham's razor - the idea that we need to keep empirical studies as pure as possible - test one thing at a time - keep it simple - a characteristic of empirical scientific knowledge - parsimonious (we take the simplest answer) - look at specific characteristics

**Participant Observation** Observation in which the observer becomes a regular participant in the activities of those being observed - actor and spectator

**Peer Review** A vetting process in which multiple anonymous reviewers (also experts)... - you send the paper to a journal to publish; the editor sends it to multiple experts to see what they think; it goes back and forth through revisions until it can be published or is scrapped

**Population** All the cases or observations covered by a hypothesis; all the units of analysis to which a hypothesis applies - we use a sample to make inferences about a population

**Population Parameter** A characteristic or attribute of a population (not a sample) that can be quantified - true population parameter: the mean of the data we would receive from a whole population; we would never know that value - the population size (N) is one parameter, and the mean diastolic blood pressure or the mean body weight of a population would be other parameters that relate to continuous variables

**Positive Relationship** A relationship in which the values of one variable increase or decrease as the values of another variable increase or decrease - more IV = more DV and less IV = less DV (same sign for both variables)

**Precision** The extent to which the measurements are complete and informative - variables can have different 'levels of measurement', which refers to how many values the measure can take and the information we get from those values - a higher level of precision (more information) is preferred - level of measurement is important because it dictates what statistical procedures we can use with that variable

**Pretest** Measurement of variables prior to the administration of the experimental treatment or manipulation of the independent variable - measuring causation - usually a questionnaire

**Primary Data** Data recorded and used by the researcher who is making the observations - an original document or object that provides firsthand information about an event or topic

**Probability Sample** A sample for which each element in the total population has a known probability of being selected - a sample in which every unit in the population has an equal probability of being included (aka: random sample)
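The Linear Regression entry above describes the slope b, its sign (direction), and its size. As a rough sketch, the least-squares slope and intercept for one explanatory variable can be computed by hand as below. The data, variable names, and function name are illustrative assumptions; the significance test on b (is p less than .05) is not shown here.

```python
import math
import statistics


def simple_regression(x, y):
    """Return (intercept, slope, correlation) for a one-variable least-squares fit."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    # Sums of squares and cross-products around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx                    # b: change in y per one-unit change in x
    intercept = mean_y - slope * mean_x  # value of y when x is 0
    correlation = sxy / math.sqrt(sxx * syy)
    return intercept, slope, correlation


# Hypothetical data: hours of political news per week (IV) and a knowledge score (DV).
news_hours = [1, 2, 3, 4, 5]
knowledge = [2, 4, 5, 4, 5]

a, b, r = simple_regression(news_hours, knowledge)
direction = "positive" if b > 0 else "negative" if b < 0 else "none"
print(f"intercept={a:.2f} slope b={b:.2f} correlation r={r:.2f} direction={direction}")
```

Here the slope is positive, so more of the IV goes with more of the DV, which is the Positive Relationship case defined above.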
**Process Tracing** A case study in which a causal mechanism is traced from causal condition to final outcome - examine every tiny detail - who, what, when, where, why - still a scientific process, and can be used for hypothesis generation or hypothesis testing - explicitly unpack mechanisms and engage in detailed empirical tracing of them - if an explanation were true, what would be the specific process leading to the outcome? - often only one case - like a detective sifting through evidence in order to solve a mystery

**Proportionate Sample** A probability sample that draws elements from a stratified population at a rate proportional to the size of the strata - each group or stratum is represented in proportion to its size in the full population (different strata have different numbers of people and so end up with proportionally more or fewer people in the sample)

**P-Value** The probability of seeing your observed data (or something even more extreme) if the "null hypothesis" (the default assumption being tested) is actually true; a low p-value indicates that data like yours would be unlikely if the null hypothesis were true, which is evidence against the null hypothesis, while a high p-value means your data are consistent with the null hypothesis being true

**Qualitative Research** Gathering observations that are not numbers - gather detailed, in-depth data; small N, meaning one or a few cases; relies on the researcher's expertise to interpret and get a complete picture of the phenomenon - trace the temporal sequencing of effects; better at causes-of-effects; rule out alternative hypotheses by evaluating all aspects of the case - outcomes are NOT measured in terms of numbers and statistically compared - the goal of qualitative research is to document the fall of the intermediate dominos: you pick cases with the right first and last dominos and then look for evidence of the dominos in the middle
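The P-Value entry above can be made concrete with a small sketch. It assumes the test statistic is a z-score that follows a standard normal distribution when the null hypothesis is true, and it uses a two-sided test; both are assumptions for illustration, as are the function name and the example z-scores.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability of a z-score at least this extreme if the null hypothesis is true."""
    # Area in both tails of the standard normal beyond |z|.
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical test statistics, invented for illustration only.
for z_score in (0.5, 1.96, 3.0):
    p = two_sided_p_value(z_score)
    decision = "reject the null" if p < 0.05 else "fail to reject the null"
    print(f"z={z_score:<4} p={p:.4f} -> {decision}")
```

A z-score of 1.96 sits almost exactly at the .05 threshold, which is the same 1.96 that appears in the 95% confidence level.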
**Quantitative Research** Observation to numbers to stats - gather data which are expressed as numbers; large N, meaning there are lots of units of analysis; use statistical analysis to look at average patterns - clearly establish patterns and covariation or correlation, better at effects-of-causes, can use control variables to rule out alternative hypotheses

Quantitative approaches typically examine the first and last domino in a chain - use a theory to describe the points in between (causal mechanism) - document the fall of the dominos

Steps in quantitative data collection - ideally a random sample of a population (identify the population and then the procedure for random selection); might be a convenience sample - how do we go from concept to number: specific criteria (content analysis, expert evaluation), exact question wording (survey data), specific statistics to gather (administrative data)

**Quasi Experimental Design** A research design that includes treatment and control groups to which individuals aren't assigned randomly - effects (if any) of treatments have to be inferred, so internal validity is not strong - turn to judgment and assumptions about how consistent the treatment variable was - must look for all other possible effects when other IVs are taken into account

**Quota Sample** A nonprobability sample in which elements are sampled in proportion to their representation in the population - like a stratified sample, except with a convenience sample instead of a random sample

**Randomization** The random assignment of subjects to experimental and control groups - randomization means we can rule out alternative explanations

**Range** The distance between the highest and lowest values, or the range of categories into which observations fall = highest value we observe minus lowest value

**Ratio-level Measure** This type of measurement involves the full mathematical properties of numbers and contains the most possible information about a measured concept - Ratio
level measurements can indicate the absence of a value - the precise number you use has meaning - the distance between numbers has meaning - a value of 'zero' indicates the absence of whatever is being measured - the mean is a better summary than the mode; use a histogram

**Reactivity** Effect of data collection or measurement on the phenomenon being measured

**Relationship** The association, dependence, or covariance of the values of one variable with the values of another variable - good theories clarify this - can be positive, negative, or nonlinear

**Reliability** The consistency of results from a procedure or measure in repeated tests or trials - can you replicate this study easily enough? - the consistency of results from a measure when repeatedly observed

**Research Design** A plan specifying how the researcher intends to fulfill the goals of the study; a logical plan for testing the hypothesis **look into textbook**

**Research Hypothesis** A statement about the value or values of a population parameter - a hypothesis proposed as an alternative to the null hypothesis, represented by Ha - Alternative hypothesis (aka research hypothesis): the hypothesis that describes the relationship that we actually think exists

**Response Rate** The proportion of respondents selected for participation in a survey who actually participate

**Running Record** Materials or data collected across time - administrative data (from the running record)

**Sample** A subset of observations or cases drawn from a specified population - we use a sample to make inferences about a population = the units we actually observe and analyze

**Sample Bias** The bias that occurs whenever some elements of a population are systematically excluded from the sample - usually due to an incomplete sampling frame or a nonprobability method of selecting elements - the units in the study are not reflective of the bigger population, which reduces the generalizability (external validity) of the findings - systematic
differences between your sample or sampling frame and your population

**Sampling Distribution** A theoretical (non-observed) distribution of sample statistics calculated on samples of size N that, if known, permits the calculation of confidence intervals and the testing of statistical hypotheses - if you were to take many random samples from the population, you could plot the distribution of the sample statistics and it would form a new distribution

**Sampling Error** The difference between a sample estimate and a corresponding population parameter that arises because only a portion of a population is observed - sampling/statistical uncertainty - random error that arises from observing a sample instead of a whole population; causes uncertainty but not bias

**Sampling Frame** The population from which a sample is drawn; ideally it is the same as the total population of interest to a study - a list of units from which you could draw your sample (may or may not match the full population)

**Secondary Data** Data used by researchers that were not personally collected by that researcher

**Simple Random Sample** A probability sample in which each element has an equal chance of being selected - you must have a sampling frame and randomly select units from the frame for inclusion

**Skewed Distribution**

**Small N Studies** Research designs in which the researcher examines one or a few cases of a phenomenon in considerable detail - qualitative research

**Spurious Relationship** Describes two variables that move together but are affected by a third variable - can happen because we did not rule out alternative explanations, or because the correlation exists by random chance

Biggest challenge = how to rule out all the other possible explanations - to protect against the chance of spurious correlation, our inference must be theory-driven and we must consider alternative explanations, also by setting up a good research design

**Standard Deviation** A measure of dispersion of data points about the mean
for interval- and ratio-level data = average distance between the midpoint of the sample (the mean) and an observation - on average, how far away is every observation from the mean

**Statistical Inference** The mathematical theory and techniques for making conjectures about the unknown characteristics (parameters) of a population based on samples

**Statistical Significance** The probability of making a type 1 error - convention for testing hypotheses, focusing on the probability of making a type 1 error - most common is .05 (it is up to the researcher to decide how great a chance of making a type 1 error is acceptable) - the particular value found will occur by chance at most only 5 percent of the time if the null hypothesis is true

**Stratified Sample** A probability sample in which elements sharing one or more characteristics are grouped and elements are selected from each group in proportion to the group's representation in the total population - you select cases from strata (groups) within the full population (US states, genders, political parties, regions of the world)

**Theory** A statement or series of related statements that organize, explain, and predict phenomena - a reasoned and precise speculation about the answer to a research question, including a statement about why the proposed answer is correct

Thesis vs.
Theory - what you think the causal relationship is; the rest of the paper will explain why that is - thesis: a one-sentence claim as to what your theory states

A theory is a general explanation of what relationship we expect to see and why - tells us where to look for evidence that we are right or wrong - a hypothesis is an observable implication of a theory

Theory section = lay out causal logic and define key concepts

**Transmissible Knowledge** Indicates that the methods used in making scientific discoveries are made explicit so that others can analyze and replicate findings - characteristic of scientific knowledge

**Treatment (experimental) Group** A treatment group, also known as an experimental group, is a group of subjects in a research study that receives a treatment or intervention that the researcher is interested in studying - the treatment is the independent variable that the experimenters manipulate

**Trend Analysis** A research design that measures a dependent variable at different times and attempts to determine whether the level of the variable is changing and, if it is, why - plot an appropriate measure of the dependent variable at different times

**Two-Sided Question** A question with two substantive alternatives provided for the respondent - "Do you enjoy math class and do you think it's important?" "Do you think we should focus on improving engagement and job satisfaction at our company?" "Are you hungry or thirsty?" "Do you want coffee and breakfast?"
**Type 1 Error** Error made by rejecting a null hypothesis when it is true - the alpha level is the threshold at which we reject or fail to reject the null, commonly 0.05 - if the p-value is less than 0.05, the result is unlikely to be due to chance alone and we reject the null - if it is more than 0.05, we cannot reject the null

**Type 2 Error** Error made by failing to reject a null hypothesis when it is not true

**Uniform Distribution**

**Unit of Analysis** The type of actor (individual, group, institution, nation) specified in a researcher's hypothesis - units of analysis (the units you are going to analyze): what are you comparing in your hypothesis, and what are the things you are going to gather data about?

**Validity (of a measure)** Refers to the degree of correspondence between the measure and the concept it is thought to measure - does it accurately capture the thing we care about?

**Verification** The process of confirming or establishing a statement with evidence

**Z-Score** The number of standard deviations by which a score deviates from the mean score = number of standard deviations away from the mean - because the standard deviation is a consistent unit, we can calculate how far an observation is from the mean relative to other observations and make comparisons across distributions with different scales - measures individual points - how many standard deviations away is this specific point from the mean?
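The Standard Deviation and Z-Score entries can be checked by hand with a short Python sketch. The scores below are made up for illustration, `z_score` is a hypothetical helper name, and the sketch assumes the population standard deviation (`statistics.pstdev`); a course may instead use the sample version (`statistics.stdev`), which divides by n - 1.

```python
import statistics

# Hypothetical test scores (made-up data for illustration).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean_score = statistics.mean(scores)   # midpoint of the data
std_dev = statistics.pstdev(scores)    # typical distance from the mean


def z_score(value, mean, sd):
    """Number of standard deviations by which value deviates from the mean."""
    return (value - mean) / sd


print(mean_score)                       # → 5
print(std_dev)                          # → 2.0
print(z_score(9, mean_score, std_dev))  # → 2.0  (two SDs above the mean)
print(z_score(2, mean_score, std_dev))  # → -1.5 (one and a half SDs below)
```

Because a z-score is expressed in standard deviations rather than raw units, a score of 9 here can be compared with an observation from a distribution measured on a completely different scale.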