Chapter 1 - Terminologies
Mathematics Department
Dr. Tharu
Summary
This document introduces basic statistical terminology and sampling methods. It defines key concepts such as population, sample, parameter, and statistic, distinguishes descriptive from inferential statistics, and outlines the fundamentals of sampling for data analysis.
Contents
Introduction
Terminologies
Sampling

Introduction

Statistics is the science of gathering, organizing, analyzing, and interpreting data (information). Statistics is the art of decision making in the presence of uncertainty.

Biostatistics: statistics in which the data analyzed are derived from the biological sciences and medicine.

Terminologies

A population, of size N, is the group of individual persons, objects, or items whose characteristics one wishes to better understand and from which samples are taken for statistical measurement.
– The population data is the complete collection of information from all of the individuals or subjects of interest in a given study.
– A census is a survey of every individual in the population; the information gathered is called the population data.

A sample, of size n, is a partial collection of information for only some of the individuals or subjects of interest in a given study.
– The sample size, n, is the number of observations that constitute the sample. Whereas the size of a population is denoted by the capital letter N, the size of the sample is denoted by the lowercase letter n.

A parameter is a numerical measure that describes a specified characteristic of the population, such as central tendency (mean, median, mode, and proportion), spread (range, variance, and standard deviation), and shape (symmetric and skewed).

A statistic is a numerical measure that yields an estimate of a population parameter; that is, a numerical measure that uses the data from the sample to estimate the specified characteristic of the population.

Descriptive statistics involves methods of organizing and summarizing information (data) and presenting it numerically or visually (graphically).
1. Stem-and-leaf plots
2. Frequency tables
3. Contingency tables
4. Bar charts
5. Pie charts and histograms
6. Scatter plots

Inferential statistics involves methods of analyzing and interpreting descriptive statistics to draw conclusions about a particular characteristic of the population with a certain degree of assurance, based on a preset level of significance and specified assumptions.
1. Confidence intervals
2. Hypothesis testing

Individuals are the people, places, or things included in a study and for which information is gathered. In medical research studies, individuals are referred to as the subjects of the study.

A variable is a distinct characteristic of an individual to be observed and/or measured. The observed data can be qualitative or quantitative.
1. Qualitative variables (categorical variables): a variable that describes the individual by placing the individual into a category or group.
2. Quantitative variables (numerical variables): a variable that takes on a real value or numerical measurement for which operations such as sums, differences, and ratios have meaning.

Random variable
– Discrete random variable
  ∗ The set of possible observed outcomes is separate and distinct, and finite or countably infinite, such as a count.
  ∗ The outcomes can be enumerated: one, two, three, etc.
– Continuous random variable
  ∗ The set of possible observed outcomes is infinite and uncountable.
  ∗ The outcomes are dense; that is, between any two values (outcomes) there exists another value (outcome).

Sampling

Simple random sample
– A simple random sample is a sample selected so that every possible sample of size n has the same chance of being selected from the population.
– Each individual has an equally likely chance of being selected, and
– all groups of size n have an equally likely chance of being selected.
– Toss a coin (10 tosses; the output varies from run to run)
    sample(c("H","T"), 10, replace=TRUE)
    "H" "T" "T" "H" "T" "T" "T" "H" "H" "T"
– Roll a die (10 rolls)
    sample(1:6, 10, replace=TRUE)
    6 3 6 6 4 3 6 5 2 3

Cluster sampling
– Groups are selected from pre-existing groupings that are arbitrary with respect to the individual and not based on any characteristic of the individual.
  ∗ In the country, by region
  ∗ In the state, by zip code
  ∗ In the state or nation, by area code

Systematic sampling
– Every kth individual or item is measured or selected.
  ∗ Every 3rd: 1, not 2, not 3, 4, not 5, not 6, 7, ...
  ∗ Every 5th: 1, 6, 11, 16, ... or 2, 7, 12, 17, ... or 5, 10, 15, 20, ..., etc.

Stratified sampling
– Individuals are first grouped by a specific characteristic, such as gender, and then samples are taken from each group (stratum).
  ∗ Individuals grouped by gender
  ∗ Individuals grouped by age
  ∗ Individuals grouped by race

Convenience sampling
– The researcher collects sample data based on his or her own convenience.

Multistage sampling
– More than one sampling technique is used.
– Used in complex studies.
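The distinction between a parameter and a statistic, and the idea of a simple random sample, can be sketched in R roughly as follows. The population here is simulated purely for illustration; the names population, mu, smp, and x_bar and the sizes N = 1000 and n = 25 are assumptions, not values from these notes.

```r
# Hypothetical population of N = 1000 measurements (simulated, not real data).
set.seed(3)                                # fixed seed so the run is repeatable
population <- rnorm(1000, mean = 50, sd = 10)
N <- length(population)

# Parameter: computed from the full population (what a census would give).
mu <- mean(population)

# Simple random sample of size n, drawn without replacement so that
# no individual is selected twice.
n <- 25
smp <- sample(population, n, replace = FALSE)

# Statistic: the sample mean, used as an estimate of the parameter mu.
x_bar <- mean(smp)
c(parameter = mu, statistic = x_bar)
```

Unlike the coin and die examples, which use replace=TRUE to simulate repeated independent trials, this sketch uses replace=FALSE because each individual in a population would normally be sampled at most once.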
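Cluster sampling might look like the sketch below, which assumes the simplest one-stage variant: a few whole clusters are chosen at random and every individual in them is included. The data frame people and the zip labels z1 to z6 are hypothetical.

```r
# Hypothetical population: 30 individuals spread evenly over 6 zip codes.
people <- data.frame(
  id  = 1:30,
  zip = rep(c("z1", "z2", "z3", "z4", "z5", "z6"), each = 5)
)

set.seed(4)                                # fixed seed so the run is repeatable

# Choose 2 of the pre-existing clusters at random ...
chosen_zips <- sample(unique(people$zip), 2)

# ... and keep every individual who belongs to a chosen cluster.
cluster_sample <- people[people$zip %in% chosen_zips, ]
nrow(cluster_sample)   # 10: two clusters of five individuals each
```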
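The "every kth individual" rule for systematic sampling can be illustrated with a short R sketch. It assumes a population labeled 1 to 20 and an interval of k = 5, with a randomly chosen starting point among the first k individuals; these values are illustrative only.

```r
# Hypothetical population of 20 labeled individuals.
population <- 1:20
k <- 5                                     # sampling interval: every 5th individual

set.seed(1)                                # fixed seed so the run is repeatable

# Pick a random starting point among the first k individuals,
# then step through the population in increments of k.
start <- sample(1:k, 1)
systematic <- population[seq(from = start, to = length(population), by = k)]
systematic
```

A start of 1 gives 1, 6, 11, 16; a start of 2 gives 2, 7, 12, 17; a start of 5 gives 5, 10, 15, 20, matching the patterns listed for "every 5th".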
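Stratified sampling can be sketched in R by splitting the individuals into strata and drawing a simple random sample within each one. The data frame people, the gender labels, and the choice of 3 individuals per stratum are assumptions made for the example.

```r
# Hypothetical data: 12 individuals with one grouping characteristic.
people <- data.frame(
  id     = 1:12,
  gender = rep(c("F", "M"), times = c(7, 5))
)

set.seed(2)                                # fixed seed so the run is repeatable
n_per_stratum <- 3                         # assumed number drawn from each stratum

# split() forms the strata; within each stratum, draw a simple random
# sample of rows without replacement.
strata <- split(people, people$gender)
picked <- lapply(strata, function(s) s[sample(nrow(s), n_per_stratum), ])

# Combine the per-stratum samples into one stratified sample.
stratified <- do.call(rbind, picked)
table(stratified$gender)
```

Drawing the same number from every stratum is only one possible design; sampling in proportion to stratum size is another common choice.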