Introduction to Basic Concepts in Statistics

University of Medical Sciences and Technology (UMST)

Summary

This document introduces basic concepts in statistics, including descriptive and inferential statistics, data types (categorical and numerical), and measurement scales. It covers key definitions such as population, sample, parameter, and statistic, exploring data sources and providing exercises to apply the concepts. The document is a useful aid in understanding the foundation of statistics.

Full Transcript

**Biostatistics Module**

**Unit 1: Introduction to Basic Concepts**

**Statistics**

- The term statistics, derived from the word state, was used to refer to a collection of facts of interest to the state.
- **Statistics can be defined as**:
  - A way to get information from data. It is concerned with the collection of data, its subsequent description, and its analysis, which often leads to the drawing of conclusions.
  - A theory of decision making.

**Branches of Statistics**

1. **Descriptive Statistics**
   - Deals with methods for organizing, summarizing, and presenting data in a convenient and usable form.
   - Uses graphical techniques to present data.
   - Uses numerical techniques such as measures of central location (mean, median, mode) and measures of variability (range, variance, standard deviation) to summarize data.
   - Steps: collect data (e.g., a survey), then present data (e.g., tables and graphs).
2. **Inferential Statistics**
   - Consists of methods for drawing conclusions about characteristics of a population based on information contained in a sample.
   - Because a conclusion drawn from a sample may be wrong, a measure of reliability (confidence level and significance level) is built into the statistical inference.
   - Sample results can be used to make estimates and test hypotheses about population characteristics.

**The Role of Probability**

- In inferential statistics, we build a statistical model (margin of error, confidence interval, level of significance) to provide a statement of quality.
- Probability and sampling theory supply the link between sample and population: data are gathered from a sample, descriptive statistics are computed from them, and inferences are then based on those statistics.

**Why Statistics?**

- To know how to properly present information.
- To draw conclusions about populations based on sample information.
- To improve processes.
- To obtain reliable forecasts.
- To provide input to a study.
- To measure the performance of a service or production process.
- To evaluate conformance to standards.
- To assist in formulating alternative courses of action.
- To satisfy curiosity.

**Key Definitions**

- **Population**: The collection of all things under consideration.
- **Sample**: A portion of the population selected for analysis.
- **Parameter**: A value that describes a characteristic of a population.
- **Statistic**: A value that describes a characteristic of a sample, that is, of the part of the population selected for analysis.
- **Variable**: A characteristic of interest for the elements of a population or a sample.
- **Data**: Observed values of a variable, collected, summarized, and interpreted.
- **Time series data**: Data set containing observations on a single phenomenon observed over multiple successive points in time.
- **Cross-sectional data**: Data set containing observations on multiple phenomena observed at a single point in time.

**Data Sources**

- **Census**: Complete enumeration of an entire population of statistical units.
  - **Advantages**: Provides the most reliable statistics if done professionally.
  - **Disadvantages**: Costly, and results are slow to become available (low timeliness).
- **Surveys**: Based on a scientifically selected random sample from a population.
  - **Advantages**: More up-to-date, less time-consuming, and less costly than a census.
  - **Disadvantages**: Requires prompt data processing.
- **Experimental Studies**: Identify variables of interest and control factors to obtain data.
- **Observational (Non-experimental) Studies**: Make no attempt to control or influence the variables of interest.
- **Existing Sources**: Data might already exist within a firm or be available from government agencies.

**Types of Data**

- **Categorical (Qualitative)**: Labels or names used to identify an attribute of each element.
  - **Examples**: Gender, color.
- **Numerical (Quantitative)**: Numeric values indicating how many or how much.
  - **Discrete Data**: Possible values are isolated from one another (e.g., number of children).
  - **Continuous Data**: No separation between possible values (e.g., test scores).

**Measurement Scales of Data**

- **Nominal Scale**: Data consist of labels or names without meaningful order.
  - **Examples**: Gender, religion.
- **Ordinal Scale**: Data have the properties of nominal data, plus a meaningful order.
  - **Examples**: Education level, professional rank.
- **Interval Scale**: Data have the properties of ordinal data, plus meaningful intervals between values.
  - **Example**: Temperature (in degrees Celsius or Fahrenheit).
- **Ratio Scale**: Data have all the properties of interval data, plus a meaningful zero point.
  - **Examples**: Distance, height, weight, time.

**Exercises**

1. A psychologist has interviewed 250 out of 1507 school children throughout New York State and found that 80% spend at least 25 hours a week watching television.
   - a) Identify the population and the sample.
   - b) Identify the population parameter and the sample statistic of interest.
2. Determine whether the data type is quantitative, qualitative, or ranked.
   - a) The weekly level of the prime interest rate during the past year.
   - b) The name of the car driven by executives.
   - c) The number of contacts made by each salesperson during a week.
   - d) The rating (excellent, good, fair, or poor) given to performance evaluation.
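
As a supplement to the unit, here is a minimal Python sketch of two ideas named above: the numerical techniques of descriptive statistics (mean, median, mode, range, variance, standard deviation) and the margin of error behind a confidence interval. The `children` data set and the function names `describe` and `proportion_interval` are hypothetical and not part of the module. The interval uses the common normal-approximation formula at a 95% confidence level (z = 1.96). It is applied to the figures from Exercise 1 on the assumption that the 250 children form a simple random sample.

```python
import math
import statistics

# Hypothetical sample (not from the module): number of children
# in 9 surveyed households. This is discrete numerical data.
children = [0, 1, 1, 2, 2, 2, 3, 3, 4]


def describe(values):
    """Summarize a sample with measures of central location and variability."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        # Sample variance and standard deviation (divisor n - 1).
        "variance": statistics.variance(values),
        "std_dev": statistics.stdev(values),
    }


def proportion_interval(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a population proportion.

    p_hat is the sample proportion (a statistic), n is the sample size,
    and z = 1.96 corresponds to a 95% confidence level.
    Assumes a simple random sample.
    """
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin


# Descriptive statistics: mean 2, median 2, mode 2, range 4, variance 1.5.
summary = describe(children)
print(summary)

# Inferential statistics, using the figures from Exercise 1:
# 80% of a sample of 250 children. The margin of error is about 0.05.
low, high, margin = proportion_interval(0.80, 250)
print(f"95% interval: {low:.3f} to {high:.3f} (margin of error {margin:.3f})")
```

The first call only describes the sample at hand. The second makes a statement about the population parameter and attaches a measure of reliability to it, which is the distinction the unit draws between descriptive and inferential statistics.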