Dr. Boshra Lectures: Statistics in Scientific Research (PDF)

Dr. Boshra

Summary

These lecture notes, titled "Statistics as a Tool in Scientific Research," cover various topics related to data analysis and presentation, focusing on the importance of data quality, concepts of data and information, statistical methods, and system analysis. The lecture materials aim to equip participants with essential skills in summarizing and representing scientific data, emphasizing applications in environmental research.

Full Transcript

Presentation 1

Statistics as a Tool in Scientific Research: Summarizing & Graphically Representing Data

Contents: Importance of Data Quality, Concepts of Data & Information, Statistical Methods, and System Analysis Principles for Solving Environmental Problems

Importance of Data Quality

Definition: Data quality refers to the accuracy, consistency, completeness, and reliability of data used for decision-making.

Why it Matters?
- Informed Decision-Making: High-quality data ensures decisions are based on reliable, accurate information.
- Accuracy of Models: Environmental models rely on precise data for predicting climate change, pollution levels, etc.
- Policy Formulation: Governments and organizations depend on accurate data to create effective environmental regulations.

Key Dimensions of Data Quality
- Accuracy: Correct representation of real-world conditions.
- Consistency: Uniform data across databases or time periods.
- Completeness: No missing data points.
- Timeliness: Data availability when needed.
- Relevance: Data directly applicable to the problem at hand.

Concepts of Data and Information
- Data: Raw facts, numbers, or symbols without context (e.g., CO levels, temperature readings).
- Information: Processed, contextualized data that has meaning (e.g., CO levels are rising compared to last year).

From Data to Information:
- Collection: Gathering raw data through surveys, sensors, etc.
- Processing: Organizing, cleaning, and transforming data.
- Interpretation: Analyzing data to produce actionable insights.

The Role of Data in Environmental Solutions
- Monitoring: Data from sensors (air, water, soil) helps track environmental changes.
- Modeling: Simulations of ecosystems, pollution, and climate depend on high-quality data.
- Evaluation: Success of environmental policies or conservation efforts can be assessed through reliable data.
- Public Awareness: Data-driven information (e.g., pollution levels) educates and mobilizes public action.
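As a rough illustration of the completeness dimension and of the step from data to information, the sketch below checks how many readings are present and then turns two sets of raw readings into a contextual statement. The readings, the use of None for a missing value, and the function names completeness and summarize_change are assumptions made for this example; they do not come from the lecture.

```python
from statistics import mean


def completeness(readings):
    """Return the fraction of readings that are present (not None)."""
    if not readings:
        return 0.0
    return sum(r is not None for r in readings) / len(readings)


def summarize_change(this_year, last_year):
    """Compare the mean of two sets of readings, ignoring missing values.

    Returns a (direction, difference) pair, which is the "information"
    derived from the raw "data".
    """
    current = mean(r for r in this_year if r is not None)
    previous = mean(r for r in last_year if r is not None)
    if current > previous:
        direction = "rising"
    elif current < previous:
        direction = "falling"
    else:
        direction = "unchanged"
    return direction, current - previous


# Hypothetical CO readings in arbitrary units; None marks a missing value.
last_year = [0.8, 0.9, None, 1.0]
this_year = [1.1, None, 1.2, 1.3]

print("completeness:", completeness(this_year))
direction, difference = summarize_change(this_year, last_year)
print("CO levels are", direction, "compared to last year")
```

Dropping missing values before averaging is only one possible processing choice; a real analysis would also need to ask why the values are missing.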
Definitions
- Data Quality: The foundation for accurate decision-making and effective solutions.
- Information: Turning raw data into insights that guide environmental policies.
- Statistical Methods: Essential tools for analyzing environmental data and predicting outcomes.
- System Analysis: Helps understand and solve complex environmental issues by modeling interconnected factors.

Introduction to Statistical Methods
- Descriptive Statistics: Summarizes data (mean, median, mode, standard deviation).
- Inferential Statistics: Draws conclusions about a population based on sample data (hypothesis testing, regression).

Statistical Models:
- Regression Analysis: Predict environmental trends (e.g., temperature rise).
- Time Series Analysis: Study changes in environmental data over time (e.g., annual CO₂ emissions).
- Multivariate Analysis: Analyze multiple factors impacting ecosystems (e.g., pollution, biodiversity).

Example: Using Statistics to Solve Environmental Issues

Case Study: Predicting Air Quality Index (AQI)
- Data Collection: Air pollution levels (PM2.5, NO₂, CO) collected from sensors.
- Regression Analysis: Analyzing how traffic patterns and weather conditions impact AQI.
- Outcome: Predictive models help create early warning systems for smog in urban areas.

Structure of Environmental Data Systems

Data Sources:
- Sensors: Air, water, and soil quality sensors.
- Satellite Data: Remote sensing for tracking deforestation, ice melt, etc.
- Citizen Science: Public contributions via apps and surveys.

Data Processing: Collection, cleaning, and integration of data from different sources.

Database Systems: Structured storage (e.g., relational databases) ensuring data availability and security.

Quick reminders
- What is the difference between the mode and the median?
- Why is system analysis important for decision makers?
- What is the difference between monitoring and modelling?

System Analysis Principles

Definition: System analysis involves examining how components of a system interact to achieve a goal.

Principles:
- Holistic View: Understand the system (e.g., an ecosystem) as a whole, not just its parts.
- Interconnectedness: Recognize the relationships between system elements (e.g., how pollution in water affects biodiversity).
- Feedback Loops: Identify processes that amplify or dampen system behaviors (e.g., climate feedback mechanisms).

Applying System Analysis to Environmental Problems

Case Study: Water Resource Management
- Problem: Water scarcity due to overuse and climate change.
- System Components: Water usage (agriculture, industry), rainfall patterns, water quality.
- Analysis: Modeling demand and supply, identifying critical thresholds, and designing interventions.
- Outcome: Improved water allocation policies, sustainable irrigation practices, and conservation efforts.

Example: Solving Environmental Problems with Systems Thinking

Problem: Urban Heat Island Effect
- Data Sources: Temperature data, satellite imagery, vegetation indices.
- System Components: Urban infrastructure, green spaces, energy usage.
- Analysis: Identifying key factors contributing to heat accumulation.
- Solution: Urban planning strategies (e.g., increasing green spaces) to reduce heat and improve air quality.

Biostatistics

It is the science that deals with the development and application of the most appropriate methods for:
- Collection of data.
- Presentation of the collected data.
- Analysis and interpretation of the results.
- Making decisions on the basis of such analysis.

What are the sources of data?

Sources of data
- Records
- Surveys
  - Comprehensive
  - Sample
- Experiments

Statistics as a Tool in Scientific Research

Types of Research Questions
- Descriptive (What does X look like?)
- Correlational (Is there an association between X and Y? As X increases, what does Y do?)
- Experimental (Do changes in X cause changes in Y?)

Different statistical procedures allow us to answer the different kinds of research questions.

Start with the science and use statistics as a tool to answer the research question.

First formulate a research question:
- How often does this happen? (Frequency)
- Did all plants/people/chemicals act the same? (Replica)
- What happens if (Testing):
  - I add more sunlight?
  - I pour in more water?
  - I add more fertilizers?
  - I capture animals?

Types of data
- Constant
- Variables

Where the data come from:
- Can collect data
- Can use already collected data (database)

Research question: what would be interesting to know? What do they want to find out? What do they expect? For example, what research questions might you ask from the survey? Descriptive, correlational, experimental.

Types of Data: Measurement Scales

Categorical:
- male/female
- blood type (A, B, AB, O)
- Stage 1, Stage 2, Stage 3 melanoma

Numerical:
- weight
- number of white blood cells
- number of students

Categorical scales:
- Nominal (name/label)
- Ordinal (rank order)

Numerical scales:
- Interval (equal intervals)
- Ratio (equal intervals and absolute zero)

- Nominal: numbers are arbitrary; 1 = male, 2 = female.
- Ordinal: numbers have order (i.e., more or less) but you do not know how much more or less; the 1st place runner was faster, but you do not know how much faster than the 2nd place runner.
- Interval: numbers have order and equal intervals, so you know how much more or less; a temperature of 102 is 2 points higher than one of 100.
- Ratio: same as interval, but because there is an absolute zero you can talk meaningfully about twice as much and half as much; weighing 200 pounds is twice as heavy as 100 pounds.

Types of variables

Type            Subtype       Example
Quantitative    Continuous    1, 1.01, 1.02
Quantitative    Discrete      1, 2, 3, etc.
Qualitative     Nominal       names, etc.
Qualitative     Ordinal       1st, 2nd, etc.

Methods of presentation of data
- Numerical presentation
- Graphical presentation
- Mathematical presentation

Types of Statistical Procedures
- Descriptive: Organize and summarize data.
- Inferential: Draw inferences about the relations between variables; use samples to generalize to the population.

Descriptive Statistics

The first step is ALWAYS getting to know your data: summarize and visualize it. It is a big mistake to just throw numbers into the computer and look at the output of a statistical test without any idea what those numbers are trying to tell you, or without checking whether the assumptions for the test are met.

Numerical Summaries:
- Frequencies
- Contingency tables
- Measures of central tendency
- Measures of variability
- Representing numerical summaries in tables

Graphical Summaries:
- Bar graphs or pie graphs
- Histograms
- Scatterplots
- Time series plots

Summarizing and Reporting Categorical Data
- Frequency = number of times each score occurs in a set of data
- Relative Frequency = percent or proportion of times each score occurs in a set of data

1- Numerical presentation

Tabular presentation (simple, complex)

Simple frequency distribution table (S.F.D.T.), general layout:

Title
Name of variable (units of variable)    Frequency    %
- Categories -
Total

Table (I): Distribution of 50 patients at the surgical department of Alexandria hospital in May 2023 according to their ABO blood groups

Blood group    Frequency    %
A              12           24
B              18           36
AB             5            10
O              15           30
Total          50           100

Table (II): Distribution of 50 patients at the surgical department of Alexandria hospital in May 2023 according to their age

Age (years)    Frequency    %
20-
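The percentages in Table (I) are relative frequencies: each count divided by the total, times 100. The short sketch below recomputes them from the counts given in that table. The function name frequency_table and the print layout are illustrative choices, not part of the lecture.

```python
def frequency_table(counts):
    """Build a simple frequency distribution from category counts.

    Returns a dict mapping category -> (frequency, percent) and the total.
    """
    total = sum(counts.values())
    table = {cat: (n, 100 * n / total) for cat, n in counts.items()}
    return table, total


# Counts taken from Table (I): ABO blood groups of 50 patients.
blood_groups = {"A": 12, "B": 18, "AB": 5, "O": 15}

table, total = frequency_table(blood_groups)

print(f"{'Blood group':<14}{'Frequency':>10}{'%':>6}")
for group, (freq, pct) in table.items():
    print(f"{group:<14}{freq:>10}{pct:>6.0f}")
print(f"{'Total':<14}{total:>10}{100:>6}")
```

Checking that the percent column sums to 100 is a quick way to catch a miscounted category.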

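The measures of central tendency and variability named in the lecture (mean, median, mode, standard deviation) can be computed with Python's standard statistics module. The sketch below uses a small made-up sample, not data from the lecture, chosen so that the mode and the median differ. The helper name describe is an assumption for this example.

```python
from statistics import mean, median, multimode, stdev


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "mean": mean(values),         # arithmetic average
        "median": median(values),     # middle value of the sorted data
        "mode": multimode(values),    # most frequent value(s), as a list
        "sd": stdev(values),          # sample standard deviation (n - 1)
    }


# Hypothetical daily temperature readings (degrees Celsius).
readings = [21, 22, 22, 24, 26, 27, 33]

summary = describe(readings)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the mode (22) is the value that occurs most often, while the median (24) is the middle value once the readings are sorted; the single high reading pulls the mean (25) above both.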