Statistical Modeling and Machine Learning Intro

Questions and Answers

The scientific method is a systematic way of solving a problem. Which stage in the scientific method uses background knowledge to provide a temporary explanation for a problem?

  • hypothesis (correct)
  • conclusion
  • independent variable
  • problem

Temperature is a measure of the average kinetic energy of the particles in matter. What is the standard (SI) unit of measurement for temperature?

  • calories
  • Kelvin (correct)
  • Celsius
  • Fahrenheit

How many milligrams are there in 5.78 decigrams?

  • 578 (correct)
  • 57.8
  • 0.0578
  • 0.578

Significant figures are an important part of scientific and mathematical calculations because they deal with the accuracy and precision of numbers. How many significant figures are there in 507.000 m/s?

  • 6 (correct)

What is the standard form of $6.7 \times 10^{-5}$?

  • 0.000067 (correct)

What do you call the branch of Science that studies the Earth's interior and composition?

  • Geology (correct)

Gas-rich magma reached Pinatubo's surface on June 15, 1991. The volcano exploded in a violent eruption that ejected more than 1 cubic mile of material. As which type of rock would the debris from the last eruption of Mt. Pinatubo most likely be classified?

  • Igneous (correct)

An earthquake with a reported magnitude of 5.7 occurred in a city. Which instrument is used to measure such earthquake movement?

  • Seismograph (correct)

The atmosphere is an important part of what makes Earth liveable. In which layer of the atmosphere do weather disturbances probably occur?

  • troposphere (correct)

DOST researchers are planning to set up a location that will measure the relative humidity of rural regions. What instrument should they use?

  • hygrometer (correct)

Energy is broadly classified into two main groups: renewable and non-renewable. Which is not an example of non-renewable energy?

  • wind (correct)

An eclipse occurs when one celestial body blocks the light of another. What do you call the type of eclipse in which the Moon is between the Earth and the Sun?

  • solar (correct)

Which is not influenced by the Earth's revolution?

  • occurrence of day and night (correct)

What do you call the fragment of a comet or an asteroid that has entered the Earth's atmosphere?

  • meteor (correct)

Many years ago, people liked to make up stories about constellations. These constellations are groups of stars that form a particular shape in the sky and have been given a name. The following analogies about constellations are correct except _____.

  • Cygnus : The Dove (correct)

A plant cell is a eukaryotic cell that shares some characteristics with that of an animal cell. Which organelle is not included in a plant cell?

  • centriole (correct)

Muscle contraction is the generation of tension in muscle tissue, often resulting in a lengthening or shortening of muscles. Which organelle is needed in muscle contraction?

  • mitochondrion (correct)

Which cell transport explains the swelling of potato cells submerged in distilled water?

  • hypoosmosis (correct)

A liver cell sample is obtained and mounted onto a microscope. Chromosomes are beginning to uncoil and the cytoplasm is starting to divide. Which stage of cell division is most probably described?

  • telophase (correct)

Siblings, though they come from the same parents, are not identical to each other. Which stage of cell division is mainly responsible for genetic variability?

  • prophase I (correct)

Metabolism refers to the chemical reactions that occur within living organisms to maintain life. It has two types: anabolism, which builds complex molecules and uses energy, and catabolism, which breaks molecules down and releases energy. The following are examples of anabolic processes except _____.

  • cell respiration (correct)

Flashcards

What is a hypothesis?

A temporary explanation for a problem, based on background knowledge.

Standard unit of temperature

Kelvin (K) is the standard unit for measuring temperature, representing absolute thermodynamic temperature.

Milligrams in 5.78 decigrams

There are 578 milligrams in 5.78 decigrams.
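
As a quick check of the arithmetic, the conversion can be written in one step using the metric prefixes deci- ($10^{-1}$) and milli- ($10^{-3}$), which give 1 dg = 100 mg:

```latex
5.78\ \text{dg} \times \frac{100\ \text{mg}}{1\ \text{dg}} = 578\ \text{mg}
```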

Significant figures in 507.000

In 507.000 m/s, there are six significant figures. Significant figures include all non-zero digits, zeros between non-zero digits, and trailing zeros after a decimal point.

Standard form of 6.7 × 10⁻⁵

6.7 × 10⁻⁵ in standard form is 0.000067.

Study of Earth's interior

Geology is the science that studies the Earth's interior and composition.

Classification of Mt. Pinatubo debris

The volcanic debris from the eruption of Mt. Pinatubo is classified as Igneous rock.

Measuring earthquake movement

A seismograph is used to measure earthquake movement.

Weather disturbances layer

Weather disturbances probably occur in the troposphere.

Instrument to measure humidity

A hygrometer is used to measure relative humidity.

Renewable energy

Wind is an example of renewable energy.

Type of eclipse with the Moon between the Earth and the Sun

A solar eclipse occurs when the Moon is in between the Earth and Sun.

Not influenced by Earth's revolution

Occurrence of day and night is NOT influenced by the Earth's revolution.

Fragment entering Earth's atmosphere

A meteor is a fragment of a comet or an asteroid that has entered the Earth's atmosphere.

Incorrect constellation analogy

Cygnus : The Dove is NOT a correct constellation analogy.

Organelle not in a plant cell

The centriole is not included in a plant cell.

Organelle for muscle contraction

The mitochondrion is needed in muscle contraction because it supplies the ATP that powers it.

Cell transport for swelling potato cells

Hypoosmosis (osmosis of water into the cells from a hypotonic solution) explains the swelling of potato cells submerged in distilled water.

Cell division stage

Telophase is the stage of cell division when chromosomes are beginning to uncoil and the cytoplasm is starting to divide.

Genetic variability stage

Prophase I of meiosis is mainly responsible for genetic variability.

Study Notes

Introduction to Statistical Modeling

  • Statistical modeling uses mathematical equations and probability distributions to represent complex systems.
  • Its models contain parameters estimated from data, which enables predictions and inferences.
  • It provides a framework for understanding uncertainty and informs data-driven decisions.

Introduction to Machine Learning

  • Machine learning involves algorithms that learn patterns from data without being explicitly programmed with task-specific rules.
  • Its primary focus is on prediction and automation.
  • It often utilizes complex models and large datasets.

Overlap Between Statistical Modeling and Machine Learning

  • Statistical modeling and machine learning are closely related, with overlapping techniques and applications.
  • Statistical modeling gives a theoretical basis for machine learning algorithms.
  • Machine learning provides tools for creating and assessing statistical models.

Examples of Statistical Modeling

  • Linear Regression models the relationship between dependent and independent variables, represented as $y = \beta_0 + \beta_1 x + \epsilon$.
  • Logistic Regression models the probability of a binary outcome, represented as $P(y=1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}$.
  • Time series analysis models data collected over time.
  • Analysis of variance (ANOVA) compares means across different groups.
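
As a minimal sketch of the first two models in the list above, the snippet below fits $y = \beta_0 + \beta_1 x$ by ordinary least squares and evaluates the logistic function. The dataset, the coefficient values, and the function names `fit_simple_linear` and `logistic_probability` are illustrative assumptions, not part of the lesson.

```python
import math

def fit_simple_linear(xs, ys):
    """Return (beta0, beta1) minimizing squared error for y = beta0 + beta1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    beta1 = sxy / sxx
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1

def logistic_probability(x, beta0, beta1):
    """P(y = 1) under the logistic regression model."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))

# Made-up data lying exactly on y = 1 + 2x, so the fit recovers those values.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
beta0, beta1 = fit_simple_linear(xs, ys)

# Assumed coefficients: when beta0 + beta1 * x equals 0, the probability is 0.5.
p_half = logistic_probability(0.0, 0.0, 1.5)
```

In practice the logistic coefficients are estimated from data (typically by maximum likelihood) rather than fixed by hand as they are here.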

Examples of Machine Learning

  • Classification assigns data points to categories.
  • Regression predicts a continuous outcome.
  • Clustering groups similar data points together.
  • Dimensionality reduction reduces the number of variables while retaining key information.

Types of Machine Learning

  • Supervised learning involves training a model on labeled data for predictions on new data.
  • Unsupervised learning involves discovering patterns in unlabeled data.
  • Reinforcement learning involves training an agent to maximize a reward in an environment.

Model Evaluation (Supervised Learning)

  • Training data trains the model.
  • Validation data is used to tune model hyperparameters and compare candidate models.
  • Test data assesses the final model performance.
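
One way to produce the three subsets is sketched below. The 60/20/20 proportions, the fixed seed, and the helper name `train_val_test_split` are assumptions chosen for illustration; the lesson does not prescribe them.

```python
import random

def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle a copy of rows and cut it into three disjoint parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Ten hypothetical records, identified here only by their index.
rows = list(range(10))
train, val, test = train_val_test_split(rows)
```

The test part is set aside and used once, after all tuning decisions have been made on the validation part.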

Common Evaluation Metrics

  • Regression metrics include Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared, with formulas:
    • $MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
    • $RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$
    • $R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$
  • Classification metrics include Accuracy, Precision, Recall, F1-score, and AUC-ROC.
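
The regression formulas above, and four of the classification metrics, can be sketched in plain Python as follows. The example values are made up, labels are assumed to be binary with 1 as the positive class, and returning 0.0 when a denominator is zero is a convention chosen here, not something the lesson specifies. AUC-ROC is omitted because it needs predicted scores rather than hard labels.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(mse(y_true, y_pred))

def r_squared(y_true, y_pred):
    """1 minus the ratio of residual to total sum of squares."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    accuracy = sum(1 for y, p in pairs if y == p) / len(pairs)
    # Assumed convention: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1

# Made-up regression targets and predictions.
reg_true = [3.0, 5.0, 7.0, 9.0]
reg_pred = [2.0, 5.0, 8.0, 9.0]

# Made-up binary labels and predictions.
cls_true = [1, 0, 1, 1, 0]
cls_pred = [1, 1, 1, 0, 0]
accuracy, precision, recall, f1 = classification_metrics(cls_true, cls_pred)
```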

Bias-Variance Tradeoff

  • Bias is error from overly simple or inaccurate assumptions, causing underfitting by missing real relationships between features and targets.
  • Variance is error from sensitivity to fluctuations in the training set, causing overfitting by modeling noise rather than the underlying pattern.
  • A trade-off exists between bias and variance: reducing one tends to increase the other, so the goal is a balance between accuracy (low bias) and consistency (low variance).
  • Complex models usually have low bias and high variance.
  • Simple models usually have high bias and low variance.
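
The two extremes can be illustrated with a toy comparison, sketched below under stated assumptions: the target $y = x^2$, the noise level, the sample sizes, and the choice of models (a constant mean predictor as the simple model, a 1-nearest-neighbor predictor as the flexible one) are all illustrative and not taken from the lesson.

```python
import random

def make_data(n, rng):
    """Noisy samples of y = x^2 on [0, 1] (an assumed toy target)."""
    xs = [rng.random() for _ in range(n)]
    ys = [x * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys

def mean_model(train_xs, train_ys):
    """Simple model: always predict the training mean (high bias, low variance)."""
    mean_y = sum(train_ys) / len(train_ys)
    return lambda x: mean_y

def nearest_neighbor_model(train_xs, train_ys):
    """Flexible model: copy the target of the closest training point."""
    def predict(x):
        i = min(range(len(train_xs)), key=lambda j: abs(train_xs[j] - x))
        return train_ys[i]
    return predict

def mse(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)
train_xs, train_ys = make_data(30, rng)
test_xs, test_ys = make_data(30, rng)

simple = mean_model(train_xs, train_ys)
flexible = nearest_neighbor_model(train_xs, train_ys)

simple_train = mse(simple, train_xs, train_ys)
simple_test = mse(simple, test_xs, test_ys)
# The flexible model memorizes the training set, so its training error is
# zero whenever all training x values are distinct.
flexible_train = mse(flexible, train_xs, train_ys)
flexible_test = mse(flexible, test_xs, test_ys)
```

Comparing each model's training error with its test error shows the pattern described above: the gap is small for the simple model and large for the flexible one, whose zero training error says little about performance on new data.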

Regularization Techniques

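  • Regularization reduces overfitting by adding a penalty on the size of the model's parameters to the loss function.
  • L1 regularization (LASSO) penalizes the sum of absolute coefficient values, and L2 regularization (ridge regression) penalizes the sum of squared coefficients.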
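
The effect of the two penalties can be seen in the simplest possible case, a single feature with no intercept, where both penalized fits have a closed form. This is a sketch under assumptions: the scaling of the loss (a factor of 0.5 on the squared error), the data, and the function names are chosen here for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def ridge_slope(xs, ys, lam):
    """Minimize 0.5 * sum((y - b*x)^2) + 0.5 * lam * b^2 over b."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def lasso_slope(xs, ys, lam):
    """Minimize 0.5 * sum((y - b*x)^2) + lam * |b| over b."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return soft_threshold(sxy, lam) / sxx

# Made-up data lying exactly on y = 2x.
xs = [-1.0, 0.0, 1.0]
ys = [-2.0, 0.0, 2.0]

unpenalized = ridge_slope(xs, ys, 0.0)   # 2.0, the least-squares slope
ridge = ridge_slope(xs, ys, 2.0)         # 1.0, shrunk toward zero
lasso_small = lasso_slope(xs, ys, 1.0)   # 1.5, shrunk toward zero
lasso_large = lasso_slope(xs, ys, 5.0)   # 0.0, coefficient removed entirely
```

The last line shows why L1 regularization is associated with variable selection: a large enough penalty sets a coefficient exactly to zero, whereas the L2 penalty only shrinks it.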

Model Selection

  • Model selection involves choosing the best model from a set of candidates.
  • It uses validation set performance or cross-validation.

Cross-Validation

  • Cross-validation evaluates model performance by training/testing on different data fold combinations.
  • K-fold cross-validation divides data into k folds, training on k-1 folds and testing on the remaining one. This is repeated k times.
  • Leave-one-out cross-validation is a special case where k equals the number of data points.
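
The procedure above can be sketched in plain Python and used to choose between two candidate models. The data, the candidates (`fit_mean` and `fit_line`), and the helper names are illustrative assumptions. The folds are contiguous blocks, so real data would normally be shuffled first.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, test_idx
        start += size

def cross_val_mse(fit, xs, ys, k):
    """Average test-fold MSE of the model returned by fit(train_xs, train_ys)."""
    fold_errors = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        errors = [(ys[i] - model(xs[i])) ** 2 for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k

def fit_mean(xs, ys):
    """Candidate 1: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Candidate 2: least-squares line y = beta0 + beta1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    beta1 = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    beta0 = y_mean - beta1 * x_mean
    return lambda x: beta0 + beta1 * x

# Made-up data: y = 2x + 1 with a small alternating perturbation.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 + (0.1 if i % 2 else -0.1) for i, x in enumerate(xs)]

scores = {
    "mean": cross_val_mse(fit_mean, xs, ys, k=5),
    "line": cross_val_mse(fit_line, xs, ys, k=5),
}
best = min(scores, key=scores.get)  # candidate with the lowest CV error
```

Setting `k` equal to `len(xs)` turns the same function into leave-one-out cross-validation.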

Statistical Modeling vs. Machine Learning

| Feature | Statistical Modeling | Machine Learning |
| --- | --- | --- |
| Goal | Inference: understanding relationships between variables and testing hypotheses. | Prediction: building accurate models that can generalize to new data. |
| Focus | Model interpretability and statistical significance. | Predictive accuracy and computational scalability. |
| Model complexity | Simpler, more interpretable models. | Complex, black-box models. |
| Data size | Smaller datasets, often collected through carefully designed experiments. | Large datasets, often collected from observational studies or generated automatically. |
| Examples | Linear regression, logistic regression, time series analysis, ANOVA. | Neural networks, support vector machines, decision trees, random forests. |
| Regularization | Can be important for variable selection and model improvement. | Highly important to avoid overfitting. |
| Cross-validation | Used for model assessment and tuning. | Is essential and considered a standard practice to prevent overfitting. |
| Inference | Formal inference and hypothesis testing, confidence intervals, p-values. | Interpretability is less of a focus, although methods like SHAP values can provide some insight. |
