Questions and Answers
Which of the following is the MOST accurate description of the Internet?
- A physical infrastructure composed of wires and routers managed by a single entity.
- A network of private, public, academic, business and government entities that work independently.
- A vast collection of websites and online applications owned by various corporations.
- A global system of interconnected computer networks utilizing standardized protocols. (correct)
Which of the following is NOT a primary function of a firewall?
- Filtering network packets.
- Providing network address translation (NAT).
- Monitoring network traffic.
- Directing traffic efficiently between networks. (correct)
A router operates at which layer of the OSI model?
- Network Layer (Layer 3) (correct)
- Transport Layer (Layer 4)
- Data Link Layer (Layer 2)
- Physical Layer (Layer 1)
Which of the following is the primary function of a router?
Which of the following accurately describes the role of a web browser?
Which of the following is the MOST accurate description of a search engine's function?
Which of the following HTML elements defines the main content of an HTML document?
Which of the following is the main difference between HTTP and HTTPS?
Given the IP address 192.168.1.100, to what does this numerical identifier correspond?
What is the primary function of DNS (Domain Name System)?
Which HTML tag is typically used to define the document type?
Why are HTML attributes used within HTML tags?
A CNAME record within DNS is used for what purpose?
What is the purpose of an AAAA record in DNS?
What is the primary function of HTML tags?
Which type of firewall inspects traffic and controls incoming and outgoing network traffic?
Which of the following is NOT typically considered a core component of a web browser's user interface?
Which of the following are examples of general search engines?
Which of the following are examples of browsers?
Which of the following are types of firewalls?
Flashcards
The Internet
A global system of interconnected computer networks that communicate using standardized protocols.
IP Address
A unique identifier assigned to each device connected to a network, enabling data routing.
Browser
A software application used to access, retrieve, and view content on the Internet.
Search Engine
HTML
HTTP and HTTPS
DNS (Domain Name System)
Firewall
Router
A Record (Address Record)
CNAME Record
AAAA Record
HTML Attributes
Study Notes
Bayesian vs. Frequentist Approaches
- Frequentist goal: Find the single best estimate for the parameter $\theta$.
- Bayesian goal: Determine a distribution $p(\theta)$ for the parameter.
- Frequentist $\theta$: Considered a fixed, unknown constant.
- Bayesian $\theta$: Considered a random variable.
- Frequentist prior knowledge: Included by choosing a good hypothesis class, regularization or early stopping.
- Bayesian prior knowledge: Explicitly encoded in a prior distribution $p(\theta)$.
- Frequentist uncertainty: Measured by confidence intervals and p-values.
- Bayesian uncertainty: Encoded directly by the posterior distribution $p(\theta|D)$.
- In Frequentist predictions, the "best" $\theta^*$ is plugged into the model.
- In Bayesian predictions, all possible values of $\theta$ are integrated over, weighted by their posterior probability: $p(x|D) = \int p(x|\theta) p(\theta|D) d\theta$.
- Frequentist methods have a risk of overfitting if predictions are directly optimized.
- Bayesian methods are, in theory, protected against overfitting when the integrals are computed exactly, although a poorly chosen prior or model can still give poor predictions.
- Frequentist approach computation cost: Usually cheaper.
- Bayesian approach computation cost: Can be very expensive, because both the posterior $p(\theta|D)$ (inference) and the predictive distribution $p(x|D)$ (prediction) must be computed.
- The Bayesian approach is useful when there's a good prior, data is scarce or uncertainty quantification is needed.
- The Bayesian approach may be unhelpful if there's no prior, or a bad one; also when data is abundant or only the "best" answer is needed.
Bayesian Linear Regression
- Model equation: $y = \theta^T x + \epsilon$, with $\epsilon \sim \mathcal{N}(0, \sigma^2)$ defining the error term.
Frequentist Approach to Linear Regression
- Find the "best" $\theta$ by minimizing the sum of squared errors.
- The equation is: $\theta^* = \arg \min_\theta \sum_{i=1}^N (y_i - \theta^T x_i)^2 = (X^T X)^{-1} X^T y$ (assuming $X^T X$ is invertible), where:
- $X$ is the matrix whose rows are the inputs $x_i^T$: $X = \begin{bmatrix} - & x_1^T & - \\ - & x_2^T & - \\ & \vdots & \\ - & x_N^T & - \end{bmatrix}$.
- $y$ is the vector of observed values $y_i$: $y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}$.
- A single point estimate $\theta^*$ results, which may be insufficient for quantifying uncertainty.
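As a minimal sketch of the closed-form solution, assume a single input feature and no intercept, so that $X^T X$ and $X^T y$ reduce to plain numbers. The function name `fit_least_squares_1d` and the sample data are illustrative assumptions, not part of the notes.

```python
def fit_least_squares_1d(xs, ys):
    """Return theta* = (X^T X)^{-1} X^T y for scalar inputs (no intercept)."""
    xtx = sum(x * x for x in xs)              # X^T X is a single number here
    xty = sum(x * y for x, y in zip(xs, ys))  # X^T y is a single number here
    return xty / xtx


# Hypothetical data generated exactly by y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
theta_star = fit_least_squares_1d(xs, ys)
print(theta_star)
```

The result is one number with no indication of how uncertain it is, which is the limitation the Bayesian treatment addresses.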
Bayesian Approach to Linear Regression
- This determines a distribution over $\theta$ rather than a single best estimate.
- Prior: Specify a prior distribution $p(\theta)$ to encode beliefs about $\theta$.
- Likelihood: Specify the likelihood $p(y|X, \theta)$, representing the probability of observing $y$ given $X$ and $\theta$.
- Posterior: Compute the posterior distribution $p(\theta|X, y)$ using Bayes’ rule.
- The equation is: $p(\theta|X, y) = \frac{p(y|X, \theta) p(\theta)}{p(y|X)} = \frac{p(y|X, \theta) p(\theta)}{\int p(y|X, \theta) p(\theta) d\theta}$.
- The posterior distribution encodes beliefs about $\theta$ after observing data.
Gaussian Prior
- For Bayesian Linear Regression, a Gaussian prior is commonly chosen.
- The equation is: $p(\theta) = \mathcal{N}(\mu_0, \Sigma_0)$.
- $\mu_0$ is the prior mean
- $\Sigma_0$ is the prior covariance.
- The prior encodes beliefs about plausible values of $\theta$ before seeing any data.
Computing the Posterior Distribution
- With a Gaussian prior and a Gaussian likelihood, the posterior is also Gaussian
- The equation is: $p(\theta|X, y) = \mathcal{N}(\mu_N, \Sigma_N)$.
- Where $\Sigma_N = (\Sigma_0^{-1} + \frac{1}{\sigma^2} X^T X)^{-1}$
- Where $\mu_N = \Sigma_N (\Sigma_0^{-1} \mu_0 + \frac{1}{\sigma^2} X^T y)$.
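The posterior update can be sketched for the simplest case, assuming a scalar $\theta$ so that $\Sigma_0$, $\Sigma_N$ and $X^T X$ are all plain numbers. The function name, the prior values and the data below are hypothetical choices for illustration only.

```python
def gaussian_posterior_1d(xs, ys, mu0, var0, noise_var):
    """Posterior N(mu_N, var_N) for scalar theta in y = theta * x + eps.

    var0 plays the role of Sigma_0 and noise_var the role of sigma^2.
    """
    xtx = sum(x * x for x in xs)
    xty = sum(x * y for x, y in zip(xs, ys))
    var_n = 1.0 / (1.0 / var0 + xtx / noise_var)   # Sigma_N
    mu_n = var_n * (mu0 / var0 + xty / noise_var)  # mu_N
    return mu_n, var_n


# Hypothetical data generated by y = 2x, with a N(0, 1) prior on theta.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
mu_n, var_n = gaussian_posterior_1d(xs, ys, mu0=0.0, var0=1.0, noise_var=1.0)
print(mu_n, var_n)
```

With this prior the posterior mean is pulled slightly from the least-squares value toward the prior mean of 0, and the posterior variance is smaller than the prior variance.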
Real-valued Function of n Variables
- This is a function whose domain is a subset of $\mathbb{R}^n$ and whose range is a subset of $\mathbb{R}$.
- For $n = 2$, such a function is written $f(x, y)$.
Examples of Real-valued Function of n Variables
- Area of a rectangle: $A(l, w) = lw$
- Volume of a cylinder: $V(r, h) = \pi r^2 h$
- Ideal Gas Law: $P(n, V, T) = \frac{nRT}{V}$, where $R$ is the ideal gas constant.
Level Sets (Contour Plots)
- Level sets of a function $f(x, y)$ are curves defined by $f(x, y) = c$, where $c$ is a constant in the range of $f$.
- These curves are also called contours or level curves.
Graph of a Function of Two Variables
- This is the set of all points $(x, y, z)$ in $\mathbb{R}^3$ such that $z = f(x, y)$, forming a surface in three-dimensional space.
Linear functions
- A linear function of two variables has the form $f(x, y) = ax + by + c$, where $a$, $b$, and $c$ are constants.
- The graph of a linear function is a plane.
Limits and Continuity
- Let $f(x, y)$ be a function defined on a disk around $(x_0, y_0)$, except possibly at $(x_0, y_0)$ itself.
- The limit of $f(x, y)$ as $(x, y)$ approaches $(x_0, y_0)$ is $L$, written $\lim_{(x,y) \to (x_0, y_0)} f(x, y) = L$, if for every $\epsilon > 0$ there exists a $\delta > 0$ such that $0 < \sqrt{(x - x_0)^2 + (y - y_0)^2} < \delta$ implies $|f(x, y) - L| < \epsilon$.
Definition of Continuity
- A function $f(x, y)$ is continuous at $(x_0, y_0)$ if all of the following hold:
- $f(x_0, y_0)$ is defined.
- $\lim_{(x,y) \to (x_0, y_0)} f(x, y)$ exists.
- $\lim_{(x,y) \to (x_0, y_0)} f(x, y) = f(x_0, y_0)$.
- $f(x, y)$ is continuous on a set $S$ if it is continuous at every point in $S$.
Partial Derivatives Definition
- The partial derivative of $f$ with respect to $x$ at $(x_0, y_0)$ is defined as $\frac{\partial f}{\partial x}(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0 + h, y_0) - f(x_0, y_0)}{h}$.
- The partial derivative of $f$ with respect to $y$ at $(x_0, y_0)$ is defined as $\frac{\partial f}{\partial y}(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0, y_0 + h) - f(x_0, y_0)}{h}$.
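The limit definitions can be approximated numerically by using a small but non-zero $h$. The sketch below is an illustration under that assumption; the helper names and the example function $f(x, y) = x^2 y + y^3$ are not from the notes.

```python
def partial_x(f, x0, y0, h=1e-6):
    """Approximate df/dx at (x0, y0) with a forward difference."""
    return (f(x0 + h, y0) - f(x0, y0)) / h


def partial_y(f, x0, y0, h=1e-6):
    """Approximate df/dy at (x0, y0) with a forward difference."""
    return (f(x0, y0 + h) - f(x0, y0)) / h


def f(x, y):
    # Example function; exact partials are f_x = 2xy and f_y = x^2 + 3y^2.
    return x**2 * y + y**3


fx = partial_x(f, 1.0, 2.0)  # exact value is 4
fy = partial_y(f, 1.0, 2.0)  # exact value is 13
print(fx, fy)
```

The printed values differ slightly from 4 and 13 because $h$ is finite; shrinking $h$ reduces that error until floating-point rounding starts to dominate.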
Partial Derivatives Notation
- $\frac{\partial f}{\partial x} = f_x$
- $\frac{\partial f}{\partial y} = f_y$
Higher Order Partial Derivatives
- $f_{xx} = \frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x} \left( \frac{\partial f}{\partial x} \right)$
- $f_{yy} = \frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y} \left( \frac{\partial f}{\partial y} \right)$
- $f_{xy} = \frac{\partial^2 f}{\partial y \partial x} = \frac{\partial}{\partial y} \left( \frac{\partial f}{\partial x} \right)$
- $f_{yx} = \frac{\partial^2 f}{\partial x \partial y} = \frac{\partial}{\partial x} \left( \frac{\partial f}{\partial y} \right)$
Clairaut's Theorem
- If $f_{xy}$ and $f_{yx}$ are continuous on an open set containing $(x_0, y_0)$, then $f_{xy}(x_0, y_0) = f_{yx}(x_0, y_0)$.
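A short worked check of the theorem, using the illustrative function $f(x, y) = x^2 y^3$ (chosen here as an example, not taken from the notes):

```latex
\begin{align*}
f_x &= 2xy^3, & f_{xy} &= \frac{\partial}{\partial y}\left(2xy^3\right) = 6xy^2, \\
f_y &= 3x^2y^2, & f_{yx} &= \frac{\partial}{\partial x}\left(3x^2y^2\right) = 6xy^2.
\end{align*}
```

Both mixed partials are polynomials, hence continuous everywhere, and they agree at every point as the theorem predicts.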
Vectors in the Plane and in Space
- Definition: The vector space over the field of real numbers $\mathbb{R}$ is $\mathbb{R}^n = \{(x_1, x_2, \ldots, x_n) \mid x_i \in \mathbb{R}\}$.
- Vector addition: $\bar{x} + \bar{y} = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)$
- Scalar multiplication: $\lambda\bar{x} = (\lambda x_1, \lambda x_2, \ldots, \lambda x_n)$
- Euclidean norm: $\|\bar{x}\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$
- Remark: $\|\bar{x}\| \geq 0$ for all $\bar{x} \in \mathbb{R}^n$, and $\|\bar{x}\| = 0$ only when $\bar{x} = \bar{0}$. In addition, $\|\lambda\bar{x}\| = |\lambda| \|\bar{x}\|$.
Dot Product
- Definition: Let $\bar{x}, \bar{y} \in \mathbb{R}^n$. Then $\bar{x} \cdot \bar{y} = x_1y_1 + x_2y_2 + \cdots + x_ny_n = \sum_{i=1}^{n} x_iy_i$
- Remark:
- The dot product is a scalar.
- $\bar{x} \cdot \bar{x} = \|\bar{x}\|^2$
- Theorem: Let $\bar{x}, \bar{y} \in \mathbb{R}^n$. Then $\bar{x} \cdot \bar{y} = \|\bar{x}\| \cdot \|\bar{y}\| \cos(\theta)$, where $\theta$ is the angle between $\bar{x}$ and $\bar{y}$.
- Corollary (Cauchy-Schwarz inequality): $|\bar{x} \cdot \bar{y}| \leq \|\bar{x}\| \cdot \|\bar{y}\|$
- Definition: $\bar{x}$ and $\bar{y}$ are orthogonal if $\bar{x} \cdot \bar{y} = 0$
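The dot product, the Euclidean norm, the orthogonality test and the Cauchy-Schwarz inequality can be checked on concrete vectors. This is a small sketch; the helper names `dot` and `norm` and the two example vectors are assumptions made for illustration.

```python
import math


def dot(x, y):
    """Dot product: sum of component-wise products."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Euclidean norm, using the identity x . x = ||x||^2."""
    return math.sqrt(dot(x, x))


x = (3.0, 4.0)
y = (4.0, -3.0)

print(norm(x))                              # length of x
print(dot(x, y) == 0)                       # orthogonality test
print(abs(dot(x, y)) <= norm(x) * norm(y))  # Cauchy-Schwarz inequality
```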
Linear Dependence and Independence of Vectors
- Definition: Vectors $\bar{x_1}, \bar{x_2}, \ldots, \bar{x_k} \in \mathbb{R}^n$ are linearly dependent if there exist scalars $\lambda_1, \lambda_2, \ldots, \lambda_k \in \mathbb{R}$, not all zero, such that $\lambda_1\bar{x_1} + \lambda_2\bar{x_2} + \cdots + \lambda_k\bar{x_k} = \bar{0}$. If the vectors are not linearly dependent, they are linearly independent.
- Remark: Vectors $\bar{x_1}, \bar{x_2}, \ldots, \bar{x_k} \in \mathbb{R}^n$ are linearly independent if $\lambda_1\bar{x_1} + \lambda_2\bar{x_2} + \cdots + \lambda_k\bar{x_k} = \bar{0}$ holds only when $\lambda_1 = \lambda_2 = \cdots = \lambda_k = 0$.
- Theorem: If vectors $\bar{x_1}, \bar{x_2}, \ldots, \bar{x_k} \in \mathbb{R}^n$ are linearly independent, then $k \leq n$.
Spanning Sets of Vectors
- Definition: A set of vectors $\{\bar{x_1}, \bar{x_2}, \ldots, \bar{x_k}\}$ in a vector space $V$ spans $V$ if every vector $\bar{x} \in V$ can be written as a linear combination $\bar{x} = \lambda_1\bar{x_1} + \lambda_2\bar{x_2} + \cdots + \lambda_k\bar{x_k}$, where $\lambda_1, \lambda_2, \ldots, \lambda_k \in \mathbb{R}$.
- Definition: A basis of a vector space $V$ is a set of linearly independent vectors that spans $V$.
- Remark: A vector space $V$ has several bases, but every basis contains the same number of vectors. This number is the dimension of $V$, $\dim(V)$.
- Example: The standard basis of $\mathbb{R}^n$ is $\{\bar{e_1}, \bar{e_2}, \ldots, \bar{e_n}\}$, where $\bar{e_i} = (0, 0, \ldots, 1, \ldots, 0)$ has its 1 in position $i$. Hence $\dim(\mathbb{R}^n) = n$.
Motivation for Projections
- Displaying a 3D world on a 2D screen requires a projection.
Taxonomy of Projections
- Orthographic Projection: Parallel lines remain parallel. Good for technical drawings, as it preserves lengths and angles within planes parallel to the projection plane. Not realistic.
- Perspective Projection: Parallel lines converge (mimics the human eye/camera). Gives a more realistic view.
Orthographic Projection Definition
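- A form of "parallel projection": all projectors are parallel, and the direction of projection is perpendicular to the projection plane.
- Project along a direction (DOP: direction of projection) onto the projection plane.
- A common example takes the DOP along the z-axis and the projection plane $z = 0$, so that $(x, y, z) \rightarrow (x, y)$.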
Orthographic Projection Variants
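- Multi-view orthographic projection: Projection plane is parallel to one of the principal planes (xy, yz, zx).
- Axonometric projection: Projection plane is not parallel to any of the principal planes, while the projectors stay perpendicular to it.
- Oblique projection (parallel, but not orthographic): The direction of projection is not perpendicular to the projection plane.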
Perspective Projection Definition
- Project points along lines through the center of projection (COP).
- Projection plane is where the image forms.
- Also called a "central projection."
Perspective Projection Math
- If the COP is at the origin and the projection plane is $z = d$, a point $(x, y, z)$ projects to $(x', y', d)$.
- The points $(x, y, z)$, $(0, 0, 0)$, and $(x', y', d)$ are collinear.
- Hence $\begin{bmatrix} x' \\ y' \\ d \end{bmatrix} = \lambda \begin{bmatrix} x \\ y \\ z \end{bmatrix}$ for some scalar $\lambda$, so $\frac{x'}{x} = \frac{y'}{y} = \frac{d}{z}$.
- Therefore, $(x', y') = (\frac{dx}{z}, \frac{dy}{z})$.
- Note that z cannot be zero.
Homogeneous Coordinates for Perspective Projection
- Perspective projection becomes a linear operation in homogeneous coordinates.
- $(x, y, z, 1) \rightarrow (\frac{dx}{z}, \frac{dy}{z}, d, 1) \sim (dx, dy, dz, z)$.
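- $\begin{bmatrix} x' \\ y' \\ z' \\ w' \end{bmatrix} = \begin{bmatrix} d & 0 & 0 & 0 \\ 0 & d & 0 & 0 \\ 0 & 0 & d & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$
- Dividing by $w' = z$ gives the projected point: $\left(\frac{x'}{w'}, \frac{y'}{w'}, \frac{z'}{w'}\right) = \left(\frac{dx}{z}, \frac{dy}{z}, d\right)$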
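The matrix form can be sketched in a few lines: multiply the homogeneous point by the 4x4 matrix, then divide by the last component. The function name `perspective_project` and the sample point are illustrative assumptions; the COP is taken to be the origin and the projection plane $z = d$, as in the notes.

```python
def perspective_project(point, d):
    """Project (x, y, z) onto the plane z = d with the COP at the origin."""
    x, y, z = point
    if z == 0:
        raise ValueError("z must be non-zero for a perspective projection")
    # The 4x4 projection matrix from the notes, written row by row.
    matrix = [
        [d, 0, 0, 0],
        [0, d, 0, 0],
        [0, 0, d, 0],
        [0, 0, 1, 0],
    ]
    homogeneous = [x, y, z, 1]
    xh, yh, zh, w = (
        sum(m * v for m, v in zip(row, homogeneous)) for row in matrix
    )
    # Perspective divide by w, which equals z here.
    return (xh / w, yh / w, zh / w)


projected = perspective_project((2.0, 4.0, 8.0), d=2.0)
print(projected)
```

The third component of the result always equals `d`, confirming that the projected point lies on the projection plane.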
Field of View
- The amount of the scene that is visible on the image plane.
- Determined by the camera's lens and the size of the image sensor.
Vertical Field of View Formula
- Vertical field of view: $\theta = 2\arctan\left(\frac{h}{2d}\right)$, where $h$ is the height of the image sensor and $d$ is the distance from the lens to the image sensor.
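A quick numeric sketch of the field-of-view formula; the function name and the sensor dimensions below are hypothetical values picked so the result is easy to verify by hand.

```python
import math


def vertical_fov_degrees(sensor_height, distance):
    """Vertical field of view, theta = 2 * arctan(h / (2d)), in degrees."""
    return math.degrees(2 * math.atan(sensor_height / (2 * distance)))


# Hypothetical sensor: h = 24 and d = 12 in the same length unit, so h / (2d) = 1.
fov = vertical_fov_degrees(sensor_height=24.0, distance=12.0)
print(fov)
```

Increasing `distance` while keeping `sensor_height` fixed narrows the field of view, which matches the behavior of a longer lens.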
Vanishing Points Definition
- The point where parallel lines appear to converge in a perspective projection.
- Each set of parallel lines has its own vanishing point.
- The vanishing point is the projection of the point at infinity in the direction of the lines.
Number of Vanishing Points
- 1-point perspective: one vanishing point
- 2-point perspective: two vanishing points
- 3-point perspective: three vanishing points.
Example
- In an image drawn in 3-point perspective, there are three vanishing points, each corresponding to one of the three principal axes.
Descriptive Statistics
- Descriptive statistics is concerned with describing and summarizing data sets.
- It provides methods for organizing, presenting, and analyzing data so that they can be understood and communicated effectively.
Methods of Descriptive Statistics
- Frequency tables: Summarize how often each value appears in a data set.
- Graphs: Represent the data visually, making patterns and trends easier to identify.
- Measures of central tendency: Indicate the typical value of a data set (mean, median, mode).
- Measures of dispersion: Measure the variability of the data (range, standard deviation, variance).
Types of Data
- Qualitative data: Describe non-numerical qualities or characteristics (color, gender, opinion).
- Quantitative data: Represent numerical quantities (age, height, weight). They are either discrete or continuous:
- Discrete: Take integer values (number of children, number of cars).
- Continuous: Can take any value within a range (temperature, height).
Frequency Tables
- A frequency table shows the distribution of the data by listing each value and its frequency (the number of times that value appears).

| Value | Frequency |
| --- | --- |
| A | 10 |
| B | 15 |
| C | 8 |
Graphs
- Graphs are visual representations of data that make the data easier to understand and analyze.
Common Types of Graphs
- Histogram: Shows the distribution of continuous data.
- Bar chart: Compares the frequencies of different categories.
- Pie chart: Shows the proportion of each category relative to the total.
- Scatter plot: Shows the relationship between two variables.
Measures of Central Tendency
- Measures of central tendency indicate the typical value of a data set.
Mean
- The mean ($\mu$) is the arithmetic average of the values.
- $\mu = \frac{\sum_{i=1}^{n} x_i}{n}$.
Median
- The median is the middle value when the data are sorted; with an even number of values, it is the average of the two middle values.
Mode
- The mode is the value that appears most frequently.
Measures of Dispersion
- Measures of dispersion indicate the variability of the data.
Range
- The range is the difference between the maximum value and the minimum value.
Variance
- The variance ($\sigma^2$) is the average squared deviation of the data from the mean.
- $\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}$.
Standard Deviation
- The standard deviation ($\sigma$) is the square root of the variance.
- $\sigma = \sqrt{\sigma^2}$.
Complete Example
- Consider the following data set: 1, 2, 2, 3, 4, 5.
Frequency Table

| Value | Frequency |
| --- | --- |
| 1 | 1 |
| 2 | 2 |
| 3 | 1 |
| 4 | 1 |
| 5 | 1 |

Measures of Central Tendency
- Mean: $\mu = \frac{1+2+2+3+4+5}{6} \approx 2.83$.
- Median: 2.5 (average of 2 and 3).
- Mode: 2.
Measures of Dispersion
- Range: 5 - 1 = 4.
- Variance: $\sigma^2 = \frac{(1-2.83)^2 + 2(2-2.83)^2 + (3-2.83)^2 + (4-2.83)^2 + (5-2.83)^2}{6} \approx 1.81$.
- Standard deviation: $\sigma = \sqrt{\sigma^2} \approx 1.34$ (computed from the unrounded variance).
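The hand calculations for this data set can be cross-checked with Python's standard `statistics` module. This is only a sketch; it assumes the population formulas (division by $n$) given in the definitions, which correspond to `pvariance` and `pstdev` rather than the sample versions.

```python
import statistics

data = [1, 2, 2, 3, 4, 5]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
data_range = max(data) - min(data)
variance = statistics.pvariance(data)  # population variance, divides by n
std_dev = statistics.pstdev(data)      # square root of the population variance

print(round(mean, 2), median, mode, data_range)
print(round(variance, 2), round(std_dev, 2))
```

Using `statistics.variance` and `statistics.stdev` instead would divide by $n - 1$ and give slightly larger values.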
Conclusion on Descriptive Statistics
- Descriptive statistics is a fundamental tool for summarizing and understanding data sets.
- It provides methods for organizing, presenting, and analyzing data effectively, which supports decision-making and the communication of results.
Statistical Methods Learning Objectives
- Define and explain the terms Statistics and Biostatistics.
- Describe the uses of Statistics in Biology and Medicine.
- Define data and describe different types of data.
- Explain the methods of data collection.
- Explain data processing, presentation, and interpretation.
- Calculate and interpret common measures of central tendency and dispersion.
- Calculate and interpret probabilities and use them to make predictions.
- Describe different types of statistical tests and when to use them.
- Use computer software to perform basic statistical analyses.
- Interpret the results of statistical analyses and draw conclusions.
Statistical Methods Topics
- Introduction
- Types of Data
- Data Collection Methods
- Data Processing and Presentation
- Descriptive Statistics
- Probability
- Statistical Inference
- Statistical Tests
- Statistical Software
Recommended Books
- Wayne W. Daniel, Biostatistics: A Foundation for Analysis in the Health Sciences.
- Remington C.L. and Schork M.A., Statistics with Application to Biological and Health Sciences.
- Zar J.H., Biostatistical Analysis.
- Norman G.R. and Streiner D.L., Biostatistics: The Bare Essentials.
Definition of Regular Expressions
- Formal language to describe string patterns.
Uses of Regular Expressions
- Text search and replacement
- Compiler construction
- Data validation
Parts of Regular expressions
- Literals: `a`, `b`, `1`, `2`, etc.
- Meta-characters: `.`, `*`, `+`, `?`, `[]`, `()`, etc.
- Character classes: `[a-z]`, `[0-9]`, `\w`, `\d`, etc.
- Quantifiers: `*`, `+`, `?`, `{n}`, `{n,}`, `{n,m}`, etc.
- Anchors: `^`, `$`.
- Grouping: `()`.
- Alternatives: `|`.
Meta-Characters in Detail
- `.`: Any character except newline. Example: `a.b` matches `aab`, `axb`.
- `*`: 0 or more occurrences of the preceding character. Example: `a*b` matches `b`, `ab`, `aab`, `aaab`.
- `+`: 1 or more occurrences of the preceding character. Example: `a+b` matches `ab`, `aab`, `aaab` but not `b`.
- `?`: 0 or 1 occurrence of the preceding character. Example: `a?b` matches `b`, `ab` but not `aab`.
- `[]`: Character class; one of the characters inside the brackets. Example: `[abc]` matches `a`, `b`, or `c`.
- `()`: Grouping; combines part of an expression. Example: `(ab)+` matches `ab`, `abab`, `ababab`.
- `|`: Alternative; either the expression on the left or the expression on the right. Example: `a|b` matches `a` or `b`.
- `^`: Anchor; the expression must be at the beginning of the line/string. Example: `^a` matches `abc` but not `bac`.
- `$`: Anchor; the expression must be at the end of the string/line. Example: `a$` matches `ba` but not `abc`.
- `\`: Escape character; removes the special meaning of meta-characters. Example: `\.` matches `.` (period character).
Character Classes in Detail
- `\d`: Digit (0-9).
- `\D`: Non-digit.
- `\w`: Word character (letters, digits, underscore).
- `\W`: Non-word character.
- `\s`: Whitespace (space, tab, newline).
- `\S`: Non-whitespace.
- `[a-z]`: All lowercase letters.
- `[A-Z]`: All uppercase letters.
- `[0-9]`: All digits.
Quantifiers in Detail
- `{n}`: Exactly n times.
- `{n,}`: At least n times.
- `{n,m}`: Between n and m times (inclusive).
- `*?`: Zero or more, as few as possible (lazy).
- `+?`: One or more, as few as possible (lazy).
- `??`: Zero or one, as few as possible (lazy).
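The difference between a greedy and a lazy quantifier is easiest to see side by side. The sketch below uses Python's `re` module on a made-up input string.

```python
import re

text = "<a><b>"

# Greedy: '+' consumes as much as possible, so one match spans both tags.
greedy = re.findall(r"<.+>", text)

# Lazy: '+?' consumes as little as possible, so each tag is matched separately.
lazy = re.findall(r"<.+?>", text)

print(greedy)
print(lazy)
```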
Example Regular Expressions
- E-mail address: `[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}`
- Date (dd.mm.yyyy): `(0[1-9]|[12][0-9]|3[01])\.(0[1-9]|1[012])\.(19|20)\d\d`
- HTML tag: `<[^>]+>`
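As a hedged illustration of how such a pattern behaves, the sketch below applies the date pattern listed above with `re.fullmatch`. The helper name `looks_like_date` is an assumption, and the last call shows that the pattern checks only the format, not whether the calendar date exists.

```python
import re

# The dd.mm.yyyy pattern from the examples above.
DATE_PATTERN = re.compile(r"(0[1-9]|[12][0-9]|3[01])\.(0[1-9]|1[012])\.(19|20)\d\d")


def looks_like_date(candidate):
    """Return True if the whole string has the dd.mm.yyyy shape."""
    return DATE_PATTERN.fullmatch(candidate) is not None


print(looks_like_date("31.12.1999"))  # well-formed date
print(looks_like_date("32.01.2020"))  # day 32 is rejected by the pattern
print(looks_like_date("31.02.2020"))  # accepted, although 31 February does not exist
```

For real validation, a regex like this is usually followed by a proper date parse, for example with `datetime.strptime`.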
Use of Regular Expressions in Python
- Use the `re` module in Python.
import re

# Search for a pattern in a string
pattern = r"Hello World"  # raw string, to avoid escaping backslashes
text = "Hello World, this is a test."
result = re.search(pattern, text)
if result:
    print("Pattern found")
    print(result.group(0))  # Prints the matched text
else:
    print("Pattern not found")

# Replace text
new_text = re.sub(pattern, "New World", text)
print(new_text)

# Split a string
parts = re.split(r"\s", text)  # Splits at each whitespace character
print(parts)

# Find all occurrences of a pattern
pattern = r"\d+"  # Matches one or more digits
text = "There are 12 apples and 3 pears."
results = re.findall(pattern, text)
print(results)  # Prints a list of the digit sequences as strings

# Compile a regular expression for repeated use
pattern = re.compile(r"[a-z]+")
results = pattern.findall(text)
print(results)  # Prints a list of the lowercase letter runs
Python Regex Key points and cautions
- Regular expressions can be complex and hard to read.
- Test regex to ensure desired behavior.
- For complex tasks, consider specialized libraries.
- Be aware of performance, as complex regex can be slow.
Basic Random Experiment Definition
- An experiment whose outcome cannot be predicted with certainty.
Sample space definition
- The set of all possible outcomes of a random experiment.
Sample Space Example: Tossing a Coin
- Experiment: Toss a coin.
- Sample Space: $S = \{H, T\}$
Sample Space Example: Tossing a Die
- Experiment: Toss a die.
- Sample Space: $S = \{1, 2, 3, 4, 5, 6\}$
Sample Space Example: Tossing a Coin Twice
- Experiment: Toss a coin twice.
- Sample Space: $S = \{HH, HT, TH, TT\}$
Event Defined
- A subset of the sample space.
Event Examples
- Experiment: Toss a die.
- Sample Space: $S = \{1, 2, 3, 4, 5, 6\}$
- Event E: Observe an even number. $E = \{2, 4, 6\}$
- Experiment: Toss a coin twice.
- Sample Space: $S = \{HH, HT, TH, TT\}$
- Event E: Observe at least one head. $E = \{HH, HT, TH\}$
Elementary Event Defined
- An event consisting of only one outcome.
Elementary Event Example
- Experiment: Toss a die.
- Sample Space: $S = \{1, 2, 3, 4, 5, 6\}$
- Elementary Events: $\{1\}, \{2\}, \{3\}, \{4\}, \{5\}, \{6\}$
Ways of Assigning Probability
- Classical Approach
- Relative Frequency Approach
- Subjective Probability
Classical Approach (Equally Likely Outcomes)
- If there are $n$ equally likely outcomes, then the probability of each outcome is $\frac{1}{n}$.
- If an event $E$ contains $k$ outcomes, then: $P(E) = \frac{\text{Number of outcomes in E}}{\text{Total number of outcomes in S}} = \frac{k}{n}$
Classical Probability Approach Example
- Experiment: Toss a die.
- Sample Space: $S = \{1, 2, 3, 4, 5, 6\}$
- Event E: Observe an even number. $E = \{2, 4, 6\}$, so $P(E) = \frac{3}{6} = \frac{1}{2}$
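The classical computation amounts to counting outcomes. A minimal sketch using Python sets and exact fractions (the variable names are illustrative):

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
# Event E: the outcome is an even number.
event = {outcome for outcome in sample_space if outcome % 2 == 0}

# P(E) = k / n for equally likely outcomes.
probability = Fraction(len(event), len(sample_space))
print(probability)
```

`Fraction` reduces the result automatically, so the ratio of counts is reported in lowest terms.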
Relative Frequency Approach Probability Equation
$P(E) = \lim_{n \to \infty} \frac{\text{Number of times E occurs}}{n}$
- As the number of trials ($n$) increases, the relative frequency of the event $E$ approaches the true probability of $E$.
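A small simulation can illustrate the relative frequency idea for the event "a die shows an even number". This is a sketch only: the function name, the fixed seed and the trial counts are arbitrary assumptions, and a finite simulation can only approximate the limit.

```python
import random


def relative_frequency_even(trials, seed=0):
    """Fraction of simulated die tosses that show an even number."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) % 2 == 0)
    return hits / trials


for n in (10, 1_000, 100_000):
    print(n, relative_frequency_even(n))
```

With more trials the printed frequency tends to settle near the classical value of 1/2, although any single run can deviate.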
Subjective Approach Probability
- The subjective approach is based solely on personal judgment, experience, or belief.
Axioms of Probability
- For any event $E$, $0 \leq P(E) \leq 1$.
- $P(S) = 1$ (where $S$ is the sample space).
- If $E_1, E_2, E_3, \ldots$ are mutually exclusive events, then $P(E_1 \cup E_2 \cup E_3 \cup \cdots) = \sum_{i=1}^{\infty} P(E_i)$
Basic Probability Rules
- Complement Rule: $P(E^c) = 1 - P(E)$
- $P(\emptyset) = 0$ (where $\emptyset$ is the empty set).
- Addition Rule: $P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 \cap E_2)$
- If $E_1$ and $E_2$ are mutually exclusive, then $P(E_1 \cap E_2) = 0$, and $P(E_1 \cup E_2) = P(E_1) + P(E_2)$
Conditional Probability
- The probability of event $A$ occurring given that event $B$ has already occurred is denoted by $P(A|B)$.
- This is defined as $P(A|B) = \frac{P(A \cap B)}{P(B)}$, provided $P(B) > 0$
- Similarly, $P(B|A) = \frac{P(B \cap A)}{P(A)}$, provided $P(A) > 0$
Independence
- Two events $A$ and $B$ are independent if the occurrence of one does not affect the probability of the other.
- $A$ and $B$ are independent if: $P(A|B) = P(A)$ or $P(B|A) = P(B)$ or $P(A \cap B) = P(A)P(B)$
- If $A$ and $B$ are independent, then $A^c$ and $B^c$ are also independent.
Bayes' Theorem
- Let $A_1, A_2,..., A_n$ be a set of mutually exclusive and exhaustive events.
- For any event $B$ with $P(B) > 0$, $P(A_i|B) = \frac{P(B|A_i)P(A_i)}{\sum_{j=1}^{n} P(B|A_j)P(A_j)}$
- This equation is useful for updating probabilities based on new evidence or information.
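The formula can be sketched as a short function that takes the priors $P(A_i)$ and the likelihoods $P(B|A_i)$ and returns the posteriors $P(A_i|B)$. The function name and the numbers below are hypothetical, chosen only to show the mechanics.

```python
def bayes_posterior(priors, likelihoods):
    """Return P(A_i | B) for each i, given P(A_i) and P(B | A_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # denominator: sum_j P(B | A_j) P(A_j)
    return [j / evidence for j in joint]


# Hypothetical two-event partition: A_1 is rare, and B is far more likely under A_1.
priors = [0.01, 0.99]
likelihoods = [0.95, 0.05]
posterior = bayes_posterior(priors, likelihoods)
print(posterior)
```

Even though $B$ is much more likely under $A_1$, the small prior keeps $P(A_1|B)$ well below one half in this made-up example.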