Data Wrangling with Pandas and Python

Questions and Answers

What does the term 'andr' refer to in the context of andrology?

  • Specialist
  • Reproduction
  • Female
  • Male (correct)

What is the primary focus of an andrologist?

  • Treating urinary infections
  • Managing sexually transmitted diseases
  • Treating conditions affecting male fertility and sexuality (correct)
  • Performing surgeries on the kidneys

What is the meaning of the suffix '-itis'?

  • Inflammation (correct)
  • Tumor
  • Surgical removal
  • Abnormal condition

What does the term 'balan' refer to?

  • Glans penis (correct)

What is phimosis?

  • Narrowing of the foreskin opening (correct)

What is another term for erectile dysfunction?

  • Impotence (correct)

What is male hypogonadism also known as?

  • Testosterone deficiency (correct)

What is Peyronie's disease characterized by?

  • Penile curvature (correct)

What does the prefix 'oligo/o' mean in the context of sperm count?

  • Few (correct)

What is azoospermia?

  • Absence of sperm (correct)

What does the term 'crypt' mean in cryptorchidism?

  • Hidden (correct)

What is cryptorchidism also known as?

  • Undescended testicle (correct)

What does the term 'hydro' in hydrocele relate to?

  • Water (correct)

What does androgen suppression therapy aim to do?

  • Reduce levels of androgen (correct)

Flashcards

Andrologist

A doctor focusing on the treatment of conditions affecting male fertility and sexuality.

Urologist

A physician who specializes in diagnosing and treating diseases and disorders of the genitourinary system of males and the urinary system of females.

Genitourinary

Relating to both the genital and urinary organs.

Venereologist

A doctor who specializes in the treatment of sexually transmitted diseases and infections.

Balanitis

An inflammation of the glans penis that is usually caused by poor hygiene in men who have not had the foreskin removed by circumcision.

Phimosis

Narrowing of the opening of the foreskin so it cannot be retracted (pulled back) to expose the glans penis.

Erectile dysfunction (ED)

The inability of the male to achieve or maintain a penile erection.

Andropause

A condition marked by a decrease in the male hormone testosterone.

Cryptorchidism

A developmental defect seen in newborns in which one or both of the testicles have failed to descend into their normal position in the scrotum.

Epididymitis

A painful inflammation of the epididymis resulting from a bacterial infection.

Peyronie's disease

A form of sexual dysfunction in which the penis is bent or curved during erection.

Priapism

A painful and persistent erection that lasts four hours or more, but is either not caused by sexual excitement or does not go away after sexual stimulation has ended.

Circumcision

The surgical removal of the foreskin of the penis.

Orchiectomy

The surgical removal of one or both testicles.

Orchiopexy

The repair of cryptorchidism, which is an undescended testicle.

Study Notes

Data Wrangling with Python

  • This lab covers reading data, handling missing data, formatting data and data transformation techniques in Python.
  • The libraries needed are pandas and numpy.

Data Acquisition

  • Pandas facilitates reading data from CSV files, Excel Files, and JSON files.
  • To read CSV files:
    import pandas as pd
    df = pd.read_csv('example.csv')
    print(df.head())
    
  • To read Excel files:
    df = pd.read_excel('example.xlsx')
    print(df.head())
    
  • To read JSON files:
    df = pd.read_json('example.json')
    print(df.head())
    

Data Cleaning

  • Missing values can be replaced with a specific number: df.fillna(0, inplace=True) replaces all NA values with 0.
  • Missing values in a column can be replaced with its mean, median, or mode using df['column_name'] = df['column_name'].fillna(df['column_name'].mean()). Assigning the result back is more reliable than calling fillna(..., inplace=True) on a selected column, which may act on a copy in recent pandas versions.
  • Rows with missing values can be dropped entirely with df.dropna(inplace=True).
  • Columns are renamed with df.rename(columns={'old_name': 'new_name'}, inplace=True).
  • Data types can be changed with df['column_name'] = df['column_name'].astype('int'). This raises an error if the column still contains missing values, so handle those first.
  • Duplicate rows are removed with df.drop_duplicates(inplace=True).
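
The cleaning steps above can be combined on a small in-memory table. The following is a minimal sketch, not a definitive recipe: the column names (`name`, `age`, `city`) and all values are invented for illustration, and results are assigned back rather than modified with `inplace=True`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names and values are invented for illustration.
raw = pd.DataFrame({
    "name": ["Ana", "Ben", "Ben", "Cleo", "Dev"],
    "age": [31.0, np.nan, np.nan, 45.0, 28.0],
    "city": ["Lima", "Oslo", "Oslo", None, "Pune"],
})

# 1. Remove exact duplicate rows (the two "Ben" rows are identical).
df = raw.drop_duplicates()

# 2. Replace missing ages with the column mean, assigning the result back.
df = df.assign(age=df["age"].fillna(df["age"].mean()))

# 3. Drop any rows that still contain missing values (Cleo has no city).
df = df.dropna()

# 4. Rename a column, then convert the dtype once no NaN values remain.
df = df.rename(columns={"city": "home_city"})
df["age"] = df["age"].astype("int")

print(df)
```

The order matters in this sketch: duplicates are removed before the mean is computed so that repeated rows do not skew it, and the integer conversion comes last because it fails on missing values.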

Data Transformation

  • Min-Max scaling normalizes data using the formula $X_{scaled} = \frac{X - X_{min}}{X_{max} - X_{min}}$.
    df['column'] = (df['column'] - df['column'].min()) / (df['column'].max() - df['column'].min())
    
  • Z-Score standardization normalizes data using the formula $X_{standardized} = \frac{X - \mu}{\sigma}$.
    df['column'] = (df['column'] - df['column'].mean()) / df['column'].std()
    
  • Discretization converts numerical data into categorical data.
    bins = [0, 10, 20, 30, 40, 50]
    labels = ['0-10', '10-20', '20-30', '30-40', '40-50']
    df['column'] = pd.cut(df['column'], bins=bins, labels=labels, right=False)
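
The three transformations above can be tried side by side on a toy column. This is a sketch under stated assumptions: the `score` column and its values are made up, and each result is written to a new column so the original data stays available.

```python
import pandas as pd

# Hypothetical scores; the values are invented for illustration.
df = pd.DataFrame({"score": [5, 12, 27, 33, 48]})
col = df["score"]

# Min-max scaling: maps the smallest value to 0 and the largest to 1.
df["score_minmax"] = (col - col.min()) / (col.max() - col.min())

# Z-score standardization. Series.std() uses the sample standard
# deviation (ddof=1) by default.
df["score_z"] = (col - col.mean()) / col.std()

# Discretization into left-closed bins [0, 10), [10, 20), ...
bins = [0, 10, 20, 30, 40, 50]
labels = ["0-10", "10-20", "20-30", "30-40", "40-50"]
df["score_bin"] = pd.cut(col, bins=bins, labels=labels, right=False)

print(df)
```

With `right=False`, a value equal to the last bin edge (50 here) falls outside every bin and becomes NaN, so the final edge may need to be set slightly above the data maximum.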
    

Descriptive Statistics

  • Descriptive statistics encompasses methods for organizing, summarizing, and presenting data informatively.

Types of Data

  • Qualitative data has no numerical value.
    • Nominal data are categories without any order.
    • Ordinal data are categories with a specific order.
  • Quantitative data has a numerical value.
    • Discrete data takes countable values, typically whole numbers.
    • Continuous data can take any value within a range.

Measures of Central Tendency

  • Mean: the average ($\mu = \frac{\sum x_i}{N}$).
  • Median: the central value of the ordered data.
  • Mode: the most frequently occurring value.

Measures of Dispersion

  • Range: the difference between the maximum and minimum values.
  • Variance: the average of the squared differences from the mean ($\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$).
  • Standard Deviation: the square root of the variance ($\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$).
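
The measures of central tendency and dispersion defined above can be computed with Python's standard `statistics` module. This is a small sketch on an invented data set, treated as a complete population so that the variance divides by $N$ as in the formulas above.

```python
import statistics

# Hypothetical data set, treated as a complete population.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)            # sum of the values divided by N
median = statistics.median(data)        # middle of the ordered data
mode = statistics.mode(data)            # most frequent value
data_range = max(data) - min(data)      # maximum minus minimum
variance = statistics.pvariance(data)   # population variance (divides by N)
std_dev = statistics.pstdev(data)       # square root of the population variance

print(mean, median, mode, data_range, variance, std_dev)
```

For a sample rather than a full population, `statistics.variance` and `statistics.stdev` divide by $N - 1$ instead.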

Graphical Representations

  • Bar Chart: used for qualitative and discrete quantitative data.
  • Histogram: used for continuous quantitative data.
  • Pie Chart: used for showing proportions.
  • Box Plot: summarizes the distribution of data.

Linear Algebra Exercises

  • E is a finite-dimensional vector space and $u \in \mathcal{L}(E)$ such that $rg(u) = rg(u^2)$, where $rg$ denotes the rank.

  • The goal is to show that $Im(u) \cap Ker(u) = \{0\}$.

  • E is a finite-dimensional vector space and $u,v \in \mathcal{L}(E)$ with $u + v = id_E$ and $rg(u) + rg(v) = dim(E)$.

  • Show that $E = Im(u) \oplus Im(v)$.

  • $E = \mathbb{R}_n[X]$ is the vector space of real polynomials of degree $\leq n$, and $\phi$ is the evaluation map: $\begin{aligned} \phi : E &\longrightarrow \mathbb{R}^{n+1} \\ P &\longmapsto (P(0), P(1), \ldots, P(n)) \end{aligned}$

    • Demonstrate that $\phi$ is an isomorphism.
    • Deduce there exists a unique polynomial $P \in E$ where $P(0) = a_0, P(1) = a_1,..., P(n) = a_n$ for all $(a_0,..., a_n) \in \mathbb{R}^{n+1}$.
  • E is a finite-dimensional vector space and $u \in \mathcal{L}(E)$ with $u^2 = -id_E$.

  • Show that $\dim(E) = 2p$ is even and that, in an appropriate basis, the matrix of $u$ has the block-diagonal form: $M = \begin{pmatrix}0 & -1 & & & & & \\ 1 & 0 & & & & & \\ & & 0 & -1 & & & \\ & & 1 & 0 & & & \\ & & & & \ddots & & \\ & & & & & 0 & -1 \\ & & & & & 1 & 0\end{pmatrix}$

  • E is a finite-dimensional vector space and $u \in \mathcal{L}(E)$ such that $rg(u) = 1$.

  • Show that there is a scalar $\alpha \in \mathbb{K}$ with $u^2 = \alpha u$.

  • E is a $\mathbb{K}$-vector space of finite dimension.

    • Show that $rg(u + v) \leq rg(u) + rg(v)$ for $u, v \in \mathcal{L}(E)$.
    • Show that $rg(u \circ v) \leq min(rg(u), rg(v))$ for $u, v \in \mathcal{L}(E)$.
    • Show that $dim(Ker(u \circ v)) = dim(Ker(v)) + dim(Ker(u) \cap Im(v))$ for $u, v \in \mathcal{L}(E)$.
    • For $u, v \in \mathcal{L}(E)$ where $u \circ v = 0$, show that $rg(u) + rg(v) \leq dim(E)$.
    • For $u, v \in \mathcal{L}(E)$ where $u^2 = u$ and $v^2 = v$, show that $u + v = id_E \Longleftrightarrow Im(u) = Ker(v)$ and $Ker(u) = Im(v)$.
  • E is a vector space of finite dimension $n$, and $u \in \mathcal{L}(E)$:

    • Show that the following assertions are equivalent:
      • $E = Im(u) \oplus Ker(u)$
      • $Im(u) = Im(u^2)$
      • $Ker(u) = Ker(u^2)$
  • E is a vector space of finite dimension $n$, and $u \in \mathcal{L}(E)$ such that $rg(u) = 1$.

  • Show that there exists a basis of $E$ in which the matrix of $u$ has the form: $\begin{pmatrix} 0 & \cdots & 0 & \alpha_1 \\ \vdots & & \vdots & \vdots \\ 0 & \cdots & 0 & \alpha_n \end{pmatrix}$
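
The exercise on the evaluation map $\phi$ over $\mathbb{R}_n[X]$ has a constructive counterpart: because $\phi$ is an isomorphism, the polynomial taking prescribed values at $0, 1, \ldots, n$ can be computed explicitly. The sketch below uses Lagrange interpolation, which is one standard way to invert $\phi$ and is not something the exercise itself prescribes. The helper names (`poly_mul`, `interpolate`, `evaluate`) and the target values are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def interpolate(values):
    """Return coefficients of the polynomial P of degree <= n with P(k) = values[k]."""
    n = len(values) - 1
    result = [Fraction(0)] * (n + 1)
    for k, a_k in enumerate(values):
        # Lagrange basis polynomial L_k: L_k(k) = 1 and L_k(j) = 0 for j != k.
        basis = [Fraction(1)]
        for j in range(n + 1):
            if j != k:
                # Multiply by (X - j) / (k - j).
                basis = poly_mul(basis, [Fraction(-j, k - j), Fraction(1, k - j)])
        for i, c in enumerate(basis):
            result[i] += a_k * c
    return result

def evaluate(coeffs, x):
    """Evaluate a polynomial (lowest degree first) at x using Horner's rule."""
    total = Fraction(0)
    for c in reversed(coeffs):
        total = total * x + c
    return total

targets = [1, 3, 11, 31]  # hypothetical values a_0, ..., a_3
P = interpolate(targets)
print(P)
print([evaluate(P, k) for k in range(len(targets))])
```

Exact rational arithmetic via `Fraction` avoids rounding, so the recovered polynomial reproduces the target values exactly.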

Statistical Inference

  • Point Estimation

    • Concerns some population parameter $\theta$.
    • A single value for $\theta$ is obtained by selecting a suitable statistic and computing it from the given sample.
    • The resulting number is a point estimate of $\theta$.
    • To estimate the mean $\mu$ of a population, use the sample mean:
      • $\hat{\mu} = \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
      • For a sample $x_1, \dots, x_n$, the estimate of $\mu$ is:
        • $\hat{\mu} = \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
        • $\hat{\mu}$ is a point estimator of $\mu$, and $\bar{x}$ is a point estimate of $\mu$.
  • Unbiased Estimators

    • An estimator $\hat{\Theta}$ is an unbiased estimator of $\theta$ if $E(\hat{\Theta}) = \theta$ for every possible value of $\theta$.
    • If $\hat{\Theta}$ is not unbiased, the difference $E(\hat{\Theta}) - \theta$ is called the bias of $\hat{\Theta}$.
      • $\bar{X}$ is an unbiased estimator of $\mu$ since $E(\bar{X}) = \mu$.
      • $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$ is an unbiased estimator of $\sigma^2$ since $E(S^2) = \sigma^2$.
  • Standard Error

    • The standard error of an estimator $\hat{\Theta}$ is its standard deviation:
      • $SE(\hat{\Theta}) = \sigma_{\hat{\Theta}} = \sqrt{V(\hat{\Theta})}$
      • The standard error of $\bar{X}$ is:
        • $SE(\bar{X}) = \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
  • Mean Squared Error

    • The mean squared error (MSE) of an estimator $\hat{\Theta}$ is $MSE(\hat{\Theta}) = E[(\hat{\Theta} - \theta)^2] = V(\hat{\Theta}) + [Bias(\hat{\Theta})]^2$, where $Bias(\hat{\Theta}) = E(\hat{\Theta}) - \theta$.
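
The point estimates above can be computed directly from a sample. This is a minimal sketch with invented measurements. Since $\sigma$ is usually unknown in practice, it substitutes the sample standard deviation $s$ to obtain an estimated standard error, which is an assumption beyond the formula $\sigma / \sqrt{n}$ given above.

```python
import math
import statistics

# Hypothetical sample of n = 5 measurements.
sample = [4.8, 5.1, 5.0, 4.7, 5.4]
n = len(sample)

x_bar = statistics.mean(sample)    # point estimate of mu
s2 = statistics.variance(sample)   # unbiased sample variance (divides by n - 1)
s = math.sqrt(s2)

# Estimated standard error of the sample mean, using s in place of sigma.
se_hat = s / math.sqrt(n)

print(x_bar, s2, se_hat)
```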

The Equations of Einstein

  • These notes discuss Einstein's field equations.
  • Special relativity is reviewed, including the Lorentz transformation and the concept of spacetime in Minkowski space.
  • Special relativity is based on two postulates:
    • The laws of physics are the same in all inertial frames of reference.
    • The speed of light in a vacuum is the same for all observers, regardless of the motion of the light source.
  • An event is something that happens at a specific point in space and time, with coordinates $(t, x, y, z)$ in frame $S$ and $(t', x', y', z')$ in frame $S'$, where $S'$ moves at a constant velocity $v$ relative to $S$.
  • For motion along the $x$-axis, the Lorentz transformation is:
    • $t' = \gamma(t - \frac{vx}{c^2})$
    • $x' = \gamma(x - vt)$
    • $y' = y$
    • $z' = z$
    • $\gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$
  • A point in Minkowski space (an event) is represented by a four-vector, with index $\mu$ running from 0 to 3.
  • $x^\mu = (ct, x, y, z)$
  • The spacetime interval between two events in Minkowski space is given by the Minkowski metric:
    • $ds^2 = -c^2dt^2 + dx^2 + dy^2 + dz^2 = \eta_{\mu\nu} dx^\mu dx^\nu$
    • $\eta_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$
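
A quick numerical check ties the Lorentz transformation to the Minkowski metric: the interval $-c^2 t^2 + x^2 + y^2 + z^2$ of an event, measured from the origin, should be the same in $S$ and $S'$. The sketch below assumes a boost along the $x$-axis. The function names and the event coordinates are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz_boost(t, x, y, z, v):
    """Transform event coordinates from S to S', where S' moves at velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    t_prime = gamma * (t - v * x / C**2)
    x_prime = gamma * (x - v * t)
    return t_prime, x_prime, y, z

def interval_squared(t, x, y, z):
    """Minkowski interval -c^2 t^2 + x^2 + y^2 + z^2, measured from the origin."""
    return -(C * t) ** 2 + x**2 + y**2 + z**2

event = (1.0e-6, 120.0, 5.0, -3.0)  # hypothetical event: t in s; x, y, z in m
boosted = lorentz_boost(*event, v=0.6 * C)

print(interval_squared(*event))
print(interval_squared(*boosted))
```

The two printed values agree up to floating-point rounding, while the individual coordinates $t'$ and $x'$ differ from $t$ and $x$.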

Markdown Quick Start Guide

  • Markdown is a lightweight markup language with plain text formatting syntax.
  • Markdown is widely used in blogs, instant messaging, forums, collaboration software, documentation, README files, and websites.

Markdown Syntax

  • Headers can be written as:
    # Header 1
    ## Header 2
    ### Header 3
    #### Header 4
    ##### Header 5
    ###### Header 6
    
  • Emphasized text can be written as:
    *italic text* or _italic text_
    **bold text** or __bold text__
    ***bold and italic text*** or ___bold and italic text___
  • Ordered lists use:
    1. First item
    2. Second item
    3. Third item
    
  • Unordered lists use:
    - First item
    - Second item
    - Third item
  • Links use:
    [Link text](URL)
    [Link Text](URL "Title")
    
  • Images use:
    ![Alt Text](URL)
    ![Alt Text](URL "Title")
    
  • Code blocks use:
        ```
        Code here
        ```
    
  • Block quotes use:
    > This is a block quote.
    
  • Horizontal rules use:
    ---
    

Extended Markdown

  • Tables use:
    | Column 1 | Column 2 |
    |---|---|
    | Cell 1 | Cell 2 |
    | Cell 3 | Cell 4 |
    
  • Task lists use:
    - [ ] Incomplete task
    - [x] Task completed
    
  • Footnotes use:
    This is an example of text with a footnote[^1].
    
    [^1]: Here is the content of the footnote.
    

Algorithmic Game Theory

  • Algorithmic game theory is the study of multi-agent decision problems.
  • Agents act in their own self-interest to optimize their own outcomes.
  • The agents are assumed to be "rational".

Examples

  • Typical application areas include:
    • Auctions
    • Network routing
    • Congestion games
    • Social networks
    • Elections

Topics

  • Topics include:
    • Solution Concepts
    • Efficiency of equilibria
    • Mechanism design
    • Learning in games
    • Complexity

Components of Selfish Routing Model

  • $n$ players
  • Each player $i$ controls $\alpha_i$ units of traffic
  • Each player minimizes their own latency
  • A set of resources $E$ represents the edges in a network
  • Each resource $e \in E$ has a latency function $l_e(x)$
    • $l_e(x)$ is a non-decreasing function of the load $x$ on edge $e$
  • Strategy for player $i$: a path $P_i$ from $s_i$ to $t_i$ in the network
  • Total traffic on edge $e$: $f_e = \sum_{i: e \in P_i} \alpha_i$
  • Cost to player $i$: $\sum_{e \in P_i} l_e(f_e)$

Objective function

  • The social cost of a flow $f$ is $\operatorname{SC}(f) = \sum_{e \in E} f_e \cdot l_e(f_e)$.

Wardrop Equilibrium Definition

  • A flow $f$ is at Wardrop equilibrium when no player can reduce their cost by switching to a different path.
  • For every player $i$, every path $P_i$ used by that player, and every alternative path $P'_i$ from $s_i$ to $t_i$: $\sum_{e \in P_i} l_e(f_e) \le \sum_{e \in P'_i} l_e(f_e)$

Price of Anarchy Definition

  • This measures the degradation of system performance due to selfish behavior.

  • $\mathrm{PoA} = \frac{\text{Social cost of worst-case equilibrium}}{\text{Optimal social cost}}$

  • In networks with linear (affine) latency functions, the price of anarchy of Wardrop equilibria is at most $\frac{4}{3}$.
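
The $\frac{4}{3}$ bound can be seen on a tiny instance. The sketch below uses Pigou's two-link network, a standard textbook example that does not appear in these notes: one unit of infinitely divisible traffic travels from $s$ to $t$ over two parallel edges with latencies $l_1(x) = 1$ and $l_2(x) = x$. The function name and the grid resolution are illustrative choices.

```python
from fractions import Fraction

def social_cost(x):
    """Social cost when a fraction x of the traffic uses the edge with latency l(x) = x.

    The remaining 1 - x uses the edge with constant latency 1.
    """
    return x * x + (1 - x) * 1

# Wardrop equilibrium: the variable edge is never slower than the constant edge,
# so all traffic uses it (x = 1) and nobody can improve by switching.
equilibrium_cost = social_cost(Fraction(1))

# Optimal flow: search a fine grid of splits for the minimum social cost.
grid = [Fraction(k, 1000) for k in range(1001)]
optimal_split = min(grid, key=social_cost)
optimal_cost = social_cost(optimal_split)

price_of_anarchy = equilibrium_cost / optimal_cost
print(optimal_split, optimal_cost, price_of_anarchy)
```

The search finds the optimal split at one half with cost $\frac{3}{4}$, so the ratio is exactly $\frac{4}{3}$, which shows the bound is tight for this instance.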

Quantum Physics Equations and Concepts

  • Wave-Particle Duality

    • de Broglie wavelength can be represented as:
      • $\lambda = \frac{h}{p} = \frac{h}{mv}$
      • $\lambda$ is the wavelength
      • $h = 6.626 \times 10^{-34}\,\mathrm{J\,s}$ is Planck's constant
      • $p$ is the momentum
      • $m$ is the mass
      • $v$ is the velocity
    • Energy of a photon can be represented as:
      • $E = h\nu = \frac{hc}{\lambda}$
      • $E$ is the energy
      • $\nu$ is the frequency
      • $c = 3.00 \times 10^8\,\mathrm{m/s}$ is the speed of light in a vacuum
  • Uncertainty Principle

    • The Heisenberg uncertainty principle can be stated as:
      • $\Delta x \Delta p \geq \frac{\hbar}{2}$
      • $\Delta E \Delta t \geq \frac{\hbar}{2}$
      • $\Delta x$ is the uncertainty in position
      • $\Delta p$ is the uncertainty in momentum
      • $\Delta E$ is the uncertainty in energy
      • $\Delta t$ is the uncertainty in time
      • $\hbar = \frac{h}{2\pi}$
  • Quantum Mechanics

    • Schrödinger equation:
      • $i\hbar \frac{\partial}{\partial t}\Psi(r, t) = \hat{H}\Psi(r, t)$
    • Time-independent Schrödinger equation:
      • $\hat{H}\Psi(r) = E\Psi(r)$
  • Atomic Physics

    • Bohr model (hydrogen atom):
      • $E_n = -\frac{13.6 eV}{n^2}$
      • $E_n$ is the energy of the nth level
      • $n$ is the principal quantum number
    • Rydberg formula:
      • $\frac{1}{\lambda} = R(\frac{1}{n_1^2} - \frac{1}{n_2^2})$
      • $R = 1.097 \times 10^7\,\mathrm{m^{-1}}$ is the Rydberg constant
      • $n_1$ and $n_2$ are integers ($n_2 > n_1$)
  • Quantum Numbers

    • Principal quantum number:
      • $n = 1, 2, 3,...$
      • Determines the energy level
    • Azimuthal quantum number:
      • $l = 0, 1, 2,..., (n-1)$
      • Determines the shape of the orbital (s, p, d, f)
    • Magnetic quantum number:
      • $m_l = -l, -l+1,..., 0,..., l-1, l$
      • Determines the orientation of the orbital in space
    • Spin quantum number:
      • $m_s = +\frac{1}{2}, -\frac{1}{2}$
      • Determines the intrinsic angular momentum of the electron (spin up or spin down)
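
Several of the formulas above can be cross-checked numerically. The sketch below uses the constants quoted in these notes, plus two standard values that are not in the notes (the electron mass and the joule-per-electronvolt conversion). The function names and the chosen electron speed are illustrative.

```python
H = 6.626e-34        # Planck's constant in J*s
C = 3.00e8           # speed of light in m/s
RYDBERG = 1.097e7    # Rydberg constant in 1/m
EV = 1.602e-19       # joules per electronvolt (standard value, not from the notes)
M_E = 9.109e-31      # electron mass in kg (standard value, not from the notes)

def de_broglie_wavelength(mass, velocity):
    """lambda = h / (m v), valid for non-relativistic speeds."""
    return H / (mass * velocity)

def photon_energy(wavelength):
    """E = h c / lambda, in joules."""
    return H * C / wavelength

def bohr_energy_ev(n):
    """Energy of the n-th level of hydrogen in the Bohr model, in eV."""
    return -13.6 / n**2

def rydberg_wavelength(n1, n2):
    """Wavelength for a transition between levels n1 and n2 (n2 > n1), in metres."""
    return 1.0 / (RYDBERG * (1.0 / n1**2 - 1.0 / n2**2))

electron_lambda = de_broglie_wavelength(M_E, 1.0e6)   # electron at 10^6 m/s
h_alpha = rydberg_wavelength(2, 3)                    # Balmer-alpha line
h_alpha_ev = photon_energy(h_alpha) / EV              # photon energy in eV
bohr_gap_ev = bohr_energy_ev(3) - bohr_energy_ev(2)   # level difference in eV

print(electron_lambda, h_alpha, h_alpha_ev, bohr_gap_ev)
```

The photon energy obtained from the Rydberg wavelength and the energy gap between the Bohr levels $n = 2$ and $n = 3$ agree to within rounding of the constants, both close to 1.89 eV.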
