Association Rule Mining


Questions and Answers

What is the Italian word for 'explanation'?

  • la società
  • la faccia
  • la forma
  • la spiegazione (correct)

Which of the following Italian words means 'to present' or 'to introduce'?

  • ricevere
  • presentare (correct)
  • approfondire
  • dichiarare

What does 'la faccia' mean in English?

  • The form
  • The gift
  • The face (correct)
  • The society

What is the Italian word for 'society'?

  • la società (correct)

Which Italian word corresponds to 'to receive'?

  • ricevere (correct)

What does 'il regalo' mean?

  • The gift (correct)

Which of the following is the Italian term for 'humanity'?

  • l'umanità (correct)

What does 'il pianeta' refer to?

  • The planet (correct)

What is the meaning of 'dichiarare'?

  • To declare (correct)

Which of the following Italian words means 'to accompany'?

  • accompagnare (correct)

Flashcards

La spiegazione

The explanation

Approfondire

To deepen; to go further into

La faccia

The face

La forma

The form

La società

The society

La memoria

The memory

Il regalo

The gift

Il Natale

Christmas

Archeologico

Archeological

Il contenuto

The content

Study Notes

Association Rule Mining

  • Association rule mining discovers rules that predict the occurrence of items based on the other items in a transaction of a sales database.
  • An association rule is an implication of the form $X \Rightarrow Y$, where $X$ and $Y$ are itemsets that do not intersect.
  • For example, the rule $(Tires, Accessories) \Rightarrow (Automotive Service)$ states that customers who purchase tires and auto accessories also get automotive service done 98% of the time.

Association Rule Definition

  • $I = \{i_1, i_2, \dots, i_m\}$ is the set of all items.
  • $D$ is the database of transactions, where each transaction has a unique ID and consists of a subset of items from $I$
  • An implication has the form $X \Rightarrow Y$, where $X, Y \subset I$ and $X \cap Y = \emptyset$

Market Basket Analysis Example

  • Each transaction ID is associated with the set of items bought in that transaction.
  • Example rules include: $\{A\} \Rightarrow \{C\}$, $\{B\} \Rightarrow \{C\}$, and $\{A, B\} \Rightarrow \{C\}$.

Rule Evaluation Metrics: Support

  • Support measures the percentage of transactions containing both $X$ and $Y$.
  • $s(X \Rightarrow Y) = \frac{\text{Number of transactions containing X and Y}}{\text{Total number of transactions}}$

Rule Evaluation Metrics: Confidence

  • Confidence measures how often items in Y appear in transactions that contain X.
  • $c(X \Rightarrow Y) = \frac{\text{Number of transactions containing X and Y}}{\text{Number of transactions containing X}}$

Example Calculations

  • $\text{Support}(\{A\} \Rightarrow \{C\})$ is calculated as $2/5 = 0.4$.
  • $\text{Confidence}(\{A\} \Rightarrow \{C\})$ is calculated as $2/3 \approx 0.67$.
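The two metrics can be sketched in a few lines of Python. The transaction table behind the figures above is not reproduced in these notes, so the `transactions` list below is hypothetical, chosen only so that it yields the same 2/5 and 2/3 values; the helper names are likewise illustrative.

```python
# Hypothetical transactions (not from the source), chosen so that
# support({A} => {C}) = 2/5 and confidence({A} => {C}) = 2/3.
transactions = [
    {"A", "B", "C"},
    {"A", "C"},
    {"A", "D"},
    {"B", "E"},
    {"D", "E"},
]


def count_containing(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(x, y, transactions):
    """Fraction of all transactions that contain both X and Y."""
    return count_containing(x | y, transactions) / len(transactions)


def confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    return count_containing(x | y, transactions) / count_containing(x, transactions)


s = support({"A"}, {"C"}, transactions)
c = confidence({"A"}, {"C"}, transactions)
print(round(s, 2), round(c, 2))
```

Both functions count the transactions containing $X \cup Y$; they differ only in the denominator, which is why confidence is never smaller than support.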

Association Rule Mining Task

  • The task is to find all association rules $X \Rightarrow Y$ that meet minimum support and confidence thresholds.
  • That is, all rules with $s(X \Rightarrow Y) \geq \text{min\_sup}$ and $c(X \Rightarrow Y) \geq \text{min\_conf}$.

Two Step Approach

  • First, find the frequent itemsets.
  • Then, generate high-confidence rules from them.

Frequent Itemset Generation

  • An itemset is frequent if its support is greater than or equal to the min_sup threshold.

Brute-Force Approach and Complexity

  • Every itemset in the lattice is considered a candidate.
  • All possible non-empty itemsets ($2^d - 1$ for $d$ items) are generated, and the support of each one is counted by scanning the database.
  • This approach has a complexity of $O(NMw)$, where $N$ is the number of transactions, $M$ is the number of candidate itemsets, and $w$ is the maximum transaction width.

Frequent Itemset Generation Strategies

  • The number of candidates ($M$) can be reduced via complete search (Apriori, Eclat, and FP-growth) or focused search (MaxMiner, Pincer Search, and H-Mine).
  • The number of transactions ($N$) can be reduced (Partition and Sampling).
  • The number of comparisons ($NM$) can be reduced by using efficient data structures.
  • The overhead of each comparison can also be reduced.

The Apriori Principle

  • All subsets of a frequent itemset are frequent, and all supersets of an infrequent itemset are infrequent.
  • If {A,B} is a frequent itemset, then both {A} and {B} are also frequent itemsets.
  • If {A} is NOT a frequent itemset, then neither is {A,B}.
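The principle is commonly stated as the anti-monotone property of support. In the sketch below, $s(X)$ denotes the support of a single itemset $X$, which is a small extension of the rule notation used in these notes:

```latex
\forall X, Y \subseteq I:\quad X \subseteq Y \;\Longrightarrow\; s(X) \geq s(Y)
```

The inequality holds because every transaction that contains $Y$ also contains each subset $X$ of $Y$.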

Apriori Algorithm Steps

  • First, the frequent itemsets of length 1 are generated.
  • The algorithm then iteratively generates candidate itemsets of length $k+1$ from the frequent itemsets of length $k$:
    • Candidates with infrequent subsets are pruned.
    • The support for each candidate is counted, and infrequent candidates are eliminated.
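The steps above can be sketched as follows. This is a minimal, unoptimized illustration rather than a reference implementation: the function name `apriori`, the join strategy, and the sample `transactions` are assumptions, and `min_sup` is treated as an absolute count.

```python
from itertools import combinations


def apriori(transactions, min_sup):
    """Return {itemset: count} for every itemset with count >= min_sup."""

    def count_frequent(candidates):
        # One database scan: count each candidate, keep the frequent ones.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        return {c: n for c, n in counts.items() if n >= min_sup}

    items = {item for t in transactions for item in t}
    frequent = count_frequent({frozenset([i]) for i in items})
    result = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        frequent = count_frequent(candidates)
        result.update(frequent)
        k += 1
    return result


# Hypothetical data, not taken from the source.
transactions = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "E"}, {"D", "E"}]
result = apriori(transactions, min_sup=2)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

The prune step is where the Apriori principle pays off: a candidate is discarded without any database scan as soon as one of its subsets is known to be infrequent.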

Apriori Example

  • With $\text{min\_sup} = 2$, the frequent itemsets of a transaction database are determined.
  • {A}, {B}, {C}, {D}, {E}, and {F} meet the minimum support requirement.

Rule Generation Steps

  • For each frequent itemset $L$, generate all non-empty proper subsets.
  • For every non-empty proper subset $s$ of $L$, generate the rule $s \Rightarrow (L - s)$ if $\frac{\text{support}(L)}{\text{support}(s)} \geq \text{min\_conf}$.

Rule Generation Example

Assume $\text{min\_conf} = 0.7$, $L = \{A, B, C\}$, and the following supports:

  • support({A, B, C}) = $0.1$
  • support({A, B}) = $0.3$
  • support({A, C}) = $0.4$
  • support({B, C}) = $0.2$
  • support({A}) = $0.8$
  • support({B}) = $0.5$
  • support({C}) = $0.3$

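The candidate rules and their confidences are listed below. None of them reaches the threshold of 0.7, so no rule is generated from $L$: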

  • {A, B} $\rightarrow$ {C} (Confidence is 0.33)
  • {A, C} $\rightarrow$ {B} (Confidence is 0.25)
  • {B, C} $\rightarrow$ {A} (Confidence is 0.5)
  • {A} $\rightarrow$ {B, C} (Confidence is 0.125)
  • {B} $\rightarrow$ {A, C} (Confidence is 0.2)
  • {C} $\rightarrow$ {A, B} (Confidence is 0.33)
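The confidences above can be recomputed with a short script. The supports are copied as given in the example; the function name `candidate_rules` is illustrative.

```python
from itertools import combinations

# Supports copied from the example above.
support = {
    frozenset("ABC"): 0.1,
    frozenset("AB"): 0.3,
    frozenset("AC"): 0.4,
    frozenset("BC"): 0.2,
    frozenset("A"): 0.8,
    frozenset("B"): 0.5,
    frozenset("C"): 0.3,
}


def candidate_rules(itemset, support):
    """Yield (antecedent, consequent, confidence) for each non-empty proper subset."""
    itemset = frozenset(itemset)
    for size in range(1, len(itemset)):
        for subset in combinations(sorted(itemset), size):
            s = frozenset(subset)
            yield s, itemset - s, support[itemset] / support[s]


min_conf = 0.7
rules = list(candidate_rules("ABC", support))
accepted = [r for r in rules if r[2] >= min_conf]

for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "=>", sorted(consequent), round(conf, 3))
print("rules meeting min_conf:", len(accepted))
```

Each confidence is a single division of two stored supports, so rule generation needs no further pass over the transaction database.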

Apriori Summary

  • Apriori reduces the search space by using the Apriori principle.
  • Once frequent itemsets are identified, association rules are produced.
  • Association rule mining can be used to learn about items commonly purchased together.

Implicit Differentiation

Example 1 - Finding $\frac{dy}{dx}$

  • Given $x^2 + y^2 = 25$, differentiating both sides with respect to $x$ yields $2x + 2y \cdot \frac{dy}{dx} = 0$.
  • Solving for $\frac{dy}{dx}$ gives $\frac{dy}{dx} = -\frac{x}{y}$.

Alternate Solution

  • The equation can be solved explicitly for $y$, giving $y = \pm \sqrt{25 - x^2}$, which leads to the same derivative.
  • $\frac{dy}{dx} = \pm \frac{-x}{\sqrt{25 - x^2}} = \frac{-x}{y}$

Example 2 - Tangents

  • The equation of the tangent line to the ellipse $x^2 - xy + y^2 = 3$ at the point $(-1, 1)$ is found using implicit differentiation.
  • Differentiating yields $2x - (1 \cdot y + x \cdot \frac{dy}{dx}) + 2y \frac{dy}{dx} = 0$.
  • Solving for the slope gives $\frac{dy}{dx} = \frac{y - 2x}{2y - x}$.
  • At $(-1, 1)$, the slope $\frac{dy}{dx} = 1$, so the equation of the tangent line is $y = x + 2$.
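The substitution behind the last step can be written out explicitly, using the point-slope form of a line:

```latex
\left.\frac{dy}{dx}\right|_{(-1,\,1)} = \frac{1 - 2(-1)}{2(1) - (-1)} = \frac{3}{3} = 1,
\qquad
y - 1 = 1 \cdot \bigl(x - (-1)\bigr) \;\Longrightarrow\; y = x + 2
```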

Example 3

  • To find $\frac{d^2y}{dx^2}$ if $x^4 + y^4 = 16$, differentiate implicitly to find $\frac{dy}{dx} = -\frac{x^3}{y^3}$.
  • Differentiate a second time using the quotient rule and substitute the first derivative to find $\frac{d^2y}{dx^2} = -\frac{3x^2y^4 + 3x^6}{y^7}$.
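The intermediate steps of the second differentiation, filled in here as a worked sketch, are:

```latex
\frac{d^2y}{dx^2}
  = -\frac{3x^2 \cdot y^3 - x^3 \cdot 3y^2 \frac{dy}{dx}}{y^6}
  = -\frac{3x^2 y^3 + \frac{3x^6}{y}}{y^6}
  = -\frac{3x^2 y^4 + 3x^6}{y^7}
  = -\frac{3x^2 (x^4 + y^4)}{y^7}
  = -\frac{48x^2}{y^7}
```

The last equality uses the original equation $x^4 + y^4 = 16$.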

Linear Algebra

Space Definition

  • A vector space $E$ over a field $\mathbb{K}$ (usually $\mathbb{R}$ or $\mathbb{C}$) is a set equipped with two operations: vector addition and scalar multiplication.
  • Vector addition: $E \times E \rightarrow E$, denoted $(u, v) \mapsto u + v$.
  • Scalar multiplication: $\mathbb{K} \times E \rightarrow E$, denoted $(\lambda, u) \mapsto \lambda u$.

The operations must satisfy eight axioms:

  • Associativity of addition: $\forall u, v, w \in E, (u + v) + w = u + (v + w)$
  • Commutativity of addition: $\forall u, v \in E, u + v = v + u$
  • Identity element of addition: $\exists 0_E \in E, \forall u \in E, u + 0_E = u$
  • Inverse element of addition: $\forall u \in E, \exists -u \in E, u + (-u) = 0_E$
  • Distributivity of scalar multiplication over vector addition: $\forall \lambda \in \mathbb{K}, \forall u, v \in E, \lambda(u + v) = \lambda u + \lambda v$
  • Distributivity of scalar multiplication over addition in the field: $\forall \lambda, \mu \in \mathbb{K}, \forall u \in E, (\lambda + \mu)u = \lambda u + \mu u$
  • Associativity of scalar multiplication: $\forall \lambda, \mu \in \mathbb{K}, \forall u \in E, \lambda(\mu u) = (\lambda \mu)u$
  • Identity element of scalar multiplication: $\forall u \in E, 1_{\mathbb{K}} u = u$

Common Examples for Vector Spaces

  • $\mathbb{R}^n$: the set of n-tuples of real numbers, with component-wise addition and scalar multiplication.
  • $\mathbb{C}^n$: the set of n-tuples of complex numbers.
  • $\mathbb{K}[X]$: the set of polynomials with coefficients in $\mathbb{K}$.
  • $\mathcal{M}_{n, m}(\mathbb{K})$: the set of $n \times m$ matrices with coefficients in $\mathbb{K}$.
  • $\mathcal{F}(X, \mathbb{K})$: the set of functions from $X$ to $\mathbb{K}$.

Vector Subspaces

  • A subset $F$ of a vector space $E$ is a vector subspace if it satisfies the following conditions.
  • $F$ is non-empty: $F \neq \emptyset$. In practice, verify that $0_E \in F$.
  • $F$ is stable under vector addition: $\forall u, v \in F, u + v \in F$.
  • $F$ is stable under scalar multiplication: $\forall \lambda \in \mathbb{K}, \forall u \in F, \lambda u \in F$.
  • An equivalent condition is: $F \neq \emptyset$ and $\forall u, v \in F, \forall \lambda \in \mathbb{K}, \lambda u + v \in F$.

Linear Combinations

  • A linear combination of the vectors $v_1, \dots, v_n$ is a vector of the form $\sum_{i=1}^n \lambda_i v_i$ with $\lambda_i \in \mathbb{K}$.
  • $\text{Vect}(v_1, \dots, v_n)$ is the vector subspace made up of all linear combinations of $v_1, \dots, v_n$.
  • The family $(v_1, \dots, v_n)$ is a generating family of $E$ if $\text{Vect}(v_1, \dots, v_n) = E$.

Linear independence

  • The family $(v_1, \dots, v_n)$ is free (linearly independent) if the only linear combination equal to the zero vector is the one whose coefficients are all zero.
  • This can be written as $\lambda_1 v_1 + \dots + \lambda_n v_n = 0_E \implies \lambda_1 = \dots = \lambda_n = 0$.
  • A family that is not free is said to be linked (linearly dependent).
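One practical way to test whether a finite family in $\mathbb{R}^n$ is free is to compare its rank with the number of vectors. The sketch below uses Gaussian elimination over exact fractions, a technique not mentioned in the notes; the function names `rank` and `is_free` are illustrative.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(n_cols):
        # Find a row at or below r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate the column entry in every row below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_free(vectors):
    """A family is free exactly when its rank equals the number of vectors."""
    return rank(vectors) == len(vectors)


print(is_free([(1, 0, 0), (0, 1, 0), (1, 1, 1)]))
print(is_free([(1, 2, 3), (2, 4, 6)]))
```

In the second family the second vector is twice the first, so a non-trivial combination gives the zero vector and the family is not free.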

Bases and Dimension

  • A basis of $E$ is a family of vectors that is both free and generating.
  • If $E$ admits a finite basis, every basis of $E$ has the same number of elements, called the dimension of $E$: $\dim(E)$.
  • If $E$ admits no finite basis, $E$ has infinite dimension.

Important theorems

  • A basis can be extracted from every generating family. Every free family can be completed into a basis (incomplete basis theorem).
  • If $F$ is a vector subspace of a finite-dimensional space $E$, then $\dim(F) \leq \dim(E)$. If in addition $\dim(F) = \dim(E)$, then $F = E$.

Sums of Vector Subspaces

  • Let $F$ and $G$ be two vector subspaces of $E$.
  • Their sum is $F + G = \{u + v \mid u \in F, v \in G\}$, which is a vector subspace of $E$.
  • The sum is direct if $F \cap G = \{0_E\}$; it is then written $F \oplus G$.
  • In a direct sum $F \oplus G$, every vector of $F + G$ can be written in a unique way as the sum of a vector of $F$ and a vector of $G$.
  • If $E = F \oplus G$, then $F$ and $G$ are said to be supplementary (complementary) subspaces of $E$.

Grassmann Formula

  • If $F$ and $G$ are finite-dimensional vector subspaces: $\dim(F + G) = \dim(F) + \dim(G) - \dim(F \cap G)$.
  • In particular, $\dim(F \oplus G) = \dim(F) + \dim(G)$.
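As an illustration (the subspaces below are chosen for this example and do not come from the notes), take two coordinate planes of $\mathbb{R}^3$:

```latex
F = \{(x, y, 0)\}, \quad G = \{(x, 0, z)\}, \quad F \cap G = \{(x, 0, 0)\}
\;\Longrightarrow\;
\dim(F + G) = 2 + 2 - 1 = 3
```

So $F + G = \mathbb{R}^3$, but the sum is not direct because $F \cap G$ is a line rather than $\{0_E\}$.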

Chemical Kinetics

Reaction Rates

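  • For the general reaction $aA + bB \rightarrow cC + dD$, the rate is expressed through the change in concentration of any reactant or product.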
  • Equation: Rate $= -\frac{1}{a} \frac{d[A]}{dt} = -\frac{1}{b} \frac{d[B]}{dt} = \frac{1}{c} \frac{d[C]}{dt} = \frac{1}{d} \frac{d[D]}{dt}$

Rate Law

  • $\text{Rate} = k[A]^x[B]^y$
    • $k$ is the rate constant.
    • The rate constant does not depend on the concentrations of the species present.
    • $x$ and $y$ are the reaction orders with respect to A and B.
    • $(x + y)$ is the overall reaction order.

Integrated Rate Laws

  • The rate law, integrated rate law, linear plot, slope, and half-life all depend on the order of the reaction.
  • Zero order: $\text{Rate} = k$ and $[A]_t = -kt + [A]_0$, so a plot of $[A]_t$ vs. $t$ is linear with slope $-k$.
  • First order: $\text{Rate} = k[A]$ and $\ln[A]_t = -kt + \ln[A]_0$, so a plot of $\ln[A]_t$ vs. $t$ is linear with slope $-k$.
  • Second order: $\text{Rate} = k[A]^2$ and $\frac{1}{[A]_t} = kt + \frac{1}{[A]_0}$, so a plot of $\frac{1}{[A]_t}$ vs. $t$ is linear with slope $k$.
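The integrated rate laws above can be evaluated directly. The sketch below also includes the standard half-life expressions for each order, which the notes mention but do not list; the function names and numerical values are illustrative assumptions.

```python
import math


def concentration(order, a0, k, t):
    """[A]_t from the integrated rate law of the given order (0, 1 or 2)."""
    if order == 0:
        return a0 - k * t  # only meaningful while the result is >= 0
    if order == 1:
        return a0 * math.exp(-k * t)  # equivalent to ln[A]_t = -kt + ln[A]_0
    if order == 2:
        return 1.0 / (k * t + 1.0 / a0)
    raise ValueError("order must be 0, 1 or 2")


def half_life(order, a0, k):
    """Time at which [A]_t = [A]_0 / 2 (standard textbook expressions)."""
    if order == 0:
        return a0 / (2.0 * k)
    if order == 1:
        return math.log(2.0) / k
    if order == 2:
        return 1.0 / (k * a0)
    raise ValueError("order must be 0, 1 or 2")


# Hypothetical values: [A]_0 = 1.0 mol/L, k = 0.05 in the units matching each order.
for order in (0, 1, 2):
    t_half = half_life(order, 1.0, 0.05)
    print(order, round(t_half, 3), round(concentration(order, 1.0, 0.05, t_half), 3))
```

Only the first-order half-life is independent of the initial concentration, which is one way to recognize first-order behavior in data.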

Collision Theory

  • For a reaction to occur, molecules must collide with:
    • sufficient energy
    • proper orientation

Arrhenius Equation

  • $k = Ae^{\frac{-E_a}{RT}}$

  • Two-temperature form: $\ln(k_1) - \ln(k_2) = \frac{E_a}{R}\left(\frac{1}{T_2} - \frac{1}{T_1}\right)$
  • $k$ = rate constant
  • $A$ = frequency factor
  • $E_a$ = activation energy
  • $T$ = temperature in kelvin
  • $R = 8.314 \text{ J/(mol} \cdot \text{K)}$
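The two-temperature form is typically used to estimate $E_a$ from rate constants measured at two temperatures. The sketch below generates two rate constants from the Arrhenius equation and then recovers the activation energy; the function names and all numerical inputs are hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def rate_constant(A, Ea, T):
    """Arrhenius equation: k = A * exp(-Ea / (R * T)), with Ea in J/mol and T in K."""
    return A * math.exp(-Ea / (R * T))


def activation_energy(k1, T1, k2, T2):
    """Solve ln(k1) - ln(k2) = (Ea / R) * (1/T2 - 1/T1) for Ea."""
    return R * (math.log(k1) - math.log(k2)) / (1.0 / T2 - 1.0 / T1)


# Hypothetical inputs: A = 1e13 1/s, Ea = 50 kJ/mol, temperatures 300 K and 350 K.
k1 = rate_constant(1e13, 50_000.0, 300.0)
k2 = rate_constant(1e13, 50_000.0, 350.0)
Ea_est = activation_energy(k1, 300.0, k2, 350.0)
print(k2 > k1, round(Ea_est))
```

The frequency factor cancels when the two logarithms are subtracted, which is why $A$ does not appear in the two-temperature form.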

Reaction Mechanisms

  • Elementary step = a single step in a reaction
  • Molecularity = the number of molecules taking part in an elementary step
    • Unimolecular: 1
    • Bimolecular: 2
    • Termolecular: 3
  • Rate-determining step = the slowest step of the reaction
  • Intermediates: formed in one step and consumed in a later one (they do not appear in the overall reaction)
  • Catalysts: consumed in one step and regenerated in a later one (they do not appear in the overall reaction)

Vector Functions

Parametric Curves

  • A vector function is a map $r: I \subseteq \mathbb{R} \rightarrow \mathbb{R}^n$ that assigns a vector $r(t)$ to each $t$ in the interval $I$.
  • In the plane, $r(t) = (f(t), g(t))$.

Example/Representation

  • The equations $x = t^2$ and $y = t^3$ describe a plane curve with parameter $t$.
  • If $f$ and $g$ are continuous on $I$, the point $(x, y) = (f(t), g(t))$ traces a curve, and $t$ is called the parameter.
  • If $r(t) = (t^2, t^3)$, determine the Cartesian equation of the curve.
  • Eliminating the parameter: for $t \geq 0$, $t = \sqrt{x}$, so $y = (\sqrt{x})^3 = x^{3/2}$. In general, $y^2 = x^3$.

Derivative of a Vector Function

  • $r'(t) = \lim_{h \to 0} \frac{r(t+h) - r(t)}{h}$, provided the limit exists.
  • When it is non-zero, $r'(t)$ is the tangent vector to the curve at $r(t)$.

Integral of a Vector Function

  • For $\overrightarrow{r}(t) = (f(t), g(t))$: $\int \overrightarrow{r}(t) dt = \left( \int f(t) dt, \int g(t) dt \right) + \overrightarrow{C}$
  • $\overrightarrow{C}$ is a constant vector of integration.

Arc Length

  • Formula: $s = \int_{a}^{b} ||\overrightarrow{r}'(t)|| dt = \int_{a}^{b} \sqrt{(f'(t))^2 + (g'(t))^2} dt$
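The arc length integral rarely has a closed form, so it is often approximated numerically. The sketch below applies composite Simpson's rule (a choice made for this example, not a method from the notes) to $r(t) = (t^2, t^3)$ on $[0, 1]$, a case where the exact value $\frac{13^{3/2} - 8}{27} \approx 1.4397$ is available for comparison.

```python
import math


def arc_length(df, dg, a, b, n=1000):
    """Approximate the integral of sqrt(f'(t)^2 + g'(t)^2) over [a, b]
    with composite Simpson's rule on n subintervals (n is made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n

    def speed(t):
        return math.hypot(df(t), dg(t))

    total = speed(a) + speed(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * speed(a + i * h)
    return total * h / 3.0


# r(t) = (t^2, t^3), so f'(t) = 2t and g'(t) = 3t^2.
s = arc_length(lambda t: 2.0 * t, lambda t: 3.0 * t ** 2, 0.0, 1.0)
exact = (13.0 ** 1.5 - 8.0) / 27.0
print(round(s, 4), round(exact, 4))
```

The function takes the derivatives $f'$ and $g'$ rather than $f$ and $g$, since the formula only involves the speed $||\overrightarrow{r}'(t)||$.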

Curvature

  • The unit tangent vector is $\overrightarrow{T}(t) = \frac{\overrightarrow{r}'(t)}{||\overrightarrow{r}'(t)||}$
  • The curvature is $K = \frac{||\overrightarrow{T}'(t)||}{||\overrightarrow{r}'(t)||}$
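As a check on the formula, consider a circle of radius $R$ (an example chosen here, not taken from the notes), parametrized by $\overrightarrow{r}(t) = (R\cos t, R\sin t)$:

```latex
\overrightarrow{r}'(t) = (-R\sin t,\; R\cos t), \quad ||\overrightarrow{r}'(t)|| = R, \quad
\overrightarrow{T}(t) = (-\sin t,\; \cos t), \quad
\overrightarrow{T}'(t) = (-\cos t,\; -\sin t), \quad
K = \frac{||\overrightarrow{T}'(t)||}{||\overrightarrow{r}'(t)||} = \frac{1}{R}
```

The curvature is constant and equal to the reciprocal of the radius, so smaller circles bend more sharply.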
