Questions and Answers
What is the main purpose of technical drawings?
- To serve as a general visual guide without specific measurements.
- To illustrate the aesthetic qualities of a design.
- To provide precise and detailed information for manufacturing or construction. (correct)
- To create artistic representations of objects.
Which drawing type is best suited for illustrating how parts of an assembly fit together?
- Schematic drawing
- Perspective drawing
- Orthographic projection
- Isometric drawing (correct)
What type of information is typically included in a detailed parts list associated with a technical drawing?
- The historical context of the design.
- Materials, quantities, and part numbers. (correct)
- Marketing slogans related to the product.
- A subjective evaluation of the design's aesthetics.
How does the use of standardized symbols in technical drawings aid in communication?
When scaling technical drawings, what consideration is most critical to maintain?
Which of the following is a key advantage of using Computer-Aided Design (CAD) software over manual drafting?
If a technical drawing shows an object with dimensions in a ratio of 1:2, what does this indicate?
What is the purpose of section views in technical drawings?
Why is it important to include tolerance information in technical drawings?
How does understanding the principles of orthographic projection assist in interpreting technical drawings?
Flashcards
What is technical drawing?
Technical drawing is a means of expressing ideas graphically rather than in words. It is one of the basic means of communication between people.
What is an exploded drawing?
An exploded drawing is a three-dimensional drawing that shows the piece or the product as a whole, presenting its general features without specifying the materials used or the dimensions and measurements.
What is an orthographic drawing?
An orthographic drawing is a two-dimensional drawing of a single piece shown from different views, giving the sizes of the piece accurately.
What is the size of A4 paper?
Renewable Energy
What is the 'Axis'?
Study Notes
Understanding Deep Neural Networks
- Deep Neural Networks (DNNs) have shown great success across various fields.
- The architecture, training, and interpretability are key aspects of understanding DNNs.
DNN Architecture
- DNNs are composed of multiple layers, each transforming input data in a specific way.
- Input Layer: Receives the initial, unprocessed data.
- Hidden Layers: Apply non-linear transformations to the input data.
- Output Layer: Generates the final result/prediction.
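As a concrete illustration, here is a minimal NumPy sketch of a forward pass through a small fully connected network; the layer sizes, random weights, and the choice of ReLU in the hidden layer are assumptions made only for this example.

```python
import numpy as np

def relu(x):
    # ReLU activation: max(0, x) applied element-wise
    return np.maximum(0.0, x)

def forward(x, params):
    """Forward pass through a small fully connected network.

    params is a list of (W, b) pairs, one per layer.
    """
    a = x
    for W, b in params[:-1]:
        a = relu(W @ a + b)        # hidden layers: affine map + non-linearity
    W_out, b_out = params[-1]
    return W_out @ a + b_out       # output layer: affine map only in this sketch

# Hypothetical sizes: 4 inputs -> 8 hidden units -> 3 outputs
rng = np.random.default_rng(0)
params = [(rng.normal(size=(8, 4)), np.zeros(8)),
          (rng.normal(size=(3, 8)), np.zeros(3))]
print(forward(rng.normal(size=4), params))
```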
Activation Functions
- Activation functions introduce non-linearity, which allows the network to learn intricate patterns.
- Sigmoid: $\sigma(x) = \frac{1}{1 + e^{-x}}$
- ReLU (Rectified Linear Unit): $f(x) = \max(0, x)$
- Tanh (Hyperbolic Tangent): $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$
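A short Python sketch of these three activation functions (NumPy is assumed, used only for vectorized evaluation):

```python
import numpy as np

def sigmoid(x):
    # sigma(x) = 1 / (1 + e^{-x})
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # f(x) = max(0, x)
    return np.maximum(0.0, x)

def tanh(x):
    # tanh(x) = (e^x - e^{-x}) / (e^x + e^{-x})
    return np.tanh(x)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x), relu(x), tanh(x))
```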
Parameters
- Weights (W) and biases (b) are associated with each layer and are refined throughout the training process.
Training DNNs
- A loss function evaluates the difference between the predicted output and the actual output.
- Mean Squared Error (MSE): $MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
- Cross-Entropy: $H(p, q) = -\sum_{x} p(x) \log q(x)$
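A minimal Python sketch of both loss functions as written above; the small `eps` guard against $\log(0)$ is an added assumption, not part of the formula:

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error: (1/n) * sum_i (y_i - yhat_i)^2
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(p, q, eps=1e-12):
    # H(p, q) = -sum_x p(x) log q(x); eps guards against log(0)
    return -np.sum(p * np.log(q + eps))

print(mse(np.array([1.0, 2.0]), np.array([1.5, 1.5])))            # 0.25
print(cross_entropy(np.array([1.0, 0.0]), np.array([0.9, 0.1])))  # ~0.105
```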
Optimization Algorithms
- These algorithms adjust the network's parameters to minimize the loss function.
- Gradient Descent: Adjusts parameters in the opposite direction of the loss function's gradient.
- Adam: An adaptive optimization algorithm that combines the benefits of AdaGrad and RMSProp.
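A toy sketch of plain gradient descent on a one-dimensional quadratic loss; the loss function, learning rate, and step count are hypothetical choices for illustration:

```python
import numpy as np

def gradient_descent(grad_fn, theta0, lr=0.1, steps=100):
    """Plain gradient descent: move parameters against the loss gradient.

    grad_fn(theta) returns the gradient of the loss at theta.
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        theta = theta - lr * grad_fn(theta)
    return theta

# Toy loss L(theta) = (theta - 3)^2 with gradient 2 * (theta - 3); minimum at 3
print(gradient_descent(lambda t: 2 * (t - 3.0), theta0=[0.0]))  # ~[3.0]
```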
Regularization
- These techniques prevent overfitting by adding a penalty term to the loss function.
- L1 Regularization: Adds the sum of the absolute values of the weights to the loss function.
- L2 Regularization: Adds the sum of the squares of the weights to the loss function.
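A short sketch of adding an L1 or L2 penalty term to a base loss value; the weight vector and the penalty strength `lam` are illustrative assumptions:

```python
import numpy as np

def regularized_loss(base_loss, w, lam=0.01, kind="l2"):
    """Add an L1 or L2 penalty on the weights to a base loss value."""
    if kind == "l1":
        penalty = lam * np.sum(np.abs(w))   # L1: sum of |w_i|
    else:
        penalty = lam * np.sum(w ** 2)      # L2: sum of w_i^2
    return base_loss + penalty

w = np.array([0.5, -2.0, 1.0])
print(regularized_loss(1.0, w, kind="l1"))  # 1.0 + 0.01 * 3.5
print(regularized_loss(1.0, w, kind="l2"))  # 1.0 + 0.01 * 5.25
```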
Interpretability
- Visualizations of hidden layers' activations provide insights into what the network has learned.
- Saliency Maps: Highlight input features that the network considers most relevant for predictions.
- Explainable AI (XAI): Techniques that aim to make DNNs more transparent and understandable.
Conclusion
- Deep Neural Networks are powerful tools for solving complex problems.
- Understanding their architecture, training process, and interpretability helps in applying them effectively.
Comparison of Estimator Properties
- Estimators are statistical functions that estimate population parameters from sample data.
- Key properties to evaluate estimators include bias, variance, MSE, convergence, consistency, and efficiency.
Bias
- Defined as: $bias(\hat{\theta}) = E(\hat{\theta}) - \theta$
- $E(\hat{\theta})$ is the expected value of the estimator.
- $\theta$ is the true value of the parameter.
- An estimator is unbiased if $bias(\hat{\theta}) = 0$.
- An estimator is biased if $bias(\hat{\theta}) \neq 0$.
Variance
- Defined as: $variance(\hat{\theta}) = E[(\hat{\theta} - E(\hat{\theta}))^2]$
- Variance measures the spread of estimates around their mean.
- Lower variance indicates higher precision.
Mean Squared Error (MSE)
- Defined as: $MSE(\hat{\theta}) = E[(\hat{\theta} - \theta)^2]$
- MSE measures the overall quality of an estimator, considering both bias and variance.
- $MSE(\hat{\theta}) = variance(\hat{\theta}) + bias(\hat{\theta})^2$.
- Lower MSE values are preferred.
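A Monte Carlo sketch that estimates bias, variance, and MSE for the (biased) "divide by n" variance estimator and checks that $MSE(\hat{\theta}) \approx variance(\hat{\theta}) + bias(\hat{\theta})^2$; the distribution, sample size, and number of trials are arbitrary choices for illustration:

```python
import numpy as np

# Monte Carlo check of bias, variance, and MSE for the "divide by n"
# variance estimator on N(0, sigma^2) samples (true sigma^2 = 4).
rng = np.random.default_rng(0)
sigma2, n, trials = 4.0, 10, 100_000

estimates = np.array([np.var(rng.normal(0.0, 2.0, size=n))   # ddof=0: biased
                      for _ in range(trials)])

bias = estimates.mean() - sigma2               # E[theta_hat] - theta
variance = estimates.var()                     # E[(theta_hat - E[theta_hat])^2]
mse = np.mean((estimates - sigma2) ** 2)       # E[(theta_hat - theta)^2]

print(bias, variance, mse, variance + bias**2)  # MSE ~ variance + bias^2
```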
Convergence
- An estimator $\hat{\theta}_n$ converges to $\theta$ if $P(|\hat{\theta}_n - \theta| > \epsilon) \rightarrow 0$ as $n \rightarrow \infty$.
- $\epsilon$ is any positive number.
- Convergence ensures the estimator approaches the true parameter value as the sample size grows.
Consistency
- An estimator $\hat{\theta}_n$ is consistent if it converges in probability toward $\theta$.
- Consistency is a desirable property since it indicates the estimator nears the true value as the sample size increases.
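A small simulation sketch of consistency for the sample mean: the fraction of runs with $|\hat{\theta}_n - \theta| > \epsilon$ shrinks as $n$ grows. The distribution, $\epsilon$, and the sample sizes here are arbitrary assumptions:

```python
import numpy as np

# Consistency of the sample mean: P(|theta_hat_n - theta| > eps) shrinks with n.
rng = np.random.default_rng(1)
theta, eps, trials = 1.0, 0.1, 10_000   # true mean of Exp(1) is 1.0

for n in [10, 100, 1000]:
    means = rng.exponential(theta, size=(trials, n)).mean(axis=1)
    prob = np.mean(np.abs(means - theta) > eps)
    print(n, prob)   # empirical probability decreases toward 0 as n increases
```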
Efficiency
- An estimator $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$ if $variance(\hat{\theta}_1) < variance(\hat{\theta}_2)$.
- Efficiency compares the precision of an estimator with that of other possible estimators.
- An efficient estimator has the smallest variance among all unbiased estimators.
Summary
- Bias: $bias(\hat{\theta}) = E(\hat{\theta}) - \theta$
- Variance: $variance(\hat{\theta}) = E[(\hat{\theta} - E(\hat{\theta}))^2]$
- MSE: $MSE(\hat{\theta}) = E[(\hat{\theta} - \theta)^2]$
- Convergence: $P(|\hat{\theta}_n - \theta| > \epsilon) \rightarrow 0$ as $n \rightarrow \infty$
- Consistency: $\hat{\theta}_n$ converges in probability toward $\theta$
- Efficiency: $variance(\hat{\theta}_1) < variance(\hat{\theta}_2)$
Lecture 17: Orthogonality
Definition of Orthogonality
- Vectors $\mathbf{v}$ and $\mathbf{w}$ in $\mathbb{R}^n$ are orthogonal if their dot product is zero: $\mathbf{v} \cdot \mathbf{w} = 0$
Orthogonality to a Subspace
- A vector $\mathbf{v}$ is orthogonal to a subspace $W$ of $\mathbb{R}^n$ if it is orthogonal to every vector in $W$.
- The set of all vectors orthogonal to $W$ is the orthogonal complement of $W$, denoted as $W^{\perp}$
Theorem: Orthogonal Complement is a Subspace
- $W^{\perp}$ is a subspace of $\mathbb{R}^n$
Example: Finding a Basis for $W^{\perp}$
- Given $W = \text{Span} \left\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ -1 \\ 0 \end{bmatrix} \right\}$, find a basis for $W^{\perp}$
- $W^{\perp} = \{ \mathbf{v} \in \mathbb{R}^3 : \mathbf{v} \cdot \mathbf{w} = 0 \text{ for all } \mathbf{w} \in W \}$
- If $\mathbf{v} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}$, then $\mathbf{v} \in W^{\perp}$ if and only if
- $\begin{bmatrix} x \\ y \\ z \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} = 0$ and $\begin{bmatrix} x \\ y \\ z \end{bmatrix} \cdot \begin{bmatrix} 2 \\ -1 \\ 0 \end{bmatrix} = 0$
- This leads to the system of equations:
- $x + 2y + z = 0$
- $2x - y = 0$
- Solving for $x$ and $y$ in terms of $z$ yields $x = -\frac{1}{5}z$ and $y = -\frac{2}{5}z$. Thus:
- $\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} -\frac{1}{5}z \\ -\frac{2}{5}z \\ z \end{bmatrix} = z \begin{bmatrix} -\frac{1}{5} \\ -\frac{2}{5} \\ 1 \end{bmatrix}$
- A basis for $W^{\perp}$ is $\left\{ \begin{bmatrix} -1 \\ -2 \\ 5 \end{bmatrix} \right\}$
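A quick check of this example using SymPy (assumed available): a basis for $W^{\perp}$ is a basis of the null space of the matrix whose rows are the spanning vectors, per the theorem below.

```python
from sympy import Matrix

# Rows of A span W; since (Row A)-perp = Nul A, a basis for W-perp
# is a basis of the null space of A.
A = Matrix([[1, 2, 1],
            [2, -1, 0]])

basis = A.nullspace()
print(basis)                   # [Matrix([[-1/5], [-2/5], [1]])]
print([5 * v for v in basis])  # scaled basis vector: (-1, -2, 5)
```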
Theorem: Relationship Between Row Space and Null Space
- For an $m \times n$ matrix $A$, $(\text{Row } A)^{\perp} = \text{Nul } A$
- $\mathbf{x} \in (\text{Row } A)^{\perp}$ if and only if $\mathbf{x}$ is orthogonal to each row of $A$, which is true if and only if $A\mathbf{x} = \mathbf{0}$, meaning $\mathbf{x} \in \text{Nul } A$
Theorem: Orthogonal Complement of the Orthogonal Complement
- If $W$ is a subspace of $\mathbb{R}^n$, then $(W^{\perp})^{\perp} = W$
Theorem: Decomposition of $\mathbb{R}^n$
- If $W$ is a subspace of $\mathbb{R}^n$, then $\mathbb{R}^n = W \oplus W^{\perp}$
- Every vector $\mathbf{v} \in \mathbb{R}^n$ can be uniquely expressed as $\mathbf{v} = \mathbf{w} + \mathbf{u}$, where $\mathbf{w} \in W$ and $\mathbf{u} \in W^{\perp}$
Orthogonal Projection
- $\mathbf{w}$ is the orthogonal projection of $\mathbf{v}$ onto $W$, denoted as $\text{proj}_W \mathbf{v}$.
- $\mathbf{u}$ is the component of $\mathbf{v}$ orthogonal to $W$.
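A small NumPy sketch of the decomposition $\mathbf{v} = \mathbf{w} + \mathbf{u}$, using the projection matrix $A(A^TA)^{-1}A^T$ for a matrix $A$ whose columns form a basis of $W$; the spanning vectors reuse the earlier example, and the vector $\mathbf{v}$ is an arbitrary choice:

```python
import numpy as np

# Orthogonal projection onto W = Col(A), where A's columns are a basis of W.
# proj_W(v) = A (A^T A)^{-1} A^T v; the remainder v - proj_W(v) lies in W-perp.
A = np.array([[1.0, 2.0],
              [2.0, -1.0],
              [1.0, 0.0]])      # same spanning vectors as the example above
v = np.array([1.0, 1.0, 1.0])

P = A @ np.linalg.inv(A.T @ A) @ A.T
w = P @ v                       # component of v in W
u = v - w                       # component of v in W-perp

print(w + u)    # recovers v
print(A.T @ u)  # ~[0, 0]: u is orthogonal to every column of A
```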
Lecture 19
I. Classification of Problems
- Classification, Regression, Clustering
II. Classification
- Supervised learning
Examples
- Determine type of object in an image
- Determine if an email is spam
- Determine if loan applicant will default
- Given data $x_i$ along with labels $y_i$, learn a function to predict $y$ from $x$
Binary Classification
- Two classes: $y_i \in \{-1, +1\}$
- Learn a function $f$ with $f(x) \in \{-1, +1\}$
- Define a real-valued function $h(x)$
- If $h(x) > 0$, predict $+1$
- Else predict $-1$
- $f(x) = \operatorname{sign}(h(x))$
- Thus, want to learn the function $h(x)$
Linear Classifier
- The simplest option is a linear function
Example
- $h(x) = w^T x + b = w_1 x_1 + w_2 x_2 + b$
The set of points where $h(x) = 0$ is:
- A line ($n=2$)
- A plane ($n=3$)
- A hyperplane ($n>3$)
Geometric Interpretation
- The function $h(x)$ is positive on one side of the line/plane/hyperplane and negative on the other; this surface is the decision boundary
- The vector $w$ is normal to the decision boundary
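A minimal sketch of evaluating such a linear classifier; the weights, bias, and sample points below are hypothetical:

```python
import numpy as np

def predict(X, w, b):
    """Linear classifier: f(x) = sign(w^T x + b), returning +1 or -1."""
    h = X @ w + b                 # positive on one side of the boundary, negative on the other
    return np.where(h > 0, 1, -1)

# Hypothetical weights: the decision boundary is the line x1 + x2 - 1 = 0 in 2-D
w, b = np.array([1.0, 1.0]), -1.0
X = np.array([[2.0, 2.0], [0.0, 0.0]])
print(predict(X, w, b))           # [ 1 -1 ]
```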
Learning
- Given training data $\{(x_i, y_i)\}$, how do we find $w$ and $b$?
Many Approaches
- Perceptron
- Logistic Regression
- Support Vector Machine
Perceptron
- A simple algorithm that was one of the first machine learning algorithms invented (1950s)
Goal
- Find a $w$ and $b$ that correctly classify all the training data
- $w^T x_i + b > 0$ if $y_i = +1$
- $w^T x_i + b < 0$ if $y_i = -1$
Perceptron Learning Algorithm
- Initialize $w$ and $b$ to zero
- Loop through the training data
- If $x_i$ is misclassified:
- $w \leftarrow w + y_i x_i$
- $b \leftarrow b + y_i$
- Repeat steps 2-3 until all data is correctly classified
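A Python sketch of the perceptron learning algorithm as stated above; the toy data set and the epoch cap are assumptions for illustration:

```python
import numpy as np

def perceptron(X, y, max_epochs=100):
    """Perceptron learning on linearly separable data.

    X: (n_samples, n_features) array, y: labels in {-1, +1}.
    Returns (w, b) that classify all training points if the data is separable.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            if y_i * (w @ x_i + b) <= 0:   # misclassified (or on the boundary)
                w += y_i * x_i             # w <- w + y_i x_i
                b += y_i                   # b <- b + y_i
                mistakes += 1
        if mistakes == 0:                  # all points correctly classified
            break
    return w, b

# Tiny separable toy data set (hypothetical)
X = np.array([[2.0, 2.0], [1.5, 2.5], [-1.0, -1.0], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])
w, b = perceptron(X, y)
print(np.sign(X @ w + b))                  # matches y
```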
A Proof
- Let's assume that the data is linearly separable - that is, there exists some $w^*$ and $b^*$ such that:
- $y_i (w^{*T} x_i + b^*) \geq \rho > 0$ for all $i$
- $\rho$ is the margin - how far away the data is from the decision boundary
Proof Continued
- We also want to show that $w_k$ does not grow too fast
- $||w_{k+1}||^2 = ||w_k + y_i x_i||^2 = ||w_k||^2 + 2 y_i w_k^T x_i + ||x_i||^2$
- Since $x_i$ was misclassified, the cross term is non-positive (absorbing the bias into $w$), so $||w_{k+1}||^2 \leq ||w_k||^2 + R^2$, where $R = \max_i ||x_i||$
Final Steps
- Combining the two bounds, the number of mistakes is bounded by $(\frac{R}{\rho})^2$
Problems with Perceptron
- Only works if the data is linearly separable
- Sensitive to outliers
- We need a more powerful algorithm that is less sensitive to outliers
Coming
- Logistic Regression
- Support Vector Machine