Hermann Weyl
“The introduction of numbers as coordinates is an act of violence.”
Difference between a power function and an exponential function.
Why is machine learning / DL essentially just linear algebra? It's all about representing data as numbers; 'models' are just complex mathematical expressions and operations (quote Andrej).
Dense vectors and matrices Cardinality
Write down a vector using your letter of choice.
Notation
The reason I ask is that you should make sure you are consistent; your interviewers will be consciously or subconsciously assessing every aspect of your methodology. So make sure you use tildes, bold, or whatever. Just be consistent.
What is a scalar?
What is a vector?
https://youtu.be/fNk_zzaMoSs?si=X_wVsCTjs3ZMW-Ge
Physics: arrows in space, defined by a length and a direction (any two arrows with the same length and direction are the same vector).
CS Student: lists of numbers, with dimensions = number of components. You could model a house as two dimensions, square feet and price, and thus any house can be represented by a single vector with two components.
Mathematics: generalizes both of these. A vector is anything that can be added together and multiplied by a scalar, i.e. an element of a vector space.
What is a matrix?
A matrix is a grid of numbers. The dimensions of a matrix are usually denoted \( m \times n \) (rows \(\times\) columns).
What is the rank of a matrix?
Matrix Rank: The rank of a matrix is the maximum number of linearly independent row or column vectors in the matrix. It reflects the dimension of the vector space spanned by its rows or columns.
What is
Difference between a coordinate point and a vector.
- Point in space vs. a vector
- Denoted vs
What is the difference between a vector and a matrix?
What is a Unit Vector?
- Explanation: A unit vector is a vector with a magnitude of 1. It represents the direction of a vector without considering its magnitude.
What is a dot product?
Dot product
- How to calculate it: just sum the products of the corresponding components (see the numpy sketch below)
- What it means: The dot product is a measure of how much one vector extends in the direction of another vector.
- If the dot product is positive, the vectors are pointing in generally the same direction.
- If it's negative, they're pointing in generally opposite directions.
- A zero dot product indicates that the vectors are perpendicular.
- Also known as scalar product: because it always outputs a scalar (number) instead of a vector. This is good to keep in mind.
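A minimal numpy sketch (made-up vectors, numpy assumed) of the element-wise calculation:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, -5.0, 6.0])

# Sum of the products of corresponding components
manual = sum(x * y for x, y in zip(a, b))   # 1*4 + 2*(-5) + 3*6 = 12.0

# Equivalent numpy calls
print(manual, np.dot(a, b), a @ b)          # all 12.0
```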
Cosine similarity
- Another interpretation is: the dot product is the product of their magnitudes and the cosine of the angle between them
- This naturally leads to cosine similarity, which, like the dot product, measures the similarity of two vectors but considers only their direction.
- We lose information on magnitude when we normalize by dividing by the magnitudes, but in return we get a measure of similarity that is always between -1 and 1.
“Um, what is a dot product? Okay so, the dot product of two vectors provides a measure of their similarity in terms of magnitude and direction, representing the sum of their element-wise products. Cosine similarity, derived from the dot product, normalizes this by the magnitudes of the vectors, effectively measuring the cosine of the angle between them. It’s crucial in ML for gauging similarity in high-dimensional spaces, like in text analysis or recommendation systems, where the direction of the vectors (representing features or items) matters more than their magnitude.”
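A small sketch of cosine similarity built on the dot product, assuming numpy and made-up vectors:

```python
import numpy as np

def cosine_similarity(a, b):
    # Dot product normalized by both magnitudes: always in [-1, 1]
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])     # same direction, larger magnitude
c = np.array([-1.0, -2.0, -3.0])  # opposite direction

print(cosine_similarity(a, b))  # 1.0  (identical direction, magnitude ignored)
print(cosine_similarity(a, c))  # -1.0 (opposite direction)
```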
What is an inner product?
What is an outer product?
What is Vector Projection?
- Explanation: The vector projection of \( \mathbf{a} \) onto \( \mathbf{b} \) is a vector that represents the component of \( \mathbf{a} \) in the direction of \( \mathbf{b} \).
What is Scalar Projection?
- Explanation: The scalar projection is the length of the vector projection, representing how much of one vector goes in the direction of another.
- This also leads to another interpretation of the dot product, that is:
- The dot product can be seen as the product of the magnitude of one vector and the projection of the other vector onto the first. This is especially useful in physics for calculating work done when a force is applied along a distance.
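For reference, the standard vector and scalar projection formulas of \( \mathbf{a} \) onto \( \mathbf{b} \) are:
\[
\operatorname{proj}_{\mathbf{b}}\mathbf{a} = \frac{\mathbf{a}\cdot\mathbf{b}}{\|\mathbf{b}\|^{2}}\,\mathbf{b},
\qquad
\operatorname{comp}_{\mathbf{b}}\mathbf{a} = \frac{\mathbf{a}\cdot\mathbf{b}}{\|\mathbf{b}\|} = \|\mathbf{a}\|\cos\theta,
\]
so the dot product \( \mathbf{a}\cdot\mathbf{b} \) is the magnitude of \( \mathbf{b} \) times the scalar projection of \( \mathbf{a} \) onto \( \mathbf{b} \), which is the interpretation described above.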
What is the Magnitude (Norm) of a Vector?
- Explanation: The magnitude (norm) \( \|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2} \) measures the length or size of the vector.
What is the Cross Product?
- Explanation: The cross product of two vectors results in a vector that is perpendicular to both and is used in 3D space to determine the area of parallelograms and the direction of normals.
How is Matrix Addition and Subtraction Defined?
- Explanation: Matrix addition/subtraction combines or contrasts the elements of two matrices of the same dimensions.
What is Matrix-Vector Multiplication?
- Explanation: This operation transforms the vector \( \mathbf{v} \) by the matrix \( A \), often representing a change of basis or a linear transformation.
What is Matrix-Matrix Multiplication?
- Explanation: Matrix multiplication combines two matrices, transforming one matrix by another, often used to represent a series of linear transformations.
What is the Determinant of a Matrix?
- Explanation: The determinant provides information about the scaling factor of the linear transformation represented by the matrix and its invertibility.
What is the Inverse of a Matrix?
- Explanation: The inverse of a matrix undoes the transformation represented by the matrix, applicable only for square matrices with non-zero determinants.
What is the Trace of a Matrix?
- Explanation: The trace is the sum of the diagonal elements of a square matrix, which equals the sum of its eigenvalues.
What is the Rank of a Matrix?
- Explanation: The rank indicates the number of linearly independent rows or columns in a matrix, reflecting the dimensionality of the vector space spanned by its rows or columns.
How are Linear Equations Represented in Matrix Form?
- Explanation: This represents a system of linear equations \( A\mathbf{x} = \mathbf{b} \), where \( A \) contains the coefficients, \( \mathbf{x} \) is the vector of variables, and \( \mathbf{b} \) is the vector of constants.
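A short numpy sketch (illustrative values) of solving such a system; `np.linalg.solve` is generally preferred over forming the inverse explicitly:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # coefficient matrix
b = np.array([3.0, 5.0])     # vector of constants

x = np.linalg.solve(A, b)          # solves A x = b directly
x_via_inv = np.linalg.inv(A) @ b   # same answer, but slower and less stable

print(x, np.allclose(A @ x, b))    # [0.8 1.4] True
```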
What does it mean for something to be a ‘linear combination’ of something else?
What makes 2 or more vectors ‘linearly independent’
A set of vectors is said to be linearly independent if no vector in the set can be written as a linear combination of the others. In other words, the only solution to the equation \( c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n = \mathbf{0} \)
is when all the coefficients \( c_i \) are zero. Here, \( \mathbf{0} \) is the zero vector, and \( \mathbf{v}_1, \ldots, \mathbf{v}_n \) are the vectors in the set. Linear independence is a measure of whether a set of vectors spans a vector space without any "extra" vectors that are unnecessary for the span.
Perpendicularity (Orthogonality):
Two vectors are perpendicular or orthogonal if their dot product is zero: \( \mathbf{a} \cdot \mathbf{b} = 0 \).
This means that they meet at a right angle (90 degrees) to each other. In higher dimensions, a set of vectors is orthogonal if every pair of different vectors in the set is orthogonal to each other. Orthogonality is a stronger condition than linear independence; an orthogonal set of non-zero vectors is always linearly independent, but the converse is not necessarily true.
What’s the difference between linearly independent and perpendicular?
- Linear Independence: Does not require any specific angle between vectors. The vectors simply must not be expressible as a linear combination of each other.
- Perpendicularity: Specifically requires that vectors are at a 90-degree angle to each other.
☞ Linear independence is about the capacity of a set of vectors to provide a unique basis for a vector space, whereas perpendicularity is about the geometric relationship between pairs of vectors.
☞ Orthogonal vectors are always linearly independent, but linearly independent vectors are not necessarily orthogonal.
☞ For example, in a two-dimensional plane, any two non-parallel vectors are linearly independent, but they are only perpendicular if they meet at a right angle.
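A minimal numpy check of the distinction, using made-up 2D vectors:

```python
import numpy as np

u = np.array([1.0, 1.0])
v = np.array([1.0, 2.0])   # not parallel to u -> linearly independent
w = np.array([1.0, -1.0])  # orthogonal to u

# Linear independence: stack the vectors as columns and check the rank
print(np.linalg.matrix_rank(np.column_stack([u, v])))  # 2 -> independent

# Orthogonality: the dot product must be exactly zero
print(np.dot(u, v))  # 3.0 -> independent but NOT orthogonal
print(np.dot(u, w))  # 0.0 -> orthogonal (and therefore also independent)
```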
Can you explain the difference between a vector and a matrix?
How do you perform matrix multiplication and what are the conditions for it to be possible?
What is the significance of an identity matrix in linear algebra?
What are eigenvalues and eigenvectors, and why are they important in machine learning?
Can you describe a practical machine learning scenario where eigenvalues and eigenvectors are used?
What is a transpose?
What is pre/post multiplying
##### What is a 'basis'
What is a linear transformation
Do the percentage flip thing and ask to prove how
What is the form of a weighted average?
Normalized weights (each between 0 and 1, summing to 1) multiplied by each value: \( \bar{x} = \sum_i w_i x_i \) with \( \sum_i w_i = 1 \).
What is a tensor?
State the mathematical derivation of the eigenvalue and eigenvector. Explain how to find the eigenvalues of a matrix.
At its core:
\[ A\mathbf{v} = \lambda\mathbf{v} \]
Where:
- \( A \) is a square matrix.
- \( \mathbf{v} \) is an eigenvector of \( A \).
- \( \lambda \) is the eigenvalue corresponding to the eigenvector \( \mathbf{v} \).
What is it actually saying?
This equation says that you can post-multiply the matrix \( A \) by the vector \( \mathbf{v} \), and the result is a scalar multiple of the vector \( \mathbf{v} \) again.
Explain why it's so important for Machine Learning and Deep Learning
- General optimisation;
- Regularisation
- This equation is used in PCA in order to determine the principal components that capture the most variance in the data.
- Finding the steady state distribution of a Markov Chain
Mathematical Derivation of Eigenvalues and Eigenvectors
Question Headline: Derive the concept of eigenvalues and eigenvectors, and elucidate the method to find the eigenvalues of a matrix.
Introduction to Eigenvalues and Eigenvectors:
Eigenvalues and eigenvectors are fundamental concepts in linear algebra, playing a crucial role in various applications such as stability analysis, quantum mechanics, and machine learning algorithms like PCA (Principal Component Analysis).
- Definition:
- Let \( A \) be a square matrix. An eigenvector \( \mathbf{v} \) of \( A \) is a non-zero vector such that when \( A \) is multiplied by \( \mathbf{v} \), the direction of \( \mathbf{v} \) remains unchanged. The scalar \( \lambda \) associated with this operation, which scales the eigenvector, is known as the eigenvalue:
\[ A\mathbf{v} = \lambda\mathbf{v} \]
where \( A \) is our matrix, \( \mathbf{v} \) is the eigenvector, and \( \lambda \) is the eigenvalue.
Rearranging the Equation:
- To find the eigenvalues, we rearrange the equation as: \( A\mathbf{v} - \lambda\mathbf{v} = \mathbf{0} \)
- Since \( \lambda\mathbf{v} \) can be written as \( \lambda I\mathbf{v} \), where \( I \) is the identity matrix of the same dimension as \( A \), the equation becomes: \( (A - \lambda I)\mathbf{v} = \mathbf{0} \)
Non-Trivial Solution:
- For a non-trivial (non-zero) solution for \( \mathbf{v} \), the determinant of \( A - \lambda I \) must be zero.
- You should know that this is because if \( A - \lambda I \) were invertible, we could multiply both sides by its inverse and the only solution would be \( \mathbf{v} = \mathbf{0} \).
- Thus, we have: \[ \det(A - \lambda I) = 0 \]
- Solving this equation, known as the characteristic equation, gives us the eigenvalues \( \lambda \).
https://chat.openai.com/share/7be6e202-894c-4c3a-a016-3766d9229d03
Finding Eigenvalues:
Characteristic Polynomial:
- Calculate the determinant of \( (A - \lambda I) \), where \( A \) is your square matrix, and \( I \) is the identity matrix of the same dimension. This will give you a polynomial in terms of \( \lambda \), known as the characteristic polynomial.
Solving the Polynomial:
- Solve the characteristic polynomial for \( \lambda \). The solutions to this polynomial equation are the eigenvalues of matrix \( A \).
Example:
- Consider a 2x2 matrix \( A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \). The characteristic equation is: \[ \det\left(\begin{bmatrix} a-\lambda & b \\ c & d-\lambda \end{bmatrix}\right) = 0 \]
- Expanding the determinant gives \( (a-\lambda)(d-\lambda) - bc = \lambda^2 - (a+d)\lambda + (ad - bc) = 0 \); solving for \( \lambda \) yields the eigenvalues.
Callout Box: New Concept - Determinant
The determinant of a square matrix \( A \), denoted as \( \det(A) \), is a scalar value that provides important information about the matrix, including whether it is invertible and its scaling factor for volumes.
↳ Follow-up Question: How can one determine the eigenvectors corresponding to the calculated eigenvalues?
By substituting each eigenvalue \( \lambda \) back into the equation \( (A - \lambda I)\mathbf{v} = \mathbf{0} \) and solving for the vector \( \mathbf{v} \), we can find the eigenvectors corresponding to each eigenvalue. This involves solving a system of linear equations.
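A small numpy sketch (arbitrary example matrix) verifying the defining equation \( A\mathbf{v} = \lambda\mathbf{v} \) numerically:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Characteristic polynomial: lambda^2 - 7*lambda + 10 = 0 -> lambda = 5, 2
eigenvalues, eigenvectors = np.linalg.eig(A)

for i, lam in enumerate(eigenvalues):
    v = eigenvectors[:, i]                     # i-th column is the eigenvector
    print(lam, np.allclose(A @ v, lam * v))    # A v = lambda v holds: True
```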
QX) What is invariance?
A concise explanation of invariance in statistics and machine learning:
Statistics:
- Invariance Principle: It relates to estimators that retain their properties even when subjected to certain transformations.
- Example: Consider the median as an estimator of central location. If you add a constant to each data point in a dataset, the median of the transformed data will be the median of the original data plus the constant. This makes the median invariant to shifts.
- The variance is location invariant (sketch the proof).
- The MLE is invariant: if \( \hat{\theta} \) is the MLE of \( \theta \), then \( g(\hat{\theta}) \) is the MLE of \( g(\theta) \).
- A function of a sufficient statistic is also sufficient.
- Formula: If \( M \) is the median of dataset \( X \), then the median of \( X + c \) (where \( c \) is a constant) is \( M + c \).
Machine Learning:
- Invariance: A model’s ability to provide consistent outputs despite certain transformations to the input.
- Example: A convolutional neural network (CNN) used for image recognition might be designed to recognize a cat whether it’s zoomed-in, rotated, or shifted, making the CNN invariant to these transformations.
- Formula: Let \( f \) be a model. If \( f(T(x)) = f(x) \) for a transformation \( T \) (like rotation or scaling), then \( f \) is invariant to \( T \).
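A quick numerical check of the invariance examples above, with made-up data and numpy assumed:

```python
import numpy as np

x = np.array([1.0, 4.0, 2.0, 8.0, 5.0])
c = 10.0

# Statistics: variance is location invariant, the median is shift-equivariant
print(np.isclose(np.var(x + c), np.var(x)))              # True
print(np.isclose(np.median(x + c), np.median(x) + c))    # True

# ML flavour: a sum-pooling "model" f is invariant to permuting its input
f = np.sum
perm = np.random.permutation(len(x))
print(np.isclose(f(x[perm]), f(x)))                      # True: f(T(x)) == f(x)
```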
What does it mean for something to be ‘proportional to’ something else?
- covariance matrices
- eigenvalues/vectors - what is the logic behind
\( \mathbf{v} \): the eigenvector (a vector). \( A \): the matrix which acts on \( \mathbf{v} \) to 'transform' it. \( \lambda \): the eigenvalue (a scalar number).
So you find \( \mathbf{v} \) and \( \lambda \) for a given matrix \( A \), such that transforming (multiplying) the vector \( \mathbf{v} \) by \( A \) does not change its direction, only its magnitude (by an amount \( \lambda \)).
↳ What determines the number of eigenvectors a matrix can have?
- A square matrix of size \( n \times n \) has at most \( n \) linearly independent eigenvectors.
Explain why square matrices are often interpreted as ‘transformations’.
Transformation
In linear algebra, a transformation refers to a function that maps vectors to other vectors. More formally, it’s a rule that assigns each vector in one vector space to a vector in the same or another vector space. In the context of matrices, especially square matrices, these transformations can be thought of as operations that alter the space in which the vectors lie.
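A minimal numpy illustration of a square matrix acting as a transformation, here a 90-degree rotation (example values assumed):

```python
import numpy as np

theta = np.pi / 2                      # 90-degree rotation
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([1.0, 0.0])
print(R @ v)             # ~[0, 1]: the x-axis vector is rotated onto the y-axis
print(np.linalg.det(R))  # 1.0: rotations preserve area and orientation
```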
PCA
Vector space
What Python library would you use for matrix manipulation
How are matrices used in PageRank
How can matrices be used to measure the similarity of two images
How are matrices used in advanced recommender systems
How is linear algebra used in a CNN?
- The kernel slide thing (do a dot product practice)
https://youtu.be/ZTt9gsGcdDo?si=f8LVG7Vliir7Dcyw
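A toy numpy sketch (made-up image and kernel) of the 'kernel slide': each output entry is the dot product of the kernel with one image patch, i.e. the cross-correlation a CNN layer computes:

```python
import numpy as np

image = np.array([[1, 2, 3, 0],
                  [0, 1, 2, 3],
                  [3, 0, 1, 2],
                  [2, 3, 0, 1]], dtype=float)
kernel = np.array([[1, 0],
                   [0, -1]], dtype=float)   # 2x2 example kernel

h, w = kernel.shape
out = np.zeros((image.shape[0] - h + 1, image.shape[1] - w + 1))
for i in range(out.shape[0]):
    for j in range(out.shape[1]):
        patch = image[i:i + h, j:j + w]
        out[i, j] = np.sum(patch * kernel)   # dot product of patch and kernel

print(out)   # 3x3 feature map
```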
- What is broadcasting in connection to Linear Algebra?
- What are scalars, vectors, matrices, and tensors?
- What is Hadamard product of two matrices?
- What is an inverse matrix?
- If inverse of a matrix exists, how to calculate it?
- What is the determinant of a square matrix? How is it calculated (Laplace expansion)? What is the connection of determinant to eigenvalues?
- Discuss span and linear dependence.
- What is Ax = b? When does Ax = b have a unique solution?
- In Ax = b, what happens when A is fat or tall?
- When does inverse of A exist?
- What is a norm? What is L1, L2 and L infinity norm?
- What are the conditions a norm has to satisfy?
- Why is squared of L2 norm preferred in ML than just L2 norm?
- When is the L1 norm preferred over the L2 norm?
- Can the number of nonzero elements in a vector be defined as L0 norm? If no, why?
- What is Frobenius norm?
- What is a diagonal matrix? (D_i,j = 0 for i != j)
- Why is multiplication by diagonal matrix computationally cheap? How is the multiplication different for square vs. non-square diagonal matrix?
- At what conditions does the inverse of a diagonal matrix exist? (square and all diagonal elements non-zero)
- What is a symmetric matrix? (same as its transpose)
- What is a unit vector?
- When are two vectors x and y orthogonal? (x.T * y = 0)
- In R^n, what is the maximum possible number of orthogonal vectors with non-zero norm?
- When are two vectors x and y orthonormal? (x.T * y = 0 and both have unit norm)
- What is an orthogonal matrix? Why is computationally preferred? (a square matrix whose rows are mutually orthonormal and columns are mutually orthonormal.)
- What is eigendecomposition, eigenvectors and eigenvalues?
- How to find eigen values of a matrix?
- Write the eigendecomposition formula for a matrix. If the matrix is real symmetric, how will this change?
- Is the eigendecomposition guaranteed to be unique? If not, then how do we represent it?
- What are positive definite, negative definite, positive semi definite and negative semi definite matrices?
- What is SVD? Why do we use it? Why not just use ED?
https://youtu.be/P5mlg91as1c?si=nLyAHZgeLP3ExNYM
- Given a matrix A, how will you calculate its SVD?
- What are singular values, left singulars and right singulars?
- What is the connection of SVD of A with functions of A?
- Why are singular values always non-negative?
- What is the Moore Penrose pseudo inverse and how to calculate it?
- If we do Moore Penrose pseudo inverse on Ax = b, what solution is provided if A is fat? Moreover, what solution is provided if A is tall?
- Which matrices can be decomposed by ED? (Any NxN square matrix with N linearly independent eigenvectors)
- Which matrices can be decomposed by SVD? (Any matrix; V is either conjugate transpose or normal transpose depending on whether A is complex or real)
- What is the trace of a matrix?
- How to write Frobenius norm of a matrix A in terms of trace?
- Why is trace of a multiplication of matrices invariant to cyclic permutations?
- What is the trace of a scalar?
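A small numpy check (random matrices, numpy assumed) for the last two trace questions: the Frobenius norm equals \( \sqrt{\operatorname{tr}(A^{\top}A)} \), and the trace of a product is invariant to cyclic permutations:

```python
import numpy as np

A = np.random.rand(3, 4)
B = np.random.rand(4, 3)

frob_direct = np.linalg.norm(A, 'fro')      # sqrt of the sum of squared entries
frob_trace  = np.sqrt(np.trace(A.T @ A))    # sqrt(tr(A^T A))
print(np.isclose(frob_direct, frob_trace))  # True

# Cyclic permutation invariance: tr(AB) == tr(BA), even though shapes differ
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))  # True
```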