- What is underflow and overflow?
- How to tackle the problem of underflow or overflow for softmax function or log softmax function?
- What is poor conditioning?
- What is the condition number?
- What are grad, div and curl?
- What are critical or stationary points in multi-dimensions?
- Why should you do gradient descent when you want to minimize a function?
- What is line search?
- What is hill climbing?
- What is a Jacobian matrix?
- What is curvature?
- What is a Hessian matrix?