## Learning Objectives

• Greek Symbols (reference material)

  • For each Greek letter (all lowercase forms, plus uppercase Gamma, Delta, Theta, Xi, Pi, Sigma, Upsilon, Phi, Psi, and Omega):
    • I can provide the name (and case) when given the symbol
    • I can provide the symbol when given the name (and case)
• NumPy (reference material)

  • I can convert a Python list to a NumPy array
  • I can use broadcasting to operate on a NumPy array with a NumPy array of fewer dimensions
  • I can do pointwise addition or multiplication of NumPy arrays
  • I understand NumPy array shapes:
    • I can change the shape of an array
    • I can explain the difference between an array of shape (5,), an array of shape (5, 1), and an array of shape (1, 5)
  • I can stack two NumPy arrays
  • I can use indexing and slicing to extract parts of an array
  • I can create a NumPy array with random elements
  • I can create a NumPy array with a specified datatype
  • I can use NumPy for matrix multiplication
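A minimal sketch covering the NumPy skills above (array creation, shapes, broadcasting, stacking, slicing, random arrays, and matrix multiplication); the variable names are illustrative:

```python
import numpy as np

# Convert a Python list to a NumPy array with a specified dtype
a = np.array([1, 2, 3, 4, 5], dtype=np.float64)   # shape (5,)
col = a.reshape(5, 1)                             # shape (5, 1): a column
row = a.reshape(1, 5)                             # shape (1, 5): a row

# Broadcasting: arrays of fewer dimensions are stretched to match
b = a + 10            # scalar broadcast: pointwise addition
c = a * a             # pointwise multiplication
outer = col * row     # (5, 1) * (1, 5) broadcasts to (5, 5)

# Stacking and slicing
stacked = np.stack([a, b])    # shape (2, 5)
first_three = a[:3]           # slice: the first three elements

# Random arrays and matrix multiplication
rng = np.random.default_rng(0)
m = rng.random((5, 3))        # random floats in [0, 1), shape (5, 3)
prod = row @ m                # (1, 5) @ (5, 3) -> (1, 3)
```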
• Gradient Descent (Day 2, Day 3; fast.ai: 149-163)

  • I can explain each of the pieces of the gradient descent loop:
    • Theta
    • x
    • y
    • f
    • y-hat
    • Loss function
    • Optimizer
  • I can label a gradient descent loop diagram with each of the pieces
  • I can run one iteration of the gradient descent algorithm by hand (given f(x) and the gradient)
  • I can explain the difference between full-batch gradient descent, stochastic gradient descent, and minibatch gradient descent, and can explain the pros and cons of each
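One hand-computable iteration of the gradient descent loop, with each of the pieces above labeled, sketched for the illustrative one-parameter model f(x) = theta * x with MSE loss (the model and numbers are mine, not from the course materials):

```python
# Model: f(x) = theta * x; data: one example (x, y); loss: MSE
theta = 2.0          # theta: the parameter to learn
x, y = 3.0, 9.0      # x: input, y: target (true relationship is y = 3x)
lr = 0.01            # learning rate used by the optimizer

y_hat = theta * x                    # y-hat: the prediction f(x)
loss = (y_hat - y) ** 2              # loss function: squared error
grad = 2 * (y_hat - y) * x           # d(loss)/d(theta) by the chain rule
theta = theta - lr * grad            # optimizer step: plain SGD

# By hand: y_hat = 6, grad = 2 * (6 - 9) * 3 = -18, theta = 2.0 + 0.18 = 2.18
```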
• General ML (Day 2, Day 8; Aggarwal: 1.4.1; fast.ai: 28-30)

  • I can identify why overfitting occurs, how it can be identified, and the ways in which it can be fixed
  • I can identify why underfitting occurs, how it can be identified, and the ways in which it can be fixed
  • I can explain the use of training, validation, and test datasets
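The three-way dataset split can be sketched as a shuffled partition (the 70/15/15 proportions and names are illustrative): the training set fits the parameters, the validation set guides tuning and reveals overfitting, and the test set is held out for a final, one-time evaluation.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 100                         # total number of examples
idx = rng.permutation(n)        # shuffle before splitting
train_idx = idx[:70]            # fit model parameters here
val_idx = idx[70:85]            # tune hyperparameters; watch for overfitting
test_idx = idx[85:]             # touch only once, for the final evaluation
```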
• Optimizers (Day 8; Aggarwal 3.5.1-3.5.3; fast.ai: 473-480)

  • I can explain the use of and give code for the following optimizers:
    • Plain SGD
    • SGD with weight decay
    • SGD with momentum
    • SGD with Nesterov momentum
    • RMSProp
  • I can explain which optimizers use learning rates and how learning rates are chosen
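Hedged NumPy sketches of the optimizer update rules listed above; the function signatures and variable names are mine, and the exact conventions (sign of the velocity, where the learning rate enters) vary between sources, so check Aggarwal 3.5.1-3.5.3 for the forms used in class:

```python
import numpy as np

def sgd(theta, grad, lr):
    # Plain SGD: step against the gradient
    return theta - lr * grad

def sgd_weight_decay(theta, grad, lr, wd):
    # Weight decay adds wd * theta to the gradient (an L2 penalty)
    return theta - lr * (grad + wd * theta)

def sgd_momentum(theta, grad, v, lr, beta):
    # Velocity accumulates a decaying sum of past gradient steps
    v = beta * v - lr * grad
    return theta + v, v

def sgd_nesterov(theta, grad_fn, v, lr, beta):
    # Nesterov momentum evaluates the gradient at the look-ahead point
    v = beta * v - lr * grad_fn(theta + beta * v)
    return theta + v, v

def rmsprop(theta, grad, s, lr, rho, eps=1e-8):
    # Per-parameter step sizes from a running average of squared gradients
    s = rho * s + (1 - rho) * grad ** 2
    return theta - lr * grad / (np.sqrt(s) + eps), s
```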
• Loss functions (fast.ai: 194-203, 226-237)

  • I can provide code for the following loss functions and describe when each would be used:
    • Cross Entropy
    • Mean Squared Error (MSE)
    • Binary Cross Entropy
    • Negative Log Likelihood (NLL)
    • L1 Error
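A hedged NumPy sketch of the five loss functions above (function names are mine; library versions add reductions and numerical-stability options). Roughly: MSE and L1 for regression, binary cross entropy for binary or multi-label classification on probabilities, NLL for log-probabilities (e.g. LogSoftmax output), and cross entropy for multi-class classification on raw logits:

```python
import numpy as np

def mse(y_hat, y):
    return np.mean((y_hat - y) ** 2)

def l1_error(y_hat, y):
    return np.mean(np.abs(y_hat - y))

def binary_cross_entropy(p, y):
    # p: predicted probability of the positive class, y in {0, 1}
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def nll(log_probs, targets):
    # log_probs: (N, C) log-probabilities; targets: (N,) class indices
    return -np.mean(log_probs[np.arange(len(targets)), targets])

def cross_entropy(logits, targets):
    # LogSoftmax + NLL in one step, with a max shift for numerical stability
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return nll(log_probs, targets)
```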
• Activation functions (Aggarwal 1.2.1.3)

  • I can provide the equation for each of the following activation functions, along with the equation for its derivative, and can identify which should be used in a given situation:
    • ReLU
    • Leaky ReLU
    • Tanh
    • Softmax
    • LogSoftmax
    • Sigmoid
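A NumPy sketch of the activations above with their derivatives where they apply elementwise (softmax and log-softmax have Jacobians rather than elementwise derivatives, so only the forward functions are shown for those); names and the leaky slope default are mine:

```python
import numpy as np

def relu(x):       return np.maximum(0, x)
def relu_grad(x):  return (x > 0).astype(float)

def leaky_relu(x, a=0.01):      return np.where(x > 0, x, a * x)
def leaky_relu_grad(x, a=0.01): return np.where(x > 0, 1.0, a)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))
def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1 - s)            # derivative in terms of sigmoid itself

def tanh_grad(x):
    return 1 - np.tanh(x) ** 2    # d/dx tanh(x) = 1 - tanh(x)^2

def softmax(x):
    z = np.exp(x - x.max())       # shift by the max for numerical stability
    return z / z.sum()

def log_softmax(x):
    z = x - x.max()
    return z - np.log(np.exp(z).sum())
```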
• Neural Networks (Day 10, 11; Aggarwal 1.2.1.3-1.2.3, 1.3, 1.4.2, 3.2, 3.4)

  • Given a neural network and its input, I can calculate the output
  • Given a neural network and its input, I can calculate the partial derivative of the loss with respect to any given parameter
  • I can show how exploding gradients can occur and describe methods to address them
  • I can show how vanishing gradients can occur and describe methods to address them
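A worked, hand-checkable example of the first two objectives: a forward pass through a one-hidden-layer ReLU network followed by the partial derivatives of an MSE loss with respect to each weight matrix (the network and numbers are mine, chosen so every step is exact):

```python
import numpy as np

# Network: h = relu(W1 @ x), y_hat = w2 @ h, loss = (y_hat - y)^2
x = np.array([1.0, 2.0])
W1 = np.array([[0.5, -0.5],
               [1.0,  0.5]])
w2 = np.array([1.0, -1.0])
y = 1.0

# Forward pass
z = W1 @ x                 # pre-activations: [-0.5, 2.0]
h = np.maximum(0, z)       # ReLU: [0.0, 2.0]
y_hat = w2 @ h             # output: -2.0
loss = (y_hat - y) ** 2    # 9.0

# Backward pass: partial derivatives via the chain rule
dL_dyhat = 2 * (y_hat - y)            # -6.0
dL_dw2 = dL_dyhat * h                 # [0.0, -12.0]
dL_dh = dL_dyhat * w2                 # [-6.0, 6.0]
dL_dz = dL_dh * (z > 0)               # ReLU gate zeroes dead units: [0.0, 6.0]
dL_dW1 = np.outer(dL_dz, x)           # [[0, 0], [6, 12]]
```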
• Transfer Learning (Day 16; Aggarwal 8.4.7; fast.ai: 30-33, 207-212)

  • Given a pretrained model, I can explain how to repurpose it for a new task by removing the old head and adding a new head
  • I can describe the process of finetuning
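The head-swap idea can be sketched framework-free (in practice you would do this in PyTorch or fastai). Everything here is a hypothetical stand-in: the "pretrained body" is just a fixed random feature extractor, and only the new head's weights are updated during finetuning:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "pretrained body": a frozen feature extractor (never updated)
W_body = rng.standard_normal((4, 8))           # pretend these were pretrained
def features(x):
    return np.maximum(0, x @ W_body.T)         # frozen ReLU features

# New head for the new task: a fresh linear layer replacing the old head
w_head = np.zeros(4)

# Finetune only the head with SGD on (synthetic) new-task data
X = rng.standard_normal((32, 8))
y = rng.standard_normal(32)
lr = 0.01
for _ in range(100):
    F = features(X)                            # body is used but not updated
    y_hat = F @ w_head
    grad = 2 * F.T @ (y_hat - y) / len(y)      # MSE gradient w.r.t. the head
    w_head -= lr * grad
```

A fuller finetuning recipe (as in fast.ai) would train the new head first, then unfreeze the body and train everything at a lower learning rate.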
• Regularization (Day 6, 15; Aggarwal 1.4.1.1, 3.6, 4.4, 4.5.1.2, 4.5.4-4.5.5, 4.6)

  • I can identify the use of and implement standard regularization techniques:
    • Smaller batch sizes
    • Batch normalization
    • Dropout
    • Weight decay
    • Data augmentation
    • Early stopping
    • Ensembles
    • Larger learning rates
    • Label smoothing
    • Mixup
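Three of the techniques above are small enough to sketch directly in NumPy (names and defaults are mine): inverted dropout, label smoothing, and mixup. The others are either training-loop settings (batch size, learning rate, early stopping, weight decay, ensembles) or layer/pipeline components (batch norm, data augmentation):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, p, training=True):
    # Inverted dropout: zero activations with prob p, rescale the survivors
    # so the expected activation is unchanged; do nothing at evaluation time
    if not training:
        return h
    mask = rng.random(h.shape) >= p
    return h * mask / (1 - p)

def smooth_labels(one_hot, eps=0.1):
    # Label smoothing: move eps of the probability mass off the true class
    n_classes = one_hot.shape[-1]
    return one_hot * (1 - eps) + eps / n_classes

def mixup(x1, y1, x2, y2, alpha=0.4):
    # Mixup: train on a random convex combination of two examples
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2
```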
• CNNs (Day 14-17; Aggarwal 3.5.5, 8.1-8.2.6, 8.4; fast.ai: Chapters 13-14)

  • I can explain how residual networks work and how they can address the vanishing gradient problem
  • I can compute the output of a convolutional kernel
  • I can use stride and padding to control the output size of a convolutional layer
  • I can compute the number of weights in a convolutional layer
  • I can compute the output of a max, average, or adaptive pooling layer
  • I can describe the architectures of ResNet and Inception
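A single-channel sketch of a convolutional layer's output, the stride/padding size formula, and the weight count (function names are mine; as in most DL libraries, this is technically cross-correlation, since the kernel is not flipped):

```python
import numpy as np

def conv2d(image, kernel, stride=1, padding=0):
    # Slide the kernel over the (zero-padded) image, one dot product per step
    if padding:
        image = np.pad(image, padding)
    kh, kw = kernel.shape
    ih, iw = image.shape
    oh = (ih - kh) // stride + 1      # with padding: (n - k + 2p)//s + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = (patch * kernel).sum()
    return out

def conv_weight_count(in_ch, out_ch, kh, kw, bias=True):
    # Each output channel has one (in_ch x kh x kw) kernel plus one bias
    return out_ch * in_ch * kh * kw + (out_ch if bias else 0)
```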
• RNNs (Day 21-25; Aggarwal 7.2.1-7.2.4, 7.5-7.6; fast.ai: Chapter 12)

  • I can explain how RNNs work
  • I can compute the output of an LSTM or GRU
  • I can give the equations for an LSTM or GRU
  • TBD
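A minimal NumPy sketch of a single vanilla RNN step and a single GRU step (the LSTM adds a separate cell state and a third gate; see Aggarwal 7.5-7.6 for the full equations). Weight shapes and names are illustrative:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # Vanilla RNN: the new hidden state mixes the input and the old state
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gru_step(x_t, h_prev, Wz, Wr, Wh):
    # GRU: update gate z, reset gate r, candidate state h_tilde
    xh = np.concatenate([x_t, h_prev])
    z = sigmoid(Wz @ xh)                  # how much to update the state
    r = sigmoid(Wr @ xh)                  # how much old state the candidate sees
    h_tilde = np.tanh(Wh @ np.concatenate([x_t, r * h_prev]))
    return (1 - z) * h_prev + z * h_tilde
```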
• Transformers (Days 26-27)

  • TBD