Grokking Deep Learning by Andrew Trask

Chapters 1-12 of this book cover many methods, from linear models to CNNs and RNNs. I am only interested in building an autograd framework, which starts in chapter 13. Therefore, my approach is to build the framework first and then apply it to the rest of the book, instead of avoiding frameworks as chapters 1-12 do.

Chapter 2: how do machines learn?

  • Deep Learning, Machine Learning, AI
  • Parametric models and nonparametric models
  • Supervised vs unsupervised

Chapter 3: forward propagation

  • use numpy.dot to compute matmul
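
A quick illustration of that bullet, with made-up numbers (not the book's exact example):

```python
import numpy as np

inputs = np.array([8.5, 0.65, 1.2])       # one example with 3 input features
weights = np.array([0.1, 0.2, -0.1])      # one weight per feature
prediction = inputs.dot(weights)          # weighted sum == 1-D matrix multiply
print(prediction)                         # 0.86
```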

Chapter 4 + 5: gradient descent

  • MSE Loss

Chapter 6: Introduction to backpropagation

  • Full, batch and SGD
  • Linear vs NonLinear
  • ReLU

Chapter 7: how to picture neural networks

Chapter 8: regularization and batching

  • 3-layer network on MNIST
  • Overfitting
  • Early stopping
  • Dropout: randomly turn off neurons (set them to 0) during training; see the sketch after this list
  • Batch gradient descent
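
A rough sketch of the dropout bullet above: a random 0/1 mask zeroes about half the activations, and the survivors are scaled by 2 so the expected sum stays the same (the 50% rate and the x2 rescale are how I recall the chapter doing it):

```python
import numpy as np

layer_1 = np.random.rand(4, 16)                          # hidden activations for a batch (made-up shape)
dropout_mask = np.random.randint(2, size=layer_1.shape)  # keep each neuron with probability 0.5
layer_1 = layer_1 * dropout_mask * 2                     # drop half, scale the rest up by 2
```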

Chapter 9: Activations

  • Sigmoid, Tanh
  • Softmax

Chapter 10: CNN

Chapter 11: Embedding Layer

Chapter 12: RNN

  • Char-RNN

Chapter 13: Build a deep learning framework

Tensor

class Tensor(object):

  • Version 1: wrap self.data = np.array(data)
  • Version 2:
    • add creation_op and creators to init
    • add a backward(self, grad) method which will:
      • save self.grad
      • check self.creation_op to call backward on self.creators
    • in the add method, set creators = [self, other] and creation_op = "add"
  • Version 3: allow a tensor to have multiple child tensors (accumulate gradients)
    • init:
      • add self.children dictionary
      • generate random id for each tensor
      • for each creator:
        • creator.children[self.id] += 1 # increase children count
    • backward:
      • decrease children count based on the id of the children
      • accumulate gradients from several children
      • if received gradients from all children:
        • backward to creators based on each creation_op
    • add:
      • if self.autograd and other.autograd, return a Tensor with autograd=True, creators=[self, other], and creation_op="add"; otherwise return a plain Tensor
  • Version 4: add support for negation
    • neg:
      • check autograd
      • return Tensor(self.data * -1) with creation_op = "neg"
    • backward:
      • add a backward rule for creation_op "neg"
  • Version 5: add subtraction, multiplication, sum, expand, transpose, matrix multiplication
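
Putting Versions 1-4 together, here is a condensed sketch of the Tensor class with only add and neg wired up (Version 5 adds the remaining ops in the same pattern). Names follow the book where I remember them; details may differ:

```python
import numpy as np

class Tensor(object):
    def __init__(self, data, autograd=False, creators=None, creation_op=None, id=None):
        self.data = np.array(data)
        self.autograd = autograd
        self.creators = creators            # tensors this one was created from
        self.creation_op = creation_op      # op that created it ("add", "neg", ...)
        self.grad = None
        self.children = {}                  # child id -> number of grads still expected
        self.id = id if id is not None else np.random.randint(0, 1000000)
        if creators is not None:
            for c in creators:
                # register this tensor as a child of each creator
                c.children[self.id] = c.children.get(self.id, 0) + 1

    def all_children_grads_accounted_for(self):
        # true once every child has sent its gradient back
        return all(cnt == 0 for cnt in self.children.values())

    def backward(self, grad=None, grad_origin=None):
        if not self.autograd:
            return
        if grad is None:
            grad = Tensor(np.ones_like(self.data))
        if grad_origin is not None:
            # decrease the pending-gradient count for this child
            if self.children[grad_origin.id] == 0:
                raise Exception("cannot backprop more than once")
            self.children[grad_origin.id] -= 1

        # accumulate gradients arriving from several children
        self.grad = grad if self.grad is None else Tensor(self.grad.data + grad.data)

        # only keep propagating once all children have reported in
        if self.creators is not None and \
           (self.all_children_grads_accounted_for() or grad_origin is None):
            if self.creation_op == "add":
                self.creators[0].backward(self.grad, self)
                self.creators[1].backward(self.grad, self)
            if self.creation_op == "neg":
                self.creators[0].backward(Tensor(-self.grad.data), self)

    def __add__(self, other):
        if self.autograd and other.autograd:
            return Tensor(self.data + other.data, autograd=True,
                          creators=[self, other], creation_op="add")
        return Tensor(self.data + other.data)

    def __neg__(self):
        if self.autograd:
            return Tensor(self.data * -1, autograd=True,
                          creators=[self], creation_op="neg")
        return Tensor(self.data * -1)

    def __repr__(self):
        return str(self.data.__repr__())

# tiny check: a and b each feed two children, so their grads accumulate to 2
a = Tensor([1, 2, 3], autograd=True)
b = Tensor([4, 5, 6], autograd=True)
e = (a + b) + (a + b)
e.backward()
print(a.grad)   # [2 2 2]
```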

Optimizer

class SGD(object):

  • init(self, parameters, alpha=0.1): store parameters and alpha
  • zero(self):
    • for p in self.parameters:
      • p.grad.data *= 0
  • step(self, zero=True):
    • for p in self.parameters:
      • p.data -= p.grad.data * self.alpha
      • if(zero):
        • p.grad.data *= 0
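
A sketch of the optimizer, assuming the Tensor class above (parameters is just a list of Tensors):

```python
class SGD(object):
    def __init__(self, parameters, alpha=0.1):
        self.parameters = parameters    # list of Tensors to update
        self.alpha = alpha              # learning rate

    def zero(self):
        # clear accumulated gradients before the next backward pass
        for p in self.parameters:
            p.grad.data *= 0

    def step(self, zero=True):
        # plain gradient descent update, optionally zeroing grads afterwards
        for p in self.parameters:
            p.data -= p.grad.data * self.alpha
            if zero:
                p.grad.data *= 0
```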

Layer

class Layer(object):

  • init(self): self.parameters = list()
  • get_parameters(self): return self.parameters

class Linear(Layer):

  • init(self, n_inputs, n_outputs):
    • self.weight = Tensor(W, autograd=True)
    • self.bias = Tensor(np.zeros(n_outputs), autograd=True)
    • self.parameters.append(self.weight); self.parameters.append(self.bias)
  • forward(self, input):
    • return input.mm(self.weight) + self.bias.expand(0, len(input.data))
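
Roughly, assuming the Tensor class above with the mm, expand, and add ops from Version 5 (the sqrt(2/n_inputs) scale on the random weights is my assumption for the init; any small random values work):

```python
import numpy as np

class Layer(object):
    def __init__(self):
        self.parameters = list()

    def get_parameters(self):
        return self.parameters

class Linear(Layer):
    def __init__(self, n_inputs, n_outputs):
        super().__init__()
        # small random weights, zero bias
        W = np.random.randn(n_inputs, n_outputs) * np.sqrt(2.0 / n_inputs)
        self.weight = Tensor(W, autograd=True)
        self.bias = Tensor(np.zeros(n_outputs), autograd=True)
        self.parameters.append(self.weight)
        self.parameters.append(self.bias)

    def forward(self, input):
        # y = xW + b, with the bias expanded across the batch dimension
        return input.mm(self.weight) + self.bias.expand(0, len(input.data))
```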

class Sequential(Layer)
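
Sequential only has to chain forward calls and gather parameters; a sketch, assuming the Layer base class above:

```python
class Sequential(Layer):
    def __init__(self, layers=None):
        super().__init__()
        self.layers = layers if layers is not None else list()

    def add(self, layer):
        self.layers.append(layer)

    def forward(self, input):
        # feed each layer's output into the next layer
        for layer in self.layers:
            input = layer.forward(input)
        return input

    def get_parameters(self):
        # collect parameters from every contained layer
        params = list()
        for layer in self.layers:
            params += layer.get_parameters()
        return params
```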

class MSELoss(Layer)
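
MSELoss is written entirely with Tensor ops (sub, mul, sum), so the gradient falls out of autograd. Note it is really a sum of squared errors rather than a mean, if I recall the book correctly:

```python
class MSELoss(Layer):
    def __init__(self):
        super().__init__()

    def forward(self, pred, target):
        # squared error reduced with the autograd-aware sum op
        return ((pred - target) * (pred - target)).sum(0)
```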

class Tanh(Layer), Sigmoid(Layer):

  • add sigmoid and tanh operations to Tensor
  • the Tanh and Sigmoid layers call input.tanh() or input.sigmoid() in forward
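
Sketch of both halves: the new Tensor ops with their backward rules (using d tanh = 1 - tanh² and d sigmoid = sigmoid * (1 - sigmoid)) and the thin Layer wrappers. The two methods below are meant to be pasted into the Tensor class, and the backward rules lean on the sub and mul ops from Version 5:

```python
import numpy as np

# add to class Tensor
def sigmoid(self):
    if self.autograd:
        return Tensor(1 / (1 + np.exp(-self.data)), autograd=True,
                      creators=[self], creation_op="sigmoid")
    return Tensor(1 / (1 + np.exp(-self.data)))

def tanh(self):
    if self.autograd:
        return Tensor(np.tanh(self.data), autograd=True,
                      creators=[self], creation_op="tanh")
    return Tensor(np.tanh(self.data))

# and in Tensor.backward():
#   if self.creation_op == "sigmoid":
#       ones = Tensor(np.ones_like(self.grad.data))
#       self.creators[0].backward(self.grad * (self * (ones - self)), self)
#   if self.creation_op == "tanh":
#       ones = Tensor(np.ones_like(self.grad.data))
#       self.creators[0].backward(self.grad * (ones - (self * self)), self)

class Tanh(Layer):
    def __init__(self):
        super().__init__()

    def forward(self, input):
        return input.tanh()

class Sigmoid(Layer):
    def __init__(self):
        super().__init__()

    def forward(self, input):
        return input.sigmoid()
```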

class Embedding(Layer)

  • needs an index_select operation on Tensor
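
One way to fill this in: the layer owns a (vocab_size, dim) weight matrix and forward does an index_select on it; backward scatter-adds the gradient rows back into the right places. A sketch (the 1/dim init scale is how I recall the book's word2vec-style initialization):

```python
import numpy as np

class Embedding(Layer):
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.vocab_size = vocab_size
        self.dim = dim
        # one dim-sized row per vocabulary entry
        weight = (np.random.rand(vocab_size, dim) - 0.5) / dim
        self.weight = Tensor(weight, autograd=True)
        self.parameters.append(self.weight)

    def forward(self, input):
        # input holds integer word indices; pick out the matching rows
        return self.weight.index_select(input)

# add to class Tensor
def index_select(self, indices):
    if self.autograd:
        new = Tensor(self.data[indices.data], autograd=True,
                     creators=[self], creation_op="index_select")
        new.index_select_indices = indices   # remembered for backward
        return new
    return Tensor(self.data[indices.data])

# and in Tensor.backward():
#   if self.creation_op == "index_select":
#       new_grad = np.zeros_like(self.creators[0].data)
#       indices_ = self.index_select_indices.data.flatten()
#       grad_ = self.grad.data.reshape(len(indices_), -1)
#       for i in range(len(indices_)):
#           new_grad[indices_[i]] += grad_[i]   # scatter-add rows back
#       self.creators[0].backward(Tensor(new_grad), self)
```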

class CrossEntropyLoss(Layer):

  • add cross_entropy operation to Tensor
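
The Tensor op fuses softmax with negative log-likelihood, so the backward rule is just the softmax output minus the one-hot targets; the layer itself only forwards to it. A sketch:

```python
import numpy as np

# add to class Tensor
def cross_entropy(self, target_indices):
    # softmax over the last axis
    temp = np.exp(self.data)
    softmax_output = temp / np.sum(temp, axis=len(self.data.shape) - 1, keepdims=True)

    t = target_indices.data.flatten()
    p = softmax_output.reshape(len(t), -1)
    target_dist = np.eye(p.shape[1])[t]              # one-hot targets
    loss = -(np.log(p) * target_dist).sum(1).mean()

    if self.autograd:
        out = Tensor(loss, autograd=True, creators=[self], creation_op="cross_entropy")
        out.softmax_output = softmax_output          # cached for backward
        out.target_dist = target_dist
        return out
    return Tensor(loss)

# and in Tensor.backward():
#   if self.creation_op == "cross_entropy":
#       dx = self.softmax_output - self.target_dist
#       self.creators[0].backward(Tensor(dx), self)

class CrossEntropyLoss(Layer):
    def __init__(self):
        super().__init__()

    def forward(self, input, target):
        return input.cross_entropy(target)
```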

class RNNCell(Layer)
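
A sketch of the cell, built entirely from the classes above: one Linear mixes the input, another mixes the previous hidden state, the activation produces the new hidden state, and a third Linear projects it to the output:

```python
import numpy as np

class RNNCell(Layer):
    def __init__(self, n_inputs, n_hidden, n_output, activation="sigmoid"):
        super().__init__()
        self.n_inputs = n_inputs
        self.n_hidden = n_hidden
        self.n_output = n_output

        self.activation = Sigmoid() if activation == "sigmoid" else Tanh()
        self.w_ih = Linear(n_inputs, n_hidden)    # input  -> hidden
        self.w_hh = Linear(n_hidden, n_hidden)    # hidden -> hidden
        self.w_ho = Linear(n_hidden, n_output)    # hidden -> output

        self.parameters += self.w_ih.get_parameters()
        self.parameters += self.w_hh.get_parameters()
        self.parameters += self.w_ho.get_parameters()

    def forward(self, input, hidden):
        combined = self.w_ih.forward(input) + self.w_hh.forward(hidden)
        new_hidden = self.activation.forward(combined)
        output = self.w_ho.forward(new_hidden)
        return output, new_hidden

    def init_hidden(self, batch_size=1):
        # zero hidden state to start a sequence
        return Tensor(np.zeros((batch_size, self.n_hidden)), autograd=True)
```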

Chapter 14: LSTM

Chapter 15: Federated Learning
