Chapter 4 - Deep Neural Networks
Intro

  • Deep networks : networks with more than one hidden layer
  • With ReLU activations, both deep and shallow networks describe piecewise linear mappings from input to output
  • A shallow network can describe arbitrarily complex functions, but might need impractically many hidden units to do so
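
As a concrete illustration of a piecewise linear mapping, here is a minimal sketch of a shallow ReLU network with scalar input and output (the parameter values are hypothetical, chosen only to make the linear regions visible):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def shallow_net(x, theta):
    """Shallow network with one hidden layer of ReLU units:
    y = phi0 + sum_d phi_d * relu(theta_d0 + theta_d1 * x)."""
    (theta0, theta1), (phi0, phi) = theta
    h = relu(theta0 + theta1 * x)  # hidden unit activations
    return phi0 + phi @ h

# Hypothetical parameters: 3 hidden units -> at most 4 linear regions in 1D,
# one "joint" per hidden unit where its ReLU switches on or off.
theta = ((np.array([0.0, -1.0, 1.0]), np.array([1.0, 2.0, -1.0])),
         (0.5, np.array([1.0, -0.5, 2.0])))
```

Within any region where no ReLU changes state, the output is an exact linear function of the input; the joints between regions are where individual hidden units activate or deactivate.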

Why did deep learning take off?

  • Krizhevsky et al. (2012), i.e., the AlexNet result on ImageNet
  • Larger datasets
  • Improved processing power (GPUs) for training
  • The ReLU activation function
  • Stochastic gradient descent (SGD)

Composing networks

General deep neural networks

  • Consider a deep network with 2 hidden layers, each of which has 3 hidden units
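
A sketch of this two-hidden-layer network with three units per layer, written unit by unit for scalar input and output (all parameter values here are made up for illustration):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Hypothetical parameters: scalar input x, scalar output y.
theta0 = np.array([0.0, 0.0, 0.0])   # first-layer biases
theta1 = np.array([1.0, -1.0, 2.0])  # first-layer weights on x
psi0 = np.array([0.0, 1.0, -1.0])    # second-layer biases
Psi = np.eye(3)                      # second-layer weights (3x3)
phi0 = 0.0                           # output bias
phi = np.array([1.0, 1.0, 1.0])      # output weights

def deep_net(x):
    h1 = relu(theta0 + theta1 * x)   # first hidden layer (3 units)
    h2 = relu(psi0 + Psi @ h1)       # second hidden layer (3 units)
    return phi0 + phi @ h2           # scalar output
```

The second layer applies further ReLU "folds" on top of the piecewise linear function produced by the first, which is how composition lets depth multiply the number of linear regions.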

How does the network deal with complicated functions?

Hyperparameters

  • Capacity : the total number of hidden units in the network
  • Since hyperparameters select an architecture and parameters select a particular function within it, NNs represent a family of families of functions relating input to output

Width

  • Number of hidden units in each layer

Depth

  • Number of hidden layers

Matrix notation to represent deep networks
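
In matrix notation the whole forward pass becomes a single loop over layers. A minimal sketch, using generic bias vectors b_k and weight matrices W_k (these symbol names are my own, not fixed notation):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, layers):
    """Forward pass for a deep network given as [(b_0, W_0), ..., (b_K, W_K)]:
    h_1 = relu(b_0 + W_0 @ x), h_{k+1} = relu(b_k + W_k @ h_k),
    and y = b_K + W_K @ h_K (no ReLU on the output layer)."""
    h = x
    for b, W in layers[:-1]:
        h = relu(b + W @ h)
    b_K, W_K = layers[-1]
    return b_K + W_K @ h

# Hypothetical two-hidden-layer network mapping R^2 -> R^1;
# each layer is fully determined by one bias vector and one weight matrix.
layers = [
    (np.zeros(3), np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])),  # 2 -> 3
    (np.zeros(3), np.eye(3)),                                       # 3 -> 3
    (np.array([0.5]), np.array([[1.0, 1.0, 1.0]])),                 # 3 -> 1
]
```

The matrix shapes encode the architecture directly: W_k maps the width of layer k to the width of layer k+1, so width and depth can be read off from the list of matrices.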

Shallow vs deep networks
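
One way to compare the two regimes is parameter count versus expressiveness: for a 1D input, a shallow ReLU network with D hidden units produces at most D + 1 linear regions, while a deep network composes its folds, so the maximum region count can grow multiplicatively with depth. A small counting sketch for fully connected layers (weights plus biases; the specific widths below are arbitrary):

```python
def num_params(d_in, widths, d_out):
    """Count weights + biases for a fully connected network
    with the given list of hidden-layer widths."""
    dims = [d_in, *widths, d_out]
    return sum((a + 1) * b for a, b in zip(dims[:-1], dims[1:]))

# Scalar input/output; both networks have six hidden units in total.
shallow = num_params(1, [6], 1)    # one hidden layer of 6 units
deep = num_params(1, [3, 3], 1)    # two hidden layers of 3 units
```

Here the shallow network has 19 parameters and at most 7 linear regions, while the deep network has 22 parameters but can create up to (3 + 1)^2 = 16 regions, illustrating why depth buys more regions per parameter than width.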

Other notes