Width Efficiency of Neural Networks

 
  • there exist classes of wide, shallow networks that no narrow network can express unless its depth is at least polynomial
  • this polynomial lower bound for width efficiency is weaker than the exponential lower bound known for depth efficiency, which suggests that depth may matter more than width for expressive power
  • for networks with ReLU activation, the price of making the width small is only a linear increase in network depth
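
The last point can be made concrete with a small sketch. This is an illustrative construction, not necessarily the one used in the source results: it rewrites a one-hidden-layer ReLU network with n hidden units and d inputs as a network of width d + 3 with n + 1 ReLU layers, so depth grows linearly in n. The function names (wide_shallow, narrow_deep), the positive/negative accumulator trick, and the assumption that all inputs are nonnegative (for example, x in [0, 1]^d) are choices made here for illustration.

```python
import random

def relu(v):
    return [max(0.0, t) for t in v]

def affine(A, x, bias):
    # Computes A @ x + bias with plain lists.
    return [sum(a * t for a, t in zip(row, x)) + bi for row, bi in zip(A, bias)]

def wide_shallow(x, W, b, c):
    """One hidden layer with n ReLU units: sum_k c[k] * relu(W[k] . x + b[k])."""
    h = relu(affine(W, x, b))
    return sum(ck * hk for ck, hk in zip(c, h))

def narrow_deep(x, W, b, c):
    """Same function using width d + 3 and n + 1 ReLU layers.

    Layer state: [copy of x (d values), current unit, positive sum, negative sum].
    Assumes x >= 0 componentwise, so ReLU leaves the carried copy unchanged.
    """
    d, n = len(x), len(W)
    state = list(x) + [0.0, 0.0, 0.0]
    for k in range(n + 1):
        A = [[0.0] * (d + 3) for _ in range(d + 3)]
        bias = [0.0] * (d + 3)
        for i in range(d):
            A[i][i] = 1.0                      # carry the input forward
        if k < n:
            A[d][:d] = W[k]                    # compute hidden unit k
            bias[d] = b[k]
        A[d + 1][d + 1] = 1.0                  # keep both running sums
        A[d + 2][d + 2] = 1.0
        if k > 0:
            # Add the previous unit's output, split by the sign of its weight
            # so both accumulators stay nonnegative and survive the ReLU.
            A[d + 1][d] = max(c[k - 1], 0.0)
            A[d + 2][d] = max(-c[k - 1], 0.0)
        state = relu(affine(A, state, bias))
    return state[d + 1] - state[d + 2]         # linear readout

if __name__ == "__main__":
    random.seed(0)
    d, n = 3, 7
    W = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
    b = [random.uniform(-1, 1) for _ in range(n)]
    c = [random.uniform(-1, 1) for _ in range(n)]
    x = [random.random() for _ in range(d)]    # nonnegative inputs
    print(wide_shallow(x, W, b, c), narrow_deep(x, W, b, c))
```

The two functions should agree up to floating-point rounding on nonnegative inputs. The narrow network processes one hidden unit per layer, which is where the linear depth cost comes from.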