Neural Probabilistic Language Model

  • A Neural Probabilistic Language Model (Bengio et al., 2003)
  • Distributed representations give more compact and smoother models that can accommodate far more conditioning variables than count-based approaches
  • Learning the joint probability function of sequences of words in a language is intrinsically difficult because of the curse of dimensionality: a test sequence is likely to differ from every sequence seen in training
  • Learning a distributed representation for words allows each training sentence to inform the model about an exponential (combinatorial) number of semantically neighboring sentences
  • The model learns simultaneously (i) a distributed representation for each word and (ii) the probability function for word sequences, expressed in terms of these representations
  • Generalization is obtained because a sequence of words that has never been seen before gets high probability if it is made of words that are similar (i.e., have nearby representations) to words in an already-seen sequence
  • Significantly improves on state-of-the-art n-gram models
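The architecture described in the notes above can be sketched in a few lines of numpy: a shared feature matrix maps each context word to its distributed representation, the concatenated context vector feeds a tanh hidden layer, and a softmax over the vocabulary gives the next-word probabilities. This is a minimal forward-pass sketch, not the paper's full training setup; all sizes, variable names (`C`, `H`, `U`, etc.), and the tiny vocabulary are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not the paper's actual hyperparameters).
V = 10   # vocabulary size
m = 4    # dimension of each word's distributed representation
n = 3    # number of previous words used as context
h = 8    # hidden units

# Shared feature matrix C: row i is word i's distributed representation,
# learned jointly with the rest of the network in the original model.
C = rng.normal(scale=0.1, size=(V, m))
H = rng.normal(scale=0.1, size=(h, n * m))  # hidden-layer weights
d = np.zeros(h)                             # hidden-layer bias
U = rng.normal(scale=0.1, size=(V, h))      # output weights
b = np.zeros(V)                             # output bias

def next_word_probs(context):
    """P(w_t | previous n words) for every word w_t in the vocabulary."""
    x = C[context].reshape(-1)          # concatenate context embeddings
    a = np.tanh(H @ x + d)              # hidden representation
    logits = U @ a + b
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()

p = next_word_probs([1, 5, 7])  # a valid probability distribution over V words
```

Because similar words end up with nearby rows in `C`, contexts built from similar words produce similar inputs `x`, and hence similar predictions: this is the generalization mechanism the notes describe.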