Neural Probabilistic Language Model

  • A Neural Probabilistic Language Model (Bengio et al., 2003)
  • Distributed representations give more compact and smoother models that can accommodate far more conditioning variables than count-based approaches
  • Learning the joint probability function of sequences of words in a language is intrinsically difficult because of the curse of dimensionality: a test sequence is likely to differ from every sequence seen in training
  • Learning a distributed representation for words allows each training sentence to inform the model about an exponential (combinatorial) number of semantically neighboring sentences
  • The model learns simultaneously (i) a distributed representation for each word and (ii) the probability function for word sequences, expressed in terms of these representations
  • Generalization is obtained because a sequence of words that has never been seen before gets high probability if it is made of words that are similar (i.e., have nearby representations) to words in an already-seen sequence
  • Significantly improves on state-of-the-art n-gram models
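The architecture described in the notes above can be sketched in a few lines of numpy: a shared feature matrix maps each context word to its distributed representation, the concatenated context vector feeds a tanh hidden layer, and a softmax over the vocabulary gives the next-word probabilities. This is a minimal forward-pass sketch, not the paper's full training setup; all sizes, variable names (`C`, `H`, `U`, etc.), and the tiny vocabulary are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not the paper's actual hyperparameters).
V = 10   # vocabulary size
m = 4    # dimension of each word's distributed representation
n = 3    # number of previous words used as context
h = 8    # hidden units

# Shared feature matrix C: row i is word i's distributed representation,
# learned jointly with the rest of the network in the original model.
C = rng.normal(scale=0.1, size=(V, m))
H = rng.normal(scale=0.1, size=(h, n * m))  # hidden-layer weights
d = np.zeros(h)                             # hidden-layer bias
U = rng.normal(scale=0.1, size=(V, h))      # output weights
b = np.zeros(V)                             # output bias

def next_word_probs(context):
    """P(w_t | previous n words) for every word w_t in the vocabulary."""
    x = C[context].reshape(-1)          # concatenate context embeddings
    a = np.tanh(H @ x + d)              # hidden representation
    logits = U @ a + b
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()

p = next_word_probs([1, 5, 7])  # a valid probability distribution over V words
```

Because similar words end up with nearby rows in `C`, contexts built from similar words produce similar inputs `x`, and hence similar predictions: this is the generalization mechanism the notes describe.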