Perplexity

  • Perplexity is defined as the exponentiated average negative log-likelihood of a sequence.
  • If we have a tokenized sequence $X = (x_0, x_1, \dots, x_t)$, then the perplexity of $X$ is $\mathrm{PPL}(X) = \exp\left\{ -\frac{1}{t} \sum_{i=1}^{t} \log p_\theta(x_i \mid x_{<i}) \right\}$, where $\log p_\theta(x_i \mid x_{<i})$ is the log-likelihood of the $i$th token conditioned on the preceding tokens according to our model.
  • Intuitively, it can be thought of as an evaluation of the model’s ability to predict among the set of specified tokens in a corpus: a lower perplexity means the model assigns higher probability to the observed sequence.
  • Importantly, this means that the tokenization procedure has a direct impact on a model’s perplexity, which should always be taken into consideration when comparing different models.
  • This is also equivalent to the exponentiation of the cross-entropy between the data and the model predictions.
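The definition above can be sketched in a few lines of Python. This is a minimal illustration, not tied to any particular model or library: the hypothetical `perplexity` function assumes you have already obtained the per-token log-likelihoods $\log p_\theta(x_i \mid x_{<i})$ from some model.

```python
import math

def perplexity(token_log_likelihoods):
    """Exponentiated average negative log-likelihood of a sequence.

    token_log_likelihoods: list of log p(x_i | x_<i) values, one per token,
    produced by whatever model is being evaluated.
    """
    t = len(token_log_likelihoods)
    avg_nll = -sum(token_log_likelihoods) / t  # average negative log-likelihood
    return math.exp(avg_nll)                   # exponentiate the cross-entropy

# Sanity check: a model that assigns probability 1/4 to every observed
# token has perplexity exactly 4, regardless of sequence length.
logps = [math.log(0.25)] * 10
print(perplexity(logps))  # → 4.0 (up to floating-point error)
```

Note that the result depends on how the text was split into tokens: the same corpus tokenized two different ways yields different per-token log-likelihoods, which is why perplexities are only comparable between models that share a tokenization.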