Log Likelihood Loss

  • k is the size of the context window, i.e. the number of past tokens each prediction is conditioned on
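
To make the role of k concrete, here is a minimal sketch in Python. It assumes the objective has the usual autoregressive form, the sum over positions i of log P(u_i | u_{i-k}, ..., u_{i-1}), which these notes do not spell out. The names `windowed_log_likelihood`, `cond_prob`, and `uniform_model` are hypothetical and chosen only for illustration. Truncating the context for the first k positions is also an assumption.

```python
import math
from typing import Callable, Sequence, Tuple

# Hypothetical interface: cond_prob(context, token) returns P(token | context).
CondProb = Callable[[Tuple[str, ...], str], float]


def windowed_log_likelihood(tokens: Sequence[str], k: int, cond_prob: CondProb) -> float:
    """Sum of log P(u_i | u_{i-k}, ..., u_{i-1}) over every position i."""
    if k < 0:
        raise ValueError("k must be non-negative")
    total = 0.0
    for i, token in enumerate(tokens):
        # Assumption: near the start of the sequence, fewer than k past
        # tokens exist, so the context is simply shorter.
        context = tuple(tokens[max(0, i - k):i])
        total += math.log(cond_prob(context, token))
    return total


def uniform_model(vocab_size: int) -> CondProb:
    """Toy stand-in for a real model: every token is equally likely."""
    def cond_prob(context: Tuple[str, ...], token: str) -> float:
        return 1.0 / vocab_size
    return cond_prob


if __name__ == "__main__":
    tokens = ["the", "cat", "sat", "down"]
    print(windowed_log_likelihood(tokens, k=2, cond_prob=uniform_model(4)))
```

When used as a training loss, this quantity is typically negated so that minimizing the loss maximizes the log likelihood.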