Knowledge Distillation

- A teacher model helps train the student model.
- The teacher is usually pre-trained; the student tries to imitate the teacher's outputs.
- Training combines the usual task loss with a distillation loss that matches the student's predictions to the teacher's.

References:
- Gou et al., "Knowledge Distillation: A Survey" (2021)
- Hinton et al., "Distilling the Knowledge in a Neural Network" (2015)
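A minimal sketch of the distillation loss in NumPy, following the common recipe from Hinton et al.: the teacher's temperature-softened probabilities serve as soft targets for the student, combined with the ordinary cross-entropy on the true labels. The function names, the temperature `T=2.0`, and the mixing weight `alpha` are illustrative choices, not values from the slide.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T gives softer probabilities."""
    z = logits / T
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, labels_onehot,
                      T=2.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and soft-target KL term."""
    # Soft term: KL divergence between teacher and student distributions
    # at temperature T, scaled by T^2 to keep gradient magnitudes comparable.
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    soft = float(np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)))) * T * T
    # Hard term: standard cross-entropy against the ground-truth labels.
    hard = -float(np.sum(labels_onehot * np.log(softmax(student_logits))))
    return alpha * hard + (1 - alpha) * soft

# Example: a student that already matches the teacher pays only the hard-label cost.
loss = distillation_loss(np.array([2.0, 1.0, 0.1]),
                         np.array([2.0, 1.0, 0.1]),
                         np.array([1.0, 0.0, 0.0]))
```

In practice the same computation is done batched in a deep-learning framework, but the structure of the loss is exactly this two-term mixture.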