TinyBERT

Sep 18, 2024 · 1 min read

  • architecture


  • Paper: TinyBERT: Distilling BERT for Natural Language Understanding
  • Proposes a novel Transformer distillation method that accelerates inference and reduces model size while maintaining accuracy
  • Specially designed for Knowledge Distillation (KD) of Transformer-based models
  • The knowledge encoded in a large teacher BERT is effectively transferred to a small student TinyBERT (a sketch of the loss follows this list)
  • Evaluated on GLUE: the 4-layer TinyBERT retains over 96% of its BERT-base teacher's performance while being ~7.5x smaller and ~9.4x faster at inference
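
A minimal PyTorch sketch of the objective described above: layer-wise distillation (MSE on attention matrices and on linearly projected hidden states) plus prediction-layer distillation (soft cross-entropy over temperature-scaled logits). The embedding-layer term from the paper is omitted for brevity, and the function and dict names (`distillation_loss`, `attn`, `hidden`, `logits`) are illustrative assumptions, not the authors' code; pairing of student layers with teacher layers is assumed to have happened already.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student, teacher, hidden_proj, temperature=1.0):
    """TinyBERT-style loss for one already-matched layer pair plus the
    prediction layer. The dict layout is hypothetical, not the paper's code."""
    # Attention distillation: MSE between student and teacher attention
    # matrices (the paper matches the unnormalised, pre-softmax scores).
    attn_loss = F.mse_loss(student["attn"], teacher["attn"])

    # Hidden-state distillation: a learned linear map lifts the student's
    # smaller hidden size into the teacher's space before the MSE.
    hidden_loss = F.mse_loss(hidden_proj(student["hidden"]), teacher["hidden"])

    # Prediction-layer distillation: soft cross-entropy between the
    # temperature-scaled logits of teacher and student.
    t_probs = F.softmax(teacher["logits"] / temperature, dim=-1)
    s_log_probs = F.log_softmax(student["logits"] / temperature, dim=-1)
    pred_loss = -(t_probs * s_log_probs).sum(dim=-1).mean()

    return attn_loss + hidden_loss + pred_loss

# Toy shapes: hidden size 768 for the teacher, 312 for the student
# (the TinyBERT-4 configuration), batch 2, sequence length 16.
proj = torch.nn.Linear(312, 768)
student = {"attn": torch.rand(2, 12, 16, 16),
           "hidden": torch.randn(2, 16, 312),
           "logits": torch.randn(2, 3)}
teacher = {"attn": torch.rand(2, 12, 16, 16),
           "hidden": torch.randn(2, 16, 768),
           "logits": torch.randn(2, 3)}
loss = distillation_loss(student, teacher, proj)
```

In the paper this objective is applied in two stages: general distillation on unlabeled text, then task-specific distillation on the downstream dataset.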

