OPT

  • OPT: Open Pre-trained Transformer Language Models
  • Large language models, which are often trained for hundreds of thousands of compute days, have shown remarkable capabilities for zero- and few-shot learning
  • A collection of auto-regressive (decoder-only) pre-trained transformer language models ranging in size from 125M to 175B parameters, which the authors aim to fully and responsibly share with interested researchers
  • The models roughly replicate the performance and sizes of the GPT-3 class of models, while also applying the latest best practices in data curation and training efficiency
  • OPT-175B is comparable to GPT-3, while requiring only 1/7th the carbon footprint to develop
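Since the smaller OPT checkpoints were released publicly on the Hugging Face Hub, a minimal sketch of zero-shot generation with the 125M model might look like the following (assuming the `transformers` and `torch` packages are installed; the checkpoint name `facebook/opt-125m` is the published Hub identifier):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the smallest released OPT checkpoint (125M parameters).
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# Zero-shot prompting: no task-specific fine-tuning, just a text prefix.
prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding of a short continuation.
output_ids = model.generate(**inputs, max_new_tokens=20, do_sample=False)
completion = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(completion)
```

The larger checkpoints (up to 66B) follow the same API with different model names; the full 175B model was shared separately under a research license.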