AlphaCode

  • Other language models have demonstrated an impressive ability to generate code, but these systems still perform poorly when evaluated on more complex, unseen problems
  • AlphaCode is a code generation system for problems that require deeper reasoning
  • It relies on three components: an extensive dataset for training and evaluation, large and efficient transformer-based architectures, and large-scale model sampling.
  • The model is first pre-trained on GitHub repositories amounting to 715.1 GB of code.
  • This is a more extensive dataset than Codex’s pre-training dataset.
  • To improve training, a fine-tuning dataset is introduced from the Codeforces platform
  • CodeContests is used for the validation phase, in which the model's performance is improved.
  • For the transformer-based architecture, they use an encoder-decoder transformer
  • Compared to the commonly used decoder-only architectures, this architecture allows a bidirectional representation of the problem description and extra flexibility.
  • A shallow encoder and a deep decoder are used to further improve the model’s efficiency
  • To reduce the cost of sampling, multi-query attention is used.
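
The multi-query attention idea from the last point can be sketched as follows. In this variant every query head keeps its own queries, but all heads share a single set of keys and values, so far fewer keys and values need to be cached while sampling. This is a minimal illustration in plain Python, not AlphaCode's implementation: the function names, the nested-list tensor layout, and the sizes in the usage note are assumptions made for the example, and the causal mask used during decoding is omitted for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def multi_query_attention(queries, keys, values):
    """Scaled dot-product attention with keys/values shared by all heads.

    queries: [num_heads][seq_len][head_dim], one set of queries per head
    keys:    [seq_len][head_dim], a single set shared across heads
    values:  [seq_len][head_dim], a single set shared across heads
    Returns: [num_heads][seq_len][head_dim]
    Note: no causal mask is applied in this sketch.
    """
    head_dim = len(keys[0])
    scale = 1.0 / math.sqrt(head_dim)
    outputs = []
    for head_queries in queries:
        head_out = []
        for q in head_queries:
            # Every head scores against the same shared keys.
            weights = softmax([dot(q, k) * scale for k in keys])
            # Weighted sum of the same shared values.
            head_out.append([
                sum(w * v[d] for w, v in zip(weights, values))
                for d in range(len(values[0]))
            ])
        outputs.append(head_out)
    return outputs


def kv_cache_entries(num_kv_heads, seq_len, head_dim):
    """Number of floats cached per layer for keys plus values."""
    return 2 * num_kv_heads * seq_len * head_dim
```

As a rough usage note, with hypothetical sizes of 8 query heads, a sequence length of 1024, and a head dimension of 64, `kv_cache_entries(8, 1024, 64)` models standard multi-head attention and `kv_cache_entries(1, 1024, 64)` models multi-query attention. The second cache is 8 times smaller, which is where the saving in sampling cost comes from.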