Deepmind Introduces ‘AlphaCode’: A Code Generation System With Advanced Machine Learning Applied To Solving Competitive Programming Problems

Computer programming has become a general-purpose problem-solving tool in our daily lives, industries, and research centers. Yet, it has been proven difficult to incorporate AI breakthroughs to developing systems to make programming more efficient and accessible. Large-scale language models have recently exhibited a remarkable ability to create code and complete simple programming tasks. However, these models perform poorly when tested on more difficult, unknown issues that need problem-solving skills beyond translating instructions into code. 

Creating code that performs a specified goal necessitates searching through a large structured space of programs with a sparse reward signal. That is why competitive programming tasks require knowledge of algorithms and complicated natural language, which remain highly difficult.

Large transformer models can attain low single-digit solve rates in early work employing program synthesis for competitive programming. However, they can’t reliably provide solutions for the vast majority of problems. Furthermore, insufficient test cases in existing competitive programming datasets make the metrics unreliable for measuring research progress.

To that end, DeepMind’s team has introduced AlphaCode, a system for writing competitive computer programs. AlphaCode generates code unprecedentedly using transformer-based language models and then intelligently filters to a small group of interesting programs. By tackling new challenges that involve a combination of critical thinking, logic, algorithms, code, and natural language interpretation, AlphaCode ranked in the top 54 percent of competitors in programming competitions.

All of the models used are pre-trained on GitHub’s open-source code that included code files from several popular languages: C++, C#, Go, Java, JavaScript, to name a few. Then, they were fine-tuned on a dataset of programming competition dataset CodeContests. This dataset gathers data from various sources, splits it temporally so that all training data predates all evaluation issues, includes additional generated tests to check correctness, and evaluates submissions in a competitive programming environment. 

The team describes the competitive programming code generation problem as a sequence-to-sequence translation task, which produces a corresponding solution Y in a programming language when given a problem description X in natural language. This perception motivated them to use an encoder-decoder transformer architecture for AlphaCode, which models. The problem description X is fed into the encoder as a flat series of letters by the architecture (including metadata, tokenized). It samples Y autoregressively from the decoder one token at a time until it reaches the end of the code token, at which point the code can be built and run.


An encoder-decoder design provides bidirectional description representation (tokens at the beginning of the description can attend to tokens at the end). It also offers more flexibility to separate the encoder and decoder structures. The researchers also discovered that utilizing a shallow encoder and a deep decoder enhances training efficiency without negatively impacting problem solution rates.

Follow the below steps while using AlphaCode:

  1. Pre-train a transformer-based language model with conventional language modeling objectives using GitHub code. 
  2. Use GOLD with tempering as the training objective to fine-tune the model on CodeContests.
  3. For each challenge, generate a large number of samples from the present models.
  4. Using the example tests and clustering to identify samples based on program behavior, filter the samples to get a small set of candidate submissions (at most ten) to be tested on the concealed test cases.

The researchers evaluated their model using many C++ and Python programs for each challenge. Further, they filtered, clustered, and reranked the resulting solutions down to a small group of 10 candidate programs for external evaluation. They collaborated with Codeforces and tested AlphaCode by replicating participation in ten recent contests. This automated system replaces rivals’ trial-and-error debugging, compilation, testing, and submission processes.