Software rules the world. It runs smartphones, nuclear weapons and car engines. But there is a global shortage of programmers. Wouldn’t it be nice if someone could simply explain what they want a program to do, and a computer could translate that into lines of code?
According to a new study, an artificial intelligence (AI) system called AlphaCode brings humanity closer to that vision. The researchers say the system – from the DeepMind research lab, a subsidiary of Alphabet (Google’s parent company) – could one day assist experienced coders, but is unlikely to replace them.
“It’s very impressive, the performance they are able to achieve on fairly difficult problems,” says Armando Solar-Lezama, head of the computer-aided programming group at the Massachusetts Institute of Technology.
AlphaCode goes beyond the previous flagship of AI code writing: Codex, a system released in 2021 by the nonprofit research lab OpenAI. The lab had previously developed GPT-3, a “large language model” capable of mimicking and interpreting human text after being trained on billions of words from e-books, Wikipedia articles and other internet text pages. By fine-tuning GPT-3 on over 100 gigabytes of code from GitHub, an online software repository, OpenAI created Codex. The software can write code when prompted with an everyday description of what it is supposed to do, such as counting the vowels in a text string. But it performs poorly when tasked with tricky problems.
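The vowel-counting task mentioned above is the kind of one-liner Codex handles well. A minimal sketch (the function name and exact behavior are illustrative, not taken from the paper):

```python
def count_vowels(text: str) -> int:
    """Count how many vowels appear in a text string (case-insensitive)."""
    return sum(1 for ch in text.lower() if ch in "aeiou")

print(count_vowels("AlphaCode"))  # prints 4
```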
The creators of AlphaCode focused on exactly those difficult problems. Like the Codex researchers, they started by feeding a large language model many gigabytes of code from GitHub, just to familiarize it with coding syntax and conventions. Then they trained it to translate problem descriptions into code, using thousands of problems collected from programming contests. For example, a problem might ask for a program that determines the number of binary strings (sequences of zeros and ones) of length n that do not contain consecutive zeros.
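The binary-strings example above has a classic dynamic-programming solution (the count follows a Fibonacci-like recurrence); a sketch of the kind of program a contestant — or AlphaCode — might submit:

```python
def count_no_consecutive_zeros(n: int) -> int:
    """Count binary strings of length n with no two consecutive zeros."""
    # Track counts of valid strings by their last bit:
    # a string ending in 1 can be extended by 0 or 1;
    # a string ending in 0 can only be extended by 1.
    end_in_one, end_in_zero = 1, 1  # length 1: "1" and "0"
    for _ in range(n - 1):
        end_in_one, end_in_zero = end_in_one + end_in_zero, end_in_one
    return end_in_one + end_in_zero

print(count_no_consecutive_zeros(3))  # prints 5: 010, 011, 101, 110, 111
```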
When confronted with a new problem, AlphaCode generates candidate code solutions (in Python or C++) and filters out the bad ones. But whereas researchers had previously used models like Codex to generate tens or hundreds of candidates, DeepMind had AlphaCode generate more than a million.
To filter them, AlphaCode first keeps only the roughly 1% of programs that pass the test cases that come with the problems. To further narrow the field, it clusters the survivors based on the similarity of their outputs on inputs the system itself generates. It then submits programs from each cluster, one at a time, starting with the largest cluster, until it either finds a successful one or reaches 10 submissions (about the maximum that humans submit in competitions). Submitting from different clusters allows it to test a wide range of programming tactics. This is the most innovative step in the AlphaCode process, says Kevin Ellis, a computer scientist at Cornell University who works on AI coding.
After training, AlphaCode solved about 34% of its assigned problems, DeepMind reports this week in Science. (On a similar benchmark, Codex achieved only a single-digit success rate.)
To further test its prowess, DeepMind entered AlphaCode into online coding competitions. In contests with at least 5,000 participants, the system outperformed 45.7% of the programmers. The researchers also compared its programs with those in its training database and found that it did not duplicate large sections of code or logic. It generated something new – a creativity that surprised Ellis.
“It continues to be impressive how well machine learning methods work when you scale them,” he says. The results are “breathtaking”, adds Wojciech Zaremba, co-founder of OpenAI and co-author of their Codex article.
AI coding could have applications beyond winning competitions, says Yujia Li, a computer scientist at DeepMind and co-author of the paper. It could take on much of the grunt work of software development, freeing developers to work at a higher, more abstract level, or it could help non-coders create simple programs.
David Choi, another author of the study at DeepMind, imagines running the model in reverse: translating code into explanations of what it does, which could benefit programmers trying to understand other people’s code. “There’s a lot more you can do with models that understand code in general,” he says.
For now, DeepMind wants to reduce the system’s errors. Li says that even when AlphaCode generates a working program, it sometimes makes simple mistakes, such as creating a variable and never using it.
There are other issues. AlphaCode requires tens of billions of trillions of operations per problem – computing power that only the biggest technology companies have. And the problems it solved came from online programming contests that are narrow and self-contained. Real-world programming, by contrast, often requires managing large codebases spread across multiple places, which demands a more holistic understanding of the software, says Solar-Lezama.
The study also notes the long-term risk of software that improves recursively. Some experts say such self-improvement could lead to super-intelligent AI taking over the world. While this scenario may seem remote, researchers still want the field of AI coding to institute built-in safeguards, checks and balances.
“Even if this type of technology is very successful, you would want to treat it the same way you treat a programmer within an organization,” says Solar-Lezama. “You never want an organization where a single programmer could bring down the whole organization.”