DARPA's quest for automated bug-hunters is heating up

On June 3rd, DARPA held the qualifying round of its Cyber Grand Challenge, an ambitious hunt for software that can find and patch bugs. The qualifier was 24 hours of furiously automated bug-hunting and patching, the culmination of years of careful team building and research. Now, after more than a month of meticulous number-crunching, the organizers are finally releasing the results, and like any DARPA challenge, the attrition rate is staggering.

Two preliminary challenges narrowed a starting field of more than 100 teams to just 28 competing in the qualifier. Fifteen of those suffered catastrophic failures as soon as the programs were set loose. Seven of the surviving 13 teams have now been chosen as finalists, and will be provided with free supercomputer time and a full year to refine their methods before being pitted against each other in a battle royale at 2016's Defcon conference, vying for a $2 million grand prize.

In short, the game is on.

Three of the teams got seed funding from DARPA, but otherwise the field runs the gamut: some teams came from academic researchers at UC Berkeley and Carnegie Mellon, while others sprang out of huge corporations like Raytheon and GrammaTech. The scrappiest team is CSDS, a two-person team from the University of Idaho that built its entire program from scratch, without any pre-established components. Other teams had been successful in bug-hunting contests, but had to learn automation on the fly. The hope is that, once the challenge is done, those groups will share their expertise with each other. "What's going to emerge from this is a security automation community," says DARPA project manager Mike Walker, who organized the challenge.

The biggest relief for Walker is that the projects worked at all, proving out the basic idea of automated bug-patching. But beyond the existential concerns, the finalists made a remarkably good showing. The test software had 590 known bugs, with an untold number of as-yet-undiscovered ones, and the best team was able to patch 261 of them. Even more impressive, every known bug was found by at least one of the teams, and every single round unearthed a new vulnerability.

This round was just a static test: the computers tackled more problems than a human game would allow, but it was still just a single round, with no opportunity for opponents to seek out holes in the newly patched software. At next year's Defcon, these seven teams will have to survive dozens of rounds played out in real time on a live network, with opponents monitoring their traffic in search of a competitive advantage. Working that fast will take a lot of computing power, so for the next year, the teams will also have access to DARPA's cloud computing resources, with 1,000 Xeon cores and 16 terabytes of RAM allocated for each team.

In the long term, the project could have profound implications for computer security, allowing for smarter and more responsive network monitors. But for now, it's just an automated version of a classic hacker capture-the-flag contest, in which teams compete to find and patch bugs before opponents can exploit them. As Walker proudly points out, it's already breaking records. "There's never been an automated capture-the-flag game, and there's never been a capture-the-flag game anywhere near this big," he says, describing the latest round. "We held the world's biggest CTF and all the contestants were robots."