Today's popular bug finders catch only about two percent of the vulnerabilities lurking in software code, researchers have found, despite the millions of dollars companies spend on them each year.
Bug finders are commonly used by software engineers to root out problems in code that could turn into vulnerabilities. The tools typically report how many bugs they found, but not how many they missed, so their true success rates remain unknown.
So researchers at New York University's Tandon School of Engineering, in collaboration with MIT Lincoln Laboratory and Northeastern University, decided to find out how much the tools are missing.
LAVA, or Large-Scale Automated Vulnerability Addition, is a technique created by the researchers to test the limits of bug-finding tools in order to help developers improve them. It does that by intentionally adding vulnerabilities to a program’s source code.
“The only way to evaluate a bug finder is to control the number of bugs in a program, which is exactly what we do with LAVA,” said Brendan Dolan-Gavitt, an assistant professor of computer science and engineering at NYU Tandon.
The system inserts known quantities of novel vulnerabilities that are synthetic yet possess many of the same attributes as computer bugs in the wild. It's automated, so it avoids the cost of manual, custom-designed vulnerabilities.
Instead, LAVA makes targeted edits in real programs’ source code to create hundreds of thousands of unstudied, highly realistic vulnerabilities that span the execution lifetime of a program, are embedded in normal control and data flow, and manifest only for a small fraction of inputs so as to avoid shutting the entire program down.
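To make the idea of a bug that manifests "only for a small fraction of inputs" concrete, here is a toy sketch in Python. It is not LAVA's actual mechanism, which edits the source code of real programs; the function names, the trigger constant MAGIC, and the checksum routine are all hypothetical, chosen only to show a synthetic out-of-bounds read sitting in a program's normal data flow and guarded by a rare input condition.

```python
# Hypothetical trigger value for this sketch; it is not taken from the article.
MAGIC = 0x6C617661


def checksum_original(data: bytes) -> int:
    """Sum the payload bytes that follow a 4-byte header."""
    payload = data[4:]
    return sum(payload) & 0xFFFF


def checksum_with_injected_bug(data: bytes) -> int:
    """Same routine with a synthetic bug added by a small, targeted edit."""
    # The header is read from attacker-controlled input.
    header = int.from_bytes(data[:4], "little")
    payload = data[4:]
    total = 0
    for i in range(len(payload)):
        # Injected edit: only when the header equals MAGIC is the index
        # pushed past the end of the payload, causing an out-of-bounds read.
        offset = len(payload) if header == MAGIC else 0
        total += payload[i + offset]
    return total & 0xFFFF


if __name__ == "__main__":
    benign = b"\x00\x01\x02\x03hello"
    print(checksum_original(benign) == checksum_with_injected_bug(benign))

    trigger = MAGIC.to_bytes(4, "little") + b"hello"
    try:
        checksum_with_injected_bug(trigger)
    except IndexError as exc:
        print("bug triggered:", exc)
```

For almost every input the two functions agree, so the program keeps working. Only one header value out of roughly four billion reaches the faulty read, which is the kind of needle a bug finder would have to locate.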
When the researchers tested existing bug-finding software representing both the "fuzzing" and symbolic-execution approaches commonly used today, the tools detected just two percent of the bugs created by LAVA.
This summer, the team plans to launch an open competition in which developers and other researchers can request a LAVA-bugged version of a piece of software, attempt to find the bugs, and receive a score based on their accuracy.
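Because the number and location of the injected bugs are known in advance, scoring a tool reduces to comparing two lists. The article does not describe how the competition will compute scores, so the sketch below is only one plausible way to do it; the function name, the bug identifiers, and the two metrics are assumptions made for illustration.

```python
def score_report(injected: set[str], reported: set[str]) -> dict[str, float]:
    """Compare a tool's reported bugs against the known injected set."""
    true_hits = injected & reported      # injected bugs the tool found
    false_alarms = reported - injected   # reports matching no injected bug
    # Share of the injected bugs that the tool detected.
    detection_rate = len(true_hits) / len(injected) if injected else 0.0
    # Share of the tool's reports that were real injected bugs.
    precision = len(true_hits) / len(reported) if reported else 0.0
    return {
        "detection_rate": detection_rate,
        "precision": precision,
        "false_alarms": float(len(false_alarms)),
    }


if __name__ == "__main__":
    # Hypothetical data: 100 injected bugs, a tool reporting three findings.
    injected = {f"bug-{i}" for i in range(100)}
    reported = {"bug-3", "bug-57", "bug-999"}
    print(score_report(injected, reported))
```

In this made-up run the tool finds two of the 100 injected bugs, a detection rate of two percent, and one of its three reports is a false alarm.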
“There has never been a performance benchmark at this scale in this area, and now we have one,” Dolan-Gavitt said. “Developers can compete for bragging rights on who has the highest success rate in bug-finding, and the programs that will come out of the process could be stronger.”
A paper detailing the research was presented recently at the IEEE Symposium on Security and Privacy and published in the conference proceedings.