Functional languages rack up best scores for software quality

A study of GitHub projects and the languages used to build them finds that certain language characteristics are more likely to result in better software

Language design makes a difference in software quality, and functional languages offer an edge when it comes to building quality software, a study of programming languages and code quality in GitHub reveals.

Researchers at the University of California, Davis, recently published their findings, which were based on an examination of projects hosted on GitHub and the languages used to build them. All told, the researchers studied 729 projects and 80 million lines of code, including project metadata about bugs, covering 17 top languages ranging from C to C++, Java to JavaScript, and Scala to Clojure.

"By triangulating findings from different methods and controlling for confounding effects, such as team size, project size, and project history, we report that language design does have a [statistically] significant but modest effect on software quality," the report states. "Most notably, it does appear that strong typing is modestly better than weak typing, and among functional languages, static typing is also somewhat better than dynamic typing. We also find that functional languages are somewhat better than procedural languages." But the report noted the modest effects of language design are "overwhelmingly dominated by the process factors, such as project size, team size, and commit size."

Functional languages like Clojure, Scala, and Haskell scored the best for software quality; TypeScript, a typed superset of JavaScript, also did well. Baishakhi Ray, a postdoctoral researcher at the university who participated in the study, said in an interview that functional languages were helped by their grounding in mathematics and by the likelihood that more experienced programmers use them.

Programming errors account for about 88.53 percent of all bug fix commits and happen in all language classes, but some programming errors are more language-specific. "For example, we find 122 runtime errors in JavaScript that are not present in TypeScript," the study says. "In contrast, TypeScript has more type-related errors, since TypeScript compiler flags them during development."
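To make the JavaScript and TypeScript contrast concrete, here is a minimal sketch of the kind of error involved. It is not taken from the study; the `Commit` type and both function names are hypothetical, and the compiler behavior described assumes TypeScript's `strictNullChecks` option is enabled.

```typescript
// Hypothetical data shape for illustration: the author field may be missing.
interface Commit {
  message: string;
  author?: { name: string };
}

// Typed as `any`, this behaves like plain JavaScript: it compiles,
// but throws a TypeError at runtime when `author` is undefined.
function authorNameUnchecked(commit: any): string {
  return commit.author.name;
}

// With the Commit type, writing `return commit.author.name;` here would be
// rejected at compile time under strictNullChecks, because `author` is
// possibly undefined. The check below is what the compiler pushes you toward.
function authorNameChecked(commit: Commit): string {
  return commit.author ? commit.author.name : "unknown";
}

const anonymous: Commit = { message: "fix null dereference" };

// The checked version handles the missing author.
const safeName = authorNameChecked(anonymous);

// The unchecked version fails only when it runs.
let runtimeFailure = false;
try {
  authorNameUnchecked(anonymous);
} catch (err) {
  runtimeFailure = err instanceof TypeError;
}
```

In this sketch the mistake surfaces as a type-related error during development in the typed version, and as a runtime error in the untyped one, which matches the pattern the study describes.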

Memory errors, meanwhile, accounted for 5.44 percent of bug fix commits. Regression analysis "confirms that languages with unmanaged memory type, e.g., C, C++, and Objective-C introduce more memory errors." Among managed languages, Java "has significantly more memory errors than the average, though its regression coefficient is less than the unmanaged."

About 7.33 percent of bug fix commits are related to security and other impact errors. Erlang, C, C++, and Go produce more security errors than average, while projects written in Clojure produce fewer than average.

About 2 percent of bug fix commits were related to concurrency errors. In all the languages, race conditions are the most frequent concurrency error, ranging from 41 percent in C++ to 92 percent in Go. "The enrichment of race condition errors in Go is likely because Go is distributed with a race-detection tool that may advantage Go developers in detecting races," the study says.
