Computer languages are like their real-life counterparts: They constantly evolve. But unique to the evolution of programming languages is the ability to expressly fork them -- to publicly announce a desire to branch off and deviate from the lineage. Sometimes the forks are temporary, with the new branch rejoining and influencing its parent. Other times, a useful variation of an existing language arises and is sustained. Or the mutation takes off, and an entirely new language is born.
The desire to tinker and innovate is only one reason to change a computer language. Another major impetus is that any programming language will in time show its limits, whether in the language itself or in its implementation. Those evolutionary pressures drive users either to change it for the better or to leave it behind for another option.
Most language forks evolve in one of three ways:
- As an entirely new, potentially incompatible branch of the language
- As a new language that compiles down to the original
- As a superset or subset of the original language, with features added or removed
Here we explore some of the more vibrant examples of each approach evolving today.
A new language: PHP and Hack
PHP's sheer popularity is both its blessing and its curse. The blessing: Applications developed in the language are all but guaranteed to run anywhere. The curse: PHP's curious quirks and internal inconsistencies aren't likely to be ironed out soon, lest the changes break backward compatibility with much existing PHP code.
The changes Hack brought to PHP demonstrate why a language fork can be appealing. Major changes to the language can be implemented without having to wait for approval from a steering committee or governing body. A proposal to add type hinting to PHP recently passed, but it might be a while before the feature lands in the actual language, let alone gets used in production code. With Hack, those features can be used right now.
The downside of any fork is that it's likely to be backward-incompatible, meaning any code using the original language might not work. Hack provides a partial solution to this limitation by running on a virtual machine, HHVM, which also supports PHP -- allowing both languages to be deployed side by side on the same interpreter. In this way, an existing PHP codebase can be deployed alongside a newly minted Hack codebase, with the old deprecated over time in favor of the new.
How do you fork a language without forking the language itself? Create a new language that compiles down to the old one. The original language's limitations, typically its syntax, can be kept at arm's length from the programmer.
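As a rough illustration of the idea, here is a minimal sketch of a toy "language" that compiles down to Python. The `|>` pipeline operator and the `compile_pipeline` function are invented for this example and are not part of any real language. The translator only rewrites text; the generated code is then run by the ordinary Python interpreter.

```python
def compile_pipeline(source: str) -> str:
    """Translate a toy expression such as 'x |> f |> g' into the Python 'g(f(x))'."""
    # Split the toy source into its stages: a starting value, then function names.
    stages = [part.strip() for part in source.split("|>")]
    expression = stages[0]
    # Wrap the expression in one function call per remaining stage.
    for function_name in stages[1:]:
        expression = f"{function_name}({expression})"
    return expression


# The toy source uses syntax Python itself does not understand ...
python_code = compile_pipeline("' hello ' |> str.strip |> str.upper")
print(python_code)

# ... but the compiled output is plain Python, so the stock interpreter can run it.
print(eval(python_code))
```

The programmer writes in the new syntax, while everything downstream (the interpreter, its libraries, its deployment story) stays the same as for the original language.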
Subsets and supersets: Python
Subsets of Python generally exist as a way to address Python performance -- a language with fewer features is easier to optimize. RPython, the language in which the PyPy Python implementation is written, is "a restricted subset of Python that is amenable to static analysis" and imposes stricter controls over what type a variable can be at any given time. The resulting code can be analyzed and optimized far more readily by PyPy's translation toolchain than unrestricted Python could be.
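To make the idea of a restricted subset concrete, the sketch below contrasts two ordinary Python functions; both names are hypothetical. Both run under standard Python. The first lets one variable hold values of different types, the kind of pattern a statically analyzable subset is designed to rule out. The second keeps a single type per variable. This illustrates the restriction only and is not actual RPython toolchain input or output.

```python
def label_dynamic(count):
    # Legal in full Python: `result` is an int on one path and a str on the
    # other, so its type cannot be pinned down ahead of time.
    if count > 0:
        result = count
    else:
        result = "none"
    return result


def label_restricted(count):
    # Every path binds `result` to a str, so a static analyzer can infer
    # a single type for the variable.
    if count > 0:
        result = str(count)
    else:
        result = "none"
    return result


print(type(label_dynamic(3)), type(label_dynamic(0)))
print(type(label_restricted(3)), type(label_restricted(0)))
```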
Just as there are subsets, there are also supersets -- versions of a language that tack on features to broaden what can be done with it. Cython, another Python derivative, adds ways to generate C code directly from Python code, allowing a programmer to accelerate a Python program's performance by way of C.
Rarely do the likes of Cython or RPython generate the same level of interest as the parent language. In Cython's case, it appeals mainly to people combining C with Python. If you're not doing that, there's little incentive to use it.
Sometimes, with supersets and subsets alike, features will bubble up (or down) into the main language. Static typing is one example: A proposal is now in the works to add type hinting to Python 3 as a way to make code easier to analyze -- and perhaps eventually as a way to accelerate its performance overall.
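For a sense of what such hints look like, here is a minimal sketch using the annotation syntax Python 3 already accepts. The function name and values are made up for the example. The interpreter records the annotations but does not enforce them; checking is left to external tools.

```python
def total_price(unit_price: float, quantity: int) -> float:
    """Return the cost of `quantity` items at `unit_price` each."""
    return unit_price * quantity


# The hints are stored as metadata that tools can inspect ...
print(total_price.__annotations__)

# ... but they are not checked at run time: passing an int where a float
# is hinted still works.
print(total_price(3, 2))
```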
Future candidates for a fork
What other widely used languages might be destined for a fork in the near future?
One candidate is Google's Go, aka Golang. High-profile projects such as Docker have been built with it, and the language has enjoyed attention and accolades. But several of its features and behaviors have as many detractors as they do adherents; Go's error-handling mechanism, for instance, is one feature that's been singled out for criticism. Lack of generics is another commonly cited shortcoming, and the Go development team has so far declined to add them to the language. If Go's designers are unwilling to reconsider their stance on such facets -- all signs point to that being the case -- a fork of the language might be the only way forward for the disgruntled.
Another possibility would be variations on Microsoft's family of .Net languages, mainly C#, made more feasible by Microsoft's new generation of open source compilation frameworks. This development would be distinct from projects like Mono, a separate open source implementation of C# and .Net. Rather, it would be an attempt to take C# in new directions, whether or not they were compatible with the original.
One final possibility, though it's more of a fork of a specification than a language, is the next major version of HTML. In some ways this has already happened, as the WHATWG and HTML5 could be considered forks from the W3C and its version of the standard. There's no guarantee such a fork would change the landscape, even if it came with a browser to run it, but that's part of the risk -- and reward -- of forking in the first place.