The A-Z of Programming Languages: Haskell

Simon Peyton-Jones tells us why he is most proud of Haskell's purity, type system and monads.
Simon Peyton-Jones

Computerworld is undertaking a series of investigations into the most widely used programming languages. Previously we have spoken to Alfred V. Aho of AWK fame, S. Tucker Taft on the Ada 1995 and 2005 revisions, Microsoft about its server-side script engine ASP, Chet Ramey about his experiences maintaining Bash, Bjarne Stroustrup of C++ fame, and Charles H. Moore about the design and development of Forth. We’ve also had a chat with the irreverent Don Woods about the development and uses of INTERCAL, as well as Stephen C. Johnson on YACC, Luca Cardelli on Modula-3, Walter Bright on D, Brendan Eich on JavaScript, Guido van Rossum about Python and most recently Prof. Roberto Ierusalimschy about the design and development of Lua.

In this in-depth interview, we chat with Simon Peyton-Jones about the development of Haskell. Peyton-Jones is particularly interested in the design, implementation, and application of lazy functional languages, and speaks in detail of his desire to ‘do one thing well’, as well as his current research projects being undertaken at Microsoft Research in Cambridge, UK.

Please note that due to popular demand we are no longer following alphabetical order for this series. If you wish to submit any suggestions for programming languages or language authors you would like to see covered, please email

Was Haskell created simply as an open standard for purely functional programming languages?

Haskell isn’t a standard in the ISO standard sense – it’s not formally standardized at all. It started as a group of people each wanting to use a common language, rather than having their own languages that were different in minor ways. So if that’s an open standard, then yes, that’s what we were trying to do.

In the late 1980s, we formed a committee, and we invited all of the relevant researchers in the world, as at that stage the project was purely academic. There were no companies using lazy functional programming, or at least not many of them. We invited all of the researchers we knew who were working on basic functional programming to join in.

Most of the researchers we approached said yes; I think at that stage probably the only ones who said no were David Turner, who had a language called Miranda, and Rinus Plasmeijer, who had a language called Clean. He was initially on the committee but he then dropped out. The committee worked entirely by consensus – there wasn’t a mechanism whereby any one person decided who should be in and who should be out. Anybody who wanted could join.

How did the name come about?

We sat in a room which had a big blackboard where we all wrote down what we thought could be possible candidates for names. We then all crossed out the names that we didn’t like. By the time we were finished we didn’t have many!

Do you remember any of the names that you threw up there?

I’m sure there was Fun and Curry. Curry was Haskell Curry’s last name. He’d already given his name to a process called ‘currying’ and we ended up using Haskell instead of Curry, as we thought that there were too many jokes you could end up making about it!

So what made you settle on Haskell?

It was kind of a process of elimination really, and we liked that it was distinctively different. Paul Hudak went to see Curry’s widow who kindly gave us permission to use his name. The only disadvantage is that people can think you mean ‘Pascal’ rather than ‘Haskell’. It depends on the pronunciation – and it doesn’t take long to de-confuse people.

Did you come across any big problems in the early stages of development?

The Haskell project was meant to gather together a consensus that we thought existed about lazy functional programming languages. There weren’t any major issues about anything much, as we had previously agreed on the main issues and focus. There were also some things that we deliberately decided not to tackle: notably modules. Haskell has a basic module system but it’s not a state of the art module system.

Why did you decide not to tackle this?

Because it’s complicated and we wanted to solve one problem well, rather than three problems badly. We thought for the bits that weren’t the main focus, we’d do something straightforward that was known to work, even if it wasn’t as sophisticated as it could get. You only have so much brain capacity when you’re designing a language, and you have to use it – you only have so much oxygen to get to the top of the mountain. If you spend it on too many things, you don’t get to the top!

Were the modules the main elements you decided not to tackle, or were there other elements you also avoided?

Another more minor thing was records. Haskell has a very simple record system, and there are lots of more complicated record systems about. It’s a rather complicated design space. Record systems are a place where there’s a lot of variation and it’s hard to know which is best. So again, we chose something simple.

People sometimes complain and say ‘I want to do this obviously sensible thing, and Haskell doesn’t let me do it.’ And we have to say, well, that was a place we chose not to invest effort in. It’s usually not fundamental however, in that you can get around it in some other way. So I’m not unhappy with that. It was the economic choice that we made.

So you still support these decisions now?

Yes. I think the record limitation would probably be the easiest thing to overcome now, but at this stage Haskell is so widely used that it would likely be rather difficult to add a complete record system. And there still isn’t an obvious winner! Even if you asked ‘today, what should records in Haskell look like?’, there still isn’t an obvious answer.

Do you think that new modules and record formats will ever be added on to the language?

You could certainly add an ML style module system to Haskell, and there have been a couple of papers about that in the research literature. It would make the whole language significantly more complicated, but you’d get some significant benefits from it. I think at the moment, depending on who you speak to, for some people it would be the most pressing issue with Haskell, whereas for others, it wouldn’t.

At the moment I don’t know anyone who’s actively working on a revision of Haskell with a full scale module implementation system.

Do you think that this is likely to happen?

I doubt it will be done by fitting it [new modules and record formats] on to Haskell. It might be done by a successor language to both ML and Haskell, however.

I believe that a substantial new thing like modules is unlikely. Because Haskell’s already quite complicated right now, adding a new complicated thing to an already complicated language is going to be hard work! And then people will say, ‘oh, so you implemented a module system on Haskell. Very well, what’s next?’ In terms of academic brownie points, you don’t get many unfortunately.

In 2006 the process of finding a new standard to replace Haskell 1998 was begun. Where is this at now? What changes are being made?

Haskell ’98 is like a checkpoint, or a frozen language specification. So Haskell, itself, in various forms, has continued to evolve, but if you say Haskell ’98, everyone knows what you mean. If you say Haskell, you may mean a variety of things.

Why did the ’98 version get frozen in particular?

Because people had started saying that they wanted to write books about Haskell and that they wanted to teach it. We therefore decided to freeze a version that could be relied on, and that compiler writers like me can guarantee to continue to maintain. So if you have a Haskell ’98 program it should still work in 10 years’ time.

When we decided to do it, Haskell ’98 was what we decided to call it. Of course, 5 years later we may have done something different. That’s what’s happening now, as people are saying ‘I want to use language features that are not in Haskell ’98, but I also want the stability that comes from a ‘branded’ or kitemarked language design – the kind that says this isn’t going to change and compilers will continue to support it.’

So it’s an informal standardization exercise again – there are no international committees, there’s no formal voting. It’s not like a C++ standard, which is a much more complicated thing.

The latest version is called Haskell Prime (Haskell’) at the moment. It’s not really a name, just a placeholder to say that we haven’t really thought of a name yet!

So how is Haskell Prime developing?

Designing a whole language specification, and formalizing it to a certain extent, or writing it down, is a lot of work. And at the moment I think we’re stalled on the fact that it’s not a high enough priority for enough people to do that work. So it’s moving rather slowly – that’s the bottom line.

I’m not very stressed out about that, however. I think that when we get to the point where people care enough about having a painstaking language design that they can rely on, then they’ll start to put more effort in and there’ll be an existing design process and a set of choices all laid out for them. I don’t see that [the current slow progress] as a failure; I see that as evidence of a lack of strong enough demand. Maybe what’s there is doing OK at the moment.

One way that this has come about, is that the compiler I am responsible for (the GHC or Glasgow Haskell Compiler), has become the de facto standard. There are lots of people using that, so if you use GHC then your program will work.

I don’t think that’s a good thing in principle, however, for a language to be defined by an implementation. Haskell is based on whatever GHC accepts right now, but it [Haskell] should have an independent definition. So I would like to see Haskell Prime happen because I think it’s healthy to see an independent definition of the language rather than for it to be defined by a de facto standard of a particular compiler.

Do you think Haskell Prime will eventually reach that point?

I don’t know. It’s a question of whether the urgency for doing that rises before somebody comes out with something startlingly new that overtakes it by obsoleting the whole language.

Have you seen anything out there that looks like doing this yet?

Not yet, no.

Are you expecting to?

It’s hard to say. In my experience, languages almost always come out of the blue. I vividly remember before Java arrived (I was still working on Haskell then), and I was thinking that you could never break C++’s strangle-hold on mainstream programming. And then Java arrived, and it broke C++’s strangle-hold!

When Java came, nobody provided commentary about this upcoming and promising language, it just kind of burst upon the scene. And Python has similarly become extremely popular, and Perl before it, without anybody strategically saying that this is going to be the next big thing. It just kind of arrived and lots of people started using it, much like Ruby on Rails. There are lots and lots of programming languages, and I’m no expert [on predicting what will be big next]. I don’t think anybody’s an expert on predicting what will become the next big thing.

So why am I saying that? Well, it’s because to supplant established languages, even in the functional programming area, like Haskell or ML or Scheme, you have to build a language that’s not only intriguing and interesting, and enables people to write programs faster, but you also need an implementation that can handle full scale applications and has lots of libraries and can handle profilers and debuggers and graphical analyzers… there’s a whole eco-system that goes with a programming language, and it’s jolly hard work building that up. What that means is that it’s quite difficult to supplant that existing sort of base.

I think if you thought about it in the abstract you probably could design a language with the features of Haskell and ML in a successor language, but it’s not clear that anybody’s going to do that, because they’d have to persuade all of those people who have got a big investment in the existing languages to jump ship. I don’t know when something fantastic enough to make people do that jumping will appear. I don’t think it’s happening yet, and I don’t think it’s likely to happen by somebody saying that ‘I’ve decided to do it!’ but rather more organically.

Speaking of the evolution of Haskell, what do you think of variants such as Parallel, Eager and Distributed Haskell, and even Concurrent Clean?

This is all good stuff. This is what Haskell is for. Haskell specifically encourages diversity. By calling Haskell ’98 that name instead of ‘Haskell’, we leave the Haskell brand name free to be applied to lots of things. Anything that has Haskell in the name is usually pretty close; it’s usually an extension of Haskell ’98. I don’t know anything that’s called *something* Haskell and that doesn’t include Haskell ’98 at least.

These aren’t just random languages that happen to be branded, like JavaScript which has nothing to do with Java. They [JavaScript] just piggy-backed on the name. They thought if it was Java, it must be good!

Do you find yourself using any of these languages?

Yes. Concurrent Haskell is implemented in GHC, so if you say I’m using Concurrent Haskell you’re more or less saying you’re using GHC with the concurrency extension. Data Parallel Haskell is also being implemented in GHC, so many of these things are actually all implemented in the same compiler, and are all simultaneously usable. They’re not distinct implementations.

Distributed Haskell is a fork of GHC. Some older versions run on multiple computers connected only by the Internet. It started life as part of GHC, but you can’t use it at the same time as the concurrency extensions, or a lot of the new things in GHC, because Distributed Haskell is a “fork”. It started life as the same code base but it has diverged since then. You can’t take all of the changes that have been made in GHC and apply them to the distributed branch of the fork – that wouldn’t work.

Concurrent Clean on the other hand is completely different. It’s a completely different language; it’s got a completely different implementation. It existed before Haskell did and there’s a whole different team responsible, led by Rinus Plasmeijer. At one stage I hoped that we would be able to unify Haskell and Clean, but that didn’t happen. Clean’s a very interesting and good language. There’s lots of interesting stuff in there.

When did you think that the two might combine?

When we first started, most of us [the Haskell committee] had small prototype languages in which we hadn’t yet invested very much, so we were happy to give them all up to create one language. I think Rinus had more invested in Concurrent Clean, however, and so chose not to [participate in Haskell]. I have no qualms with that, as diversity is good and we don’t want a mono-culture, as then you don’t learn as much.

Clean has one completely distinct feature which is not in Haskell at all, which is called uniqueness typing. This is something that would have been quite difficult to merge into Haskell. So there was a good reason for keeping two languages… It’s another thing like modules that would have been a big change. It would have had a lot of ramifications and it’s not clear that it would have been possible to persuade all of the Haskell participants that the ramifications were worth paying for. It’s the ‘do one thing well’ idea again.

That sounds like the language’s aim: do one thing and do it well…

Yes. That’s right.

We’re seeing an increase in distributed programming for things like multi-core CPUs and clusters. How do you feel Haskell is placed to deal with those changes?

I think Haskell in particular, but purely functional programming in general, is blazing the trail for this. I’ve got a whole one hour talk about the challenge of effects – which in this case actually means side effects. Side effects are things like doing input/output, or just writing to a mutable memory location, or changing the value of the memory location.

In a normal language, like Pascal or Perl, what you do all the time is say ‘assign value 3 to x’, so if x had the value of 4 before, it has the value of 3 now. So names like x, y and z are the names of locations that can contain a value that can change over time.

In a purely functional language, x is the name of a value that never changes. If you call a procedure in Perl or Pascal, you might not pass the procedure any arguments and you may not get any results, but the act of calling the procedure can cause it to write to disk, or to change the values of many other variables. If you look at this call to the procedure, it looks innocuous enough, but it has side effects that you don’t see. These aren’t visible in the call, but there are many effects of calling the procedure, which is why you called it. If there were no effects you wouldn’t call it at all.

In a functional language, if you call a function f, you give it some arguments and it returns a result. All it does is consume the arguments and deliver the result. That’s all it does – it doesn’t create any other side effects. And that makes it really good for parallel evaluation in a parallel machine. Say you call f of x and then you add that result to g of y in a functional language: since f doesn’t have any side effects and g doesn’t have any side effects, you can evaluate the calls in parallel.

But in a conventional mainstream programming language, f might have a side effect on a variable that g reads. f might write a variable behind the scenes that g looks at. It therefore makes a difference whether you call f and then g or g and then f. And you certainly can’t call them at the same time!

It’s actually really simple. If the functions that you call do not have any side effects behind the scenes, if all they do is compute a value from the input values that you give them, then if you have two such things, you can clearly do them both at the same time. And that’s purely functional programming. Mainstream languages are, by default, dangerous for parallel evaluation. And purely functional languages are by default fine at parallel evaluation.
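The point about f and g can be made concrete with a minimal sketch. It assumes GHC and uses par and pseq from GHC.Conc in the base library; the bodies of f and g are hypothetical stand-ins, chosen only because they are pure. The par hint merely allows the runtime to evaluate f x on another core; it never changes the answer, which is exactly what the absence of side effects guarantees.

```haskell
import GHC.Conc (par, pseq)

-- Hypothetical pure functions standing in for the f and g of the interview.
f :: Int -> Int
f x = sum [1 .. x]

g :: Int -> Int
g y = y * y

-- Hint that f x may be evaluated in parallel while g y is being evaluated,
-- then add the two results.
parallelSum :: Int -> Int -> Int
parallelSum x y = a `par` (b `pseq` (a + b))
  where
    a = f x
    b = g y

-- The same sum with no parallelism hints at all.
sequentialSum :: Int -> Int -> Int
sequentialSum x y = f x + g y

main :: IO ()
main = do
  -- Both lines print the same number, whatever order the runtime chooses.
  print (parallelSum 1000 12)
  print (sequentialSum 1000 12)
```

To actually use several cores, the program would be compiled with GHC's -threaded flag and run with +RTS -N; with work this small the overhead would outweigh any gain, which is the granularity problem raised later in the interview.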

Functional, whether lazy or non-lazy, means no side effect. It doesn’t mess about behind the scenes – it doesn’t launch the missiles, it doesn’t write to the disk. So the message of the presentation I mentioned before is that purely functional programming is by default, safe for parallel programming, and mainstream programming is by default dangerous.

Now, that’s not to say that you can’t make mainstream programming safer by being careful, and lots and lots of technology is devoted to doing just that. Either the programmer has to be careful, or is supported by the system in some way, but nevertheless you can move in the direction of allowing yourself to do parallel evaluation.

The direction that you move in is all about gaining more precise control about what side effects can take place. The reason I think functional programming languages have a lot to offer here is that they’re already sitting at the other end of the spectrum. If you have a mainstream programming language and you’re wanting to move in a purely functional direction, perhaps not all the way, you’re going to learn a lot from what happens in the purely functional world.

I think there’s a lot of fruitful opportunities for cross-fertilization. That’s why I think Haskell is well placed for this multi-core stuff, as I think people are increasingly going to look to languages like Haskell and say ‘oh, that’s where we can get some good ideas at least’, whether or not it’s the actual language or concrete syntax that they adopt.

All of that said however – it’s not to say that purely functional programming means parallelism without tears. You can’t just take a random functional program and feed it into a compiler and expect it to run in parallel. In fact it’s jolly difficult! Twenty years ago we thought we could, but it turned out to be very hard to do that for completely different reasons to side effects: rather to do with granularity. It’s very, very easy to get lots of very, very tiny parallel things to do, but then the overheads of doing all of those tiny little things overwhelm the benefits of going parallel.

I don’t want to appear to claim that functional programmers have parallel programming wrapped up – far from it! It’s just that I think there’s a lot to offer and the world will be moving in that direction one way or another.

You obviously had some foresight twenty years ago…

I don’t think it was that we were that clairvoyant – it was simply about doing one thing well…

So would you have done anything different in the development of Haskell if you had the chance?

That’s a hard one. Of course we could have been cleverer, but even in retrospect, I’m not sure that I can see any major thing that I would have done differently.

And what’s the most interesting program you’ve seen written with Haskell?

That’s an interesting question. At first I was going to say GHC, which is the compiler for Haskell. But I think the most interesting one, the one that really made me sit up and take notice, was Conal Elliott’s Functional Reactive Animation, called Fran. He wrote this paper that burst upon the scene [at ICFP 1997].

What it allowed you to do is to describe graphical animations, so things like a bouncing ball. How do you make a ball bounce on the screen? One way to do it is to write a program that goes round a loop and every time it goes around the loop it figures out whether the ball should be one time step further on. It erases the old picture of the ball and draws a new picture. That’s the way most graphics are done one way or another, but it’s certainly hard to get right.

Another way to do it is, instead of repainting the screen, to say here is a value, and that value describes the position of the ball at any time. How can a value do that? Conal said ‘just give me a function, and the value I’ll produce will be a function from time to position. If I give you this function you can apply it at any old time and it will tell you where the ball is. So all this business of repainting the screen can be delegated to another piece of code, that just says I’m ready to repaint now, so let me reapply this function and that will give me a picture and I’ll draw that.’

So from a rather imperative notion of values that evolve over time, it turned it into a purely declarative idea of a value that describes the position of the ball at any time. Based on that simple idea Conal was able to describe lots of beautiful animations and ways of describing dynamics and things moving around and bouncing into one another in a very simple and beautiful way. And I had never thought of that. It expanded my idea of what a value might be.
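The idea of a value that is ‘a function from time to position’ can be sketched in a few lines. This is only an illustration of the idea as described here, not Fran’s actual API: the names Time, Position, Behavior, ball and frames are all hypothetical, and the physics is simplified to a perfectly elastic ball of unit height that bounces every two seconds.

```haskell
import Data.Fixed (mod')

-- Hypothetical names for illustration; these are not taken from Fran.
type Time = Double
type Position = (Double, Double)

-- A behavior is just a function from time to a value.
type Behavior a = Time -> a

-- The whole trajectory of the ball as one value: it drifts right at a
-- constant speed and follows a unit-height parabola between bounces.
ball :: Behavior Position
ball t = (x, y)
  where
    x = 0.5 * t
    s = abs (((t + 1) `mod'` 2) - 1) -- time distance from the nearest apex
    y = 1 - s * s

-- A renderer samples the behavior whenever it is ready to repaint.
frames :: Behavior a -> [Time] -> [a]
frames b = map b

main :: IO ()
main = mapM_ print (frames ball [0, 0.5, 1, 1.5, 2])
```

Nothing in ball mentions a loop, a screen or a previous frame; choosing when to sample is left entirely to whoever calls frames.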

What was surprising about it was that I didn’t expect that that could be done in that way at all, in fact I had never thought about it. Haskell the language had allowed Conal to think sophisticated thoughts and express them as a programmer, and I thought that was pretty cool. This actually happens quite a lot as Haskell is a very high level programming language, so people that think big thoughts can do big things in it.

What do you mean when you call a language ‘lazy’?

Normally when you call a function, even in a call by value or strict functional programming language, you would evaluate the argument, and then you’d call the function. For example, once you called f on 3+4, your first step would be to evaluate 3+4 to make 7, then you’d call f and say you’re passing it 7.

In a lazy language, you don’t evaluate the 3+4 because f might ignore it, in which case all that work computing 3+4 was wasted. In a lazy language, you evaluate expressions only when their value is actually required, not when you call a function – it’s call by need. A lazy person waits until their manager says ‘I really need that report now’, whereas an eager person will have it in their drawer all done, but maybe their manager will never ask for it.

Lazy evaluation is about postponing tasks until you really have to do them. And that’s the thing that distinguishes Haskell from ML, or Scheme for example.
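A small sketch shows call by need in action. The function f and the value expensive are hypothetical; expensive is deliberately defined so that evaluating it would abort the program, which makes it easy to see that it is never evaluated.

```haskell
-- A hypothetical f that ignores its argument entirely.
f :: Int -> Int
f _ = 42

-- Evaluating this value would abort the program with an error.
expensive :: Int
expensive = error "this expression was evaluated"

main :: IO ()
main = do
  -- The argument 3 + 4 is passed unevaluated and never needed.
  print (f (3 + 4))
  -- No error occurs: f never asks for the value of its argument.
  print (f expensive)
  -- Only the three elements that are demanded of the infinite list
  -- are ever computed.
  print (take 3 [1 ..])
```

In a strict language the second call would fail before f even started, because the argument would be evaluated first.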

If you’re in a lazy language, it’s much more difficult to predict the order of evaluation. Will this thing be evaluated at all, and if so, when, is a tricky question to answer. So that makes it much more difficult to do input/output. Basically, in a functional language, you shouldn’t be doing input/output in an expression because input/output is a side effect.

In ML or Scheme, they say, ‘oh well, we’re functional most of the time, but for input/output we’ll be non-functional and we’ll let you do side effects and things that are allegedly functions.’ They’re not really functions however, as they have side effects. So you can call f and you can print something, or launch the missiles. In Haskell, if you call f, you can’t launch the missiles as it’s a function and it doesn’t have any side effects.

In theory, lazy evaluation means that you can’t take the ML or Scheme route of just saying ‘oh well, we’ll just allow you to do input/output side effects’, as you don’t know what order they’ll happen in. You wouldn’t know if you armed the missiles before launching them, or launched them before arming them.
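As a hedged illustration of how this looks in Haskell source: a function of type Int -> Int cannot perform effects, while anything that does has IO in its type, and the order of IO actions is written down explicitly. The names double, arm and launch are hypothetical, and the ‘missiles’ here only append to an in-memory journal.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- A pure function: its type promises that calling it has no side effects.
double :: Int -> Int
double n = n * 2

-- Hypothetical effectful actions. The IO in their types marks them as
-- effects; each one simply records a step in a journal.
arm :: IORef [String] -> IO ()
arm journal = modifyIORef journal (++ ["armed"])

launch :: IORef [String] -> IO ()
launch journal = modifyIORef journal (++ ["launched"])

main :: IO ()
main = do
  journal <- newIORef []
  -- The order of effects is exactly the order written here; it does not
  -- depend on when lazy evaluation happens to need a value.
  arm journal
  launch journal
  steps <- readIORef journal
  print steps
  print (double 21)
```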

Because Haskell is lazy it meant that we were much more consistent about keeping the language pure. You could have a pure, strict, call by value language, but no one has managed to do that because the moment you have a strict call by value language, the temptation to add impurities (side effects) is overwhelming. So “laziness kept us pure” is the slogan!

Do you know of any other pure languages?

Miranda, designed by David Turner, which has a whole bunch of predecessor languages, several designed by David Turner - they’re all pure. Various subsets of Lisp are pure. But none widely used… oh, and Clean is pure(!). But for purely functional programming Haskell must be the brand leader.

Do you think that lazy languages have lots of advantages over non-lazy languages?

I think probably on balance yes, as laziness has lots of advantages. But it has some disadvantages too, so I think the case is a bit more nuanced there [than in the case of purity].

A lazy language has ways of stating ‘use call by value here’, and even if you were to say ‘oh, the language should be call by value strict’ (the opposite of lazy), you’d want ways to achieve laziness anyway. Any successor language [to Haskell] will have support for both strict and lazy functions. So the question then is: what’s the default, and how easy is it to get to these things? How do you mix them together? So it isn’t kind of a completely either/or situation any more. But on balance yes, I’m definitely very happy with using the lazy approach, as that’s what made Haskell what it is and kept it pure.
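One of the ways Haskell lets a programmer say ‘use call by value here’ is the strict application operator ($!), which evaluates the argument before the call. The sketch below is an illustration with hypothetical names (ignore and boom); it uses try and evaluate from Control.Exception only so that the difference between the two calls is observable without crashing the program.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- A hypothetical function that never looks at its argument.
ignore :: Int -> Int
ignore _ = 0

-- Evaluating this value raises an error.
boom :: Int
boom = error "argument was evaluated"

main :: IO ()
main = do
  -- Default lazy call: the argument is never evaluated, so no error.
  print (ignore boom)
  -- Strict call: ($!) forces the argument before calling ignore,
  -- so the error inside boom is raised and caught here.
  result <- try (evaluate (ignore $! boom)) :: IO (Either ErrorCall Int)
  putStrLn (either (const "strict call: argument was forced") show result)
```

The same function behaves lazily or strictly depending on how it is called, which is the sense in which the choice becomes a question of defaults rather than either/or.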

You sound very proud of Haskell’s purity.

That’s the thing. That’s what makes Haskell different. That’s what it’s about.

Do you think Haskell has been successful in creating a standard for functional programming languages?

Yes, again not standard as in the ISO standard sense, but standard as a kind of benchmark or brand leader for pure functional languages. It’s definitely been successful in that. If someone asks, ‘tell me the name of a pure functional programming language’, you’d say Haskell. You could say Clean as well, but Clean is less widely used.

How do you respond to criticism of the language, such as this statement from Wikipedia: “While Haskell has many advanced features not found in many other programming languages, some of these features have been criticized for making the language too complex or difficult to understand. In addition, there are complaints stemming from the purity of Haskell and its theoretical roots.”

Partly it’s a matter of taste. Things that one person may find difficult to understand, another might not. But also it’s to do with doing one thing well again. Haskell does take kind of an extreme approach: the language is pure, and it has a pretty sophisticated type system too. We’ve used Haskell in effect as a laboratory for exploring advanced type system ideas. And that can make things complicated.

I think a good point is that Haskell is a laboratory: it’s a lab to explore ideas in. We intended it to be usable for real applications, and I think that it definitely is, and increasingly so. But it wasn’t intended as a product for a company that wanted to sell a language to absolutely as many programmers as possible, in which you might take more care to make the syntax look like C, and you might think again about introducing complex features as you don’t want to be confusing.

Haskell was definitely designed with programmers in mind, but it wasn’t designed for your average C++ programmer. It’s to do not with smartness but with familiarity; there’s a big mental rewiring process that happens when you switch from C++ or Perl to Haskell. And that comes just from being a purely functional language, not because it’s particularly complex. Any purely functional language requires you to make that switch.

If you’re to be a purely functional programming language, you have to put up with that pain. Whether it’s going to be the way of the future and everybody will do it – I don’t know. But I think it’s worth some of us exploring that. I feel quite unapologetic about saying that’s what Haskell is – if you don’t want to learn purely functional programming or it doesn’t feel comfortable to you or you don’t want to go through the pain of learning it, well, that’s a choice you can make. But it’s worth being clear about what you’re doing and trying to do it in a very clear and consistent and continuous way.

Haskell, at least with GHC, has become very complicated. The language has evolved to become increasingly complicated as people suggest features, and we add them, and they have to interact with other features. At some point, maybe it will become just too complicated for any mortal to keep their head around, and then perhaps it’s time for a new language – that’s the way that languages evolve.

Do you think that any language has hit that point yet, whether Haskell, C++ etc?

I don’t know. C++ is also extremely complicated. But long lived languages that are extremely complicated also often have big bases of people who like them and are familiar with them and have lots of code written in them.

C++ isn’t going to die any time soon. I don’t think Haskell’s going to die any time soon either, so I think there’s a difficult job in balancing the complexity and saying ‘well, we’re not going to do any more, I declare that done now, because we don’t want it to get any more complicated’. People with a big existing investment in it then ask ‘oh, can you just do this’, and the “just do this” is partly to be useful to them, and also because that’s the way I do research.

I’m sitting in a lab and people are saying ‘why don’t you do that?’, and I say ‘oh, that would be interesting to try so we find out.’ But by the time we’ve logged all changes in it’s very complicated, so I think there’s definite truth in that Wikipedia criticism.

And on a side note, what attracted you to Microsoft research? How has the move affected your Haskell work?

I’ve been working in universities for about 17 years, and then I moved to Microsoft. I enjoyed working at universities a lot, but Microsoft was an opportunity to do something different. I think it’s a good idea to have a change in your life every now and again. It was clearly going to be a change of content, but I enjoyed that change.

Microsoft has a very open attitude to research, and that’s one of those things I got very clear before we moved. They hire good people and pretty much turn them loose. I don’t get told what to do, so as far as my work on Haskell or GHC or research generally is concerned, the main change with moving to Microsoft was that I could do more of it, as I wasn’t teaching or going to meetings etc. And of course all of those things were losses in a way and the teaching had its own rewards.

Do you miss the teaching?

Well I don’t wake up on Monday morning and wish I was giving a lecture! So I guess [I miss it] in a theoretical way and not in a proximate kind of way. I still get to supervise graduate students.

Microsoft have stuck true to their word. I also get new opportunities [that were not available to me at university], as I can speak to developers inside the [Microsoft] firewall about functional programming in general, and Haskell in particular, which I never could before. Microsoft are completely open about allowing me to study what I like and publish what I like, so it’s a very good research setup – it’s the only research lab I know like that. It’s fantastic – it’s like being on sabbatical, only all the time.

Do you ever think the joke about Microsoft using Haskell as its standard language had come true? Haskell.NET?

Well, there are two answers to this one – the first would be of course, yes, that would be fantastic! I really think that functional programming has such a lot to offer the world.

As for the second, I don’t know if you know this, but Haskell has a sort of unofficial slogan: avoid success at all costs. I think I mentioned this at a talk I gave about Haskell a few years back and it’s become sort of a little saying. When you become too well known, or too widely used and too successful (and certainly being adopted by Microsoft means such a thing), suddenly you can’t change anything anymore. You get caught and spend ages talking about things that have nothing to do with the research side of things.

I’m primarily a programming language researcher, so the fact that Haskell has up to now been used just by university types has been ideal. Now it’s used a lot in industry but typically by people who are generally flexible, and they are generally a self-selected, rather bright group. What that means is that we could change the language and they wouldn’t complain. Now, however, they’re starting to complain if their libraries don’t work, which means that we’re beginning to get caught in the trap of being too successful.

What I’m really trying to say is that the fact Haskell hasn’t become a real mainstream programming language, used by millions of developers, has allowed us to become much more nimble, and from a research point of view, that’s great. We have lots of users so we get lots of experience from them. What you want is to have a lot of users but not too many from a research point of view – hence the avoid success at all costs.

At the other extreme, though, it would be fantastic to be really, really successful and adopted by Microsoft. In fact you may know my colleague down the corridor, Don Syme, who designed a language: F#. F# is somewhere between Haskell and C# – it’s a Microsoft language, it’s clearly functional but it’s not pure and its defining goal is to be a .NET language. It therefore takes on lots of benefits and also design choices that cannot be changed from .NET. I think that’s a fantastic design point to be in and I’m absolutely delirious that Don’s done that, and that he’s been successfully turning it into a product – in some ways because it takes the heat off me, as now there is a functional language that is a Microsoft product!

So I’m free to research and do the moderate success thing. When you talk to Don [in a forthcoming interview in the A-Z of Programming Languages series], I think you will hear him say that he’s got a lot of inspiration from Haskell. Some ideas have come from Haskell into F#, and ideas can migrate much more easily than concrete syntax and implementation and nitty gritty design choices.

Haskell is used a lot for educational purposes. Are you happy with this, being a former lecturer, and why do you think this is?

Functional programming teaches you a different perspective on the whole enterprise of writing programs. I want every undergraduate to learn to write functional programs. Now if you’re going to do that, you have to choose if you are going to teach Haskell or ML or Clean. My preference would be Haskell, but I think the most important thing is that you should teach purely functional programming in some shape or form as it makes you a better imperative programmer. Even if you’re going to go back to C++, you’ll write better C++ if you become familiar with functional programming.

Have you personally taught Haskell to many students?

No, I haven’t actually! While I was at Glasgow I was exclusively engaged in first year teaching of Ada, because that was at the time the first year language that Glasgow taught, and Glasgow took the attitude that each senior professor should teach first year students, as they’re the ones that need to be turned on and treated best. That’s the moment when you have the best chance of influencing them – are they even going to take a second year course?

Did you enjoy teaching Ada?

Yes, it was a lot of fun. It’s all computer science and talking to 200 undergraduates about why computing is such fun is always exciting.

You’ve already touched on why you think all programmers should learn to write functional programs. Do you think functional programming should be taught at some stage in a programmer’s career, or it should be the first thing they learn?

I don’t know – I don’t actually have a very strong opinion on that. I think there are a lot of related factors, such as what the students will put up with! I think student motivation is very important, so teaching students a language they have heard of as their first language has a powerful motivational factor.

On the other hand, since students come with such diverse experiences (some of them have done heaps of programming and some of them have done none) teaching them a language which none of them is familiar with can be a great leveler. So if I was in a university now I’d be arguing the case for teaching functional programming as a first year language, but I don’t think it’s a sort of unequivocal, “only an idiot would think anything else” kind of thing!

Some say dealing with standard IO in Haskell doesn’t feel as ‘functional’ as some would expect. What’s your opinion?

Well it’s not functional – IO is a side effect as we discussed. IO ensures the launching of the missiles: do it now and do it in this order. IO means that it needs to be done in a particular order, so you say do this and then do that and you are mutating the state of the world. It clearly is a side effect to launch missiles so there’s no two ways about it.

If you have a purely functional program, then in principle, all it can do is take a value and deliver a value as its result. When Haskell was first born, all it would do is consume a character string and produce a character string. Then we thought, ‘oh, that’s not very cool, how can we launch missiles with that?’ Then we thought, ‘ah, maybe instead of a string, we could produce a list of commands that tell the outside world to launch the missiles and write to the disk.’ So that could be the result value. We still produced a value – that was the list of commands, but somebody else was doing the side effects as it were, so we were still holy and pure!
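[Editor's illustration] The "list of commands" idea described above can be sketched in a few lines of modern Haskell. This is only a minimal sketch, not the interface early Haskell actually used: the names `Command`, `program`, `perform` and `runProgram` are hypothetical, and the interpreter merely prints what it would do.

```haskell
-- Hypothetical command vocabulary, invented for this illustration.
data Command
  = LaunchMissiles
  | WriteFile FilePath String
  deriving (Eq, Show)

-- The pure program: it takes a value and delivers a value.
-- The result is a list of commands, not an action that has been carried out.
program :: String -> [Command]
program input =
  [ WriteFile "log.txt" ("received: " ++ input)
  , LaunchMissiles
  ]

-- The "somebody else" that performs the side effects, one command at a time.
-- Here the effects are only simulated by printing a description.
perform :: Command -> IO ()
perform LaunchMissiles = putStrLn "launching missiles (simulated)"
perform (WriteFile path contents) =
  putStrLn ("would write " ++ show contents ++ " to " ++ path)

-- Run the pure program, then hand its commands to the interpreter in order.
runProgram :: String -> IO ()
runProgram = mapM_ perform . program
```

Loading this in GHCi and evaluating `program "go"` shows that the program itself is just a function returning data; only `runProgram` touches the outside world.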

Then the next challenge was to produce a value that said read a file, and to get the contents of the file into the program. We wrote a way of doing that, but it always felt a bit unsatisfactory to me, and that pushed us to come up with the idea of monads. Monads provided the way we embody IO into Haskell; it’s a very general idea that allows you to have a functional program that still includes side effects. I’ve been saying that purely functional programming means no effects, but programming with monads allows you to mix bits of program that do effects and bits that are pure without getting the two mixed up. So it allows you to not be one or the other.

But then, to answer your question, IO using monads still doesn’t look like purely functional programming, and it shouldn’t because it isn’t. It’s monadic programming, which is kept nicely separate and integrates beautifully with the functional part. So I suppose it’s correct to say that it doesn’t feel functional because it isn’t, and shouldn’t be.
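[Editor's illustration] A small sketch of how the separation between pure and monadic code shows up in the types. The function names `shout` and `greet` are made up for this example; `toUpper`, `putStrLn` and `return` are standard library functions.

```haskell
import Data.Char (toUpper)

-- Pure: there is no IO in the type, so this function cannot perform effects.
shout :: String -> String
shout s = map toUpper s ++ "!"

-- Monadic: the IO in the type says that running this action affects the
-- world, and the do-block fixes the order in which the effects happen.
greet :: String -> IO String
greet name = do
  let message = shout ("hello " ++ name)  -- pure code used inside the action
  putStrLn message                        -- first effect
  putStrLn "done"                         -- second effect, always after the first
  return message
```

The pure part can be called from the monadic part, but not the other way round: `shout` has no way to run `greet`, which is one way to read "kept nicely separate".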

What Haskell has given to the world, besides a laboratory to explore ideas in, is this monadic idea. We were stuck not being able to do IO well for quite a while. F# essentially has monads, even though it’s an impure language, and so could do side effects. Nevertheless Don has imported into F# something he calls workflows, which are just a flimsy friendly name for monads. This is because even though F# is impure, monads are an idea that’s useful in their own right. Necessity was the mother of invention.

So monads would be Haskell’s lasting legacy in your eyes?

Yes, monads are a big deal. Showing that you can make something really practically useful for large scale applications out of a simple, consistent idea, namely purely functional programming, is I think a big thing that Haskell’s done – sticking to our guns on that is the thing we’re happiest about really.

One of the joys of being a researcher rather than somebody who’s got to sell a product is that you can stick to your guns, and Microsoft have allowed me to do that.

What do you think Haskell’s future will look like?

I don’t know. My guess is that the language, as it is used by a lot of people, will continue to evolve in a gentle sort of way.

The main thing I’m working on right now is parallelism, multicores in particular, and I’m working with some colleagues in Australia at the University of NSW. I’m working very closely with them on something called nested data parallelism.

We’ve got various forms of parallelism in various forms of Haskell already, but I’m not content with any of them. I think that nested data parallelism is my best hope for being able to really employ tens or hundreds of processors rather than a handful. And nested data parallelism relies absolutely on being within a functional programming language. You simply couldn’t do it in an imperative language.

And how far along that track are you? Are you having some success?

Yes, we are having some success. It’s complicated to do and there’s real research risk about it – we might not even succeed. But if you’re sure you’re going to succeed it’s probably not research! We’ve been working on this for a couple of years. We should have prototypes that other people can work on within a few months, but it will be another couple of years before we know if it really works well or not.

I am quite hopeful about it – it’s a pretty radical piece of compiler technology. It allows you to write programs in a way that’s much easier for a programmer to write than conventional parallel programming. The compiler shakes the program about a great deal and produces a program that’s easy for the computer to run. So it transforms from a program that’s easy to write into a program that’s easy to run. That’s the way to think of it. The transformation is pretty radical – there’s a lot to do and if you don’t do it right, the program will work but it will run much more slowly than it should, and the whole point is to go fast. I think it’s [purely-functional programming] the only current chance to do this radical program transformation.
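[Editor's illustration] The interview does not say how the transformation works. As a rough intuition only, the sketch below assumes it resembles the flattening technique commonly described for nested data parallelism: a nested structure is turned into one flat sequence of values plus a description of where each segment starts and ends. It uses ordinary sequential lists, and `Segmented`, `flatten`, `sumRowsNested` and `sumRowsFlat` are hypothetical names, not part of GHC.

```haskell
-- Easy to write: a nested structure, one inner list per row.
sumRowsNested :: [[Double]] -> [Double]
sumRowsNested = map sum

-- Easier to run (in spirit): all values in one flat sequence,
-- plus the length of each original row.
data Segmented = Segmented
  { segLengths :: [Int]
  , segValues  :: [Double]
  } deriving (Eq, Show)

-- The transformation from the nested form to the flat form.
flatten :: [[Double]] -> Segmented
flatten rows = Segmented (map length rows) (concat rows)

-- The same computation expressed over the flat form: walk the segment
-- lengths and sum the matching slice of the flat values.
sumRowsFlat :: Segmented -> [Double]
sumRowsFlat (Segmented lens vals) = go lens vals
  where
    go [] _ = []
    go (n : ns) xs =
      let (seg, rest) = splitAt n xs
      in sum seg : go ns rest
```

Both versions give the same answer for any input; the point of the flat form is that irregular rows no longer matter when the work is divided evenly across processors.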

In the longer term, if you ask where Haskell is going, I think it will be in ideas, and ultimately in informing other language design. I don’t know if Haskell will ever become mainstream like Java, C++ or C# are. I would be perfectly content if even the ideas in Haskell became mainstream. I think this is a more realistic goal – there are so many factors involved in widespread language adoption – ideas are ultimately more durable than implementations.

So what are you proudest of in terms of the language’s development and use?

I think you can probably guess by now! Sticking to purity, the invention of monads and type classes. We haven’t even talked about type classes yet. I think Haskell’s type system, which started with an innovation called type classes, has proved extremely influential and successful. It’s one distinctive achievement that was in Haskell since the very beginning. But even since then, Haskell has proved to be an excellent type system laboratory. Haskell has lots of type system features that no other language has. I’m still working on further development of this, and I’m pretty proud about that.

And where do you think computer languages will be heading in the next 5 – 20 years or so? Can you see any big trends etc?

It’s back to effects. I don’t know where programming in general will go, but I think that over the next 10 years, at that sort of timescale, we’ll see mainstream programming becoming much more careful about effects – or side effects. That’s my sole language trend that I’ll forecast. And of course, even that’s a guess, I’m crystal ball gazing.

Specifically, I think languages will grow pure or pure-ish subsets. There will be chunks of the language, even in the main imperative languages, that will be chunks that are pure.

Given all of your experience, what advice do you have for students or up and coming programmers?

Learn a wide range of programming languages, and in particular learn a functional language. Make sure that your education includes not just reading a book, but actually writing some functional programs, as it changes the way you think about the whole enterprise of programming. It’s like if you can ski but you’ve never snowboarded: you hop on a snowboard and you fall off immediately. You initially think humans can’t do this, but once you learn to snowboard it’s a different way of doing the same thing. It’s the same with programming languages, and that radically shifted perspective will make you a better programmer, no matter what style of programming you spend most of your time doing. It’s no good just reading a book, you’ve got to write a purely functional program. It’s no good reading a book about snowboarding – you have to do it and fall off a lot before you train your body to understand what’s going on.

Thanks for taking the time to chat to me today. Is there anything else you’d like to add?

I’ll add one other thing. Another distinctive feature of Haskell is that it has a very nice community. We haven’t talked about the community at all, but Haskell has an extremely friendly sort of ecosystem growing up around it. There’s a mailing list that people are extremely helpful on, it has a wiki that is maintained by the community and it has an IRC channel that hundreds of people are on being helpful. People often comment that it seems to be an unusually friendly place, compared to experiences they’ve had elsewhere (and I can’t be specific about this as I genuinely don’t know.) I don’t know how to attribute this, but I’m very pleased that the Haskell community has this reputation as being a friendly and welcoming place that’s helpful too. It’s an unusually healthy community and I really like that.