chaource (chaource) wrote,

Why functional programming is almost dead (and has always been)

Is LISP dead? -- Well, it doesn't look any deader than usual to me. — David Thornley, reply to a question older than most programming languages

Programming is broadly divided into imperative and functional styles. In the imperative style we have variables that represent locations in memory. Variables have values, and we can change those values. So most of the time we manipulate the state of objects stored in memory, make new objects etc. Eventually the state of some objects stored in memory becomes what we want to compute, and that's the result. The oldest (and still alive) imperative language is Fortran.

In the functional style, variables (as a rule) are immutable; they can be assigned once and that's it. So variables are not locations in memory but rather labels set for convenience on some constant expressions. (In some functional languages, there are mutable variables too; but these languages are called "impure" functional languages.) In a functional language, most of the time we manipulate functions and values, rather than states of objects. We make new functions from old ones, or recursively transform one function call into another until we get the right function call at the end. The only place where something changes is the arguments of functions and the functions themselves. The result of a computation is a single object (which is, perhaps, a function rather than a number). The oldest (and still alive) functional language is Lisp.

There has been a lot of debate about which programming style is "better". However I haven't ever seen any company advertising a job that needs proficiency in Lisp or in any other functional language (Scheme, OCAML, Haskell, ...). Instead companies want skills in Java, C++, and other imperative languages. Why is that?


An example program to compute the average of a list of N numbers:

Imperative style (C):

float average(int N, float* given_list)
{
  int sigma = 0, i = 0;
  for (; i < N; i++)
  {
    sigma += given_list[i]; 
  } 
  return (sigma/N);
}


This program will initialize the counter i and the result accumulator sigma, then update sigma as it goes through the first N elements of the given array. Note that there is no easy way to determine the length of the array given the array pointer; so the length N is passed explicitly and the program has no way to determine whether the value N is correct. The program crashes if N is out of bounds.

Pure functional style (OCAML):

let average (N:int) (given_list:float list) = float_sum(list, 0) /. float_of_int(N)

let rec float_sum(given_list:float list, result:float) = match given_list with
 | [] -> result 
 | first :: tail -> float_sum(tail, first +. result)


This program computes the sum of all the elements in the given list and divides by N. The auxiliary function float_sum(l, result) returns (result + sum of the list l). The argument labeled result here plays the role of the accumulator sigma in the above C code. It is initialized in the first (and only) call of float_sum. Care has been taken to write the function float_sum in a tail-recursive style, so that the compiler can optimize this code. A non-tail-recursive style would be

let rec float_sum(given_list) = match given_list with
  | [] -> 0
  | first :: tail -> first + float_sum(tail)


The program does not check that given_list actually has length N. It is possible to make the program do so, at the expense of further complicating the auxiliary function float_sum. But my head already hurts trying to write this much in a pure functional style with correct tail recursion, using the insane syntax of OCAML.

Proponents of Fortran say that Fortran is easier to use, and that you can't get good performance, good parallel code on supercomputers, fast matrix multiplication, etc., without Fortran. Other imperative languages compete in the compiler features, portability, richness of libraries, syntax conventions, facilities for rapid GUI prototyping, and development environments. Proponents of Lisp and other functional languages say that functional languages are much more powerful than any imperative language, that functional languages have solid mathematical foundation, and that good optimizing compilers can give you the same performance as imperative languages. There is an outstanding example of the functional language SISAL that was designed specifically for fast numerics and for producing automatically parallelizable code for arbitrary supercomputers. SISAL outperformed Fortran (by factor 6 or so) on a Cray parallel computer. Nevertheless, SISAL never got any use outside the lab (LLNL), and in a few years the LLNL project group on SISAL was cancelled. Today SISAL is so dead that I couldn't even find a working link for the language manual (despite the presence of a sisal.sf.net project). A language that seems to be more alive is OCAML. That language also has provisions for optimized numerics and gets occasionally used, mostly by researchers. The degree of aliveness can be judged by the number of books written about the language. There are a total of three books on OCAML in existence.

It seems that functional programming is somehow so unattractive that no amount of technical merit can make people actually start using Lisp or any other functional language. Entire projects and corporations can be persuaded to switch from Fortran to C++, from C++ to Java, and from Java to .NET, but not to a functional programming language.

This is puzzling. I would like to understand this phenomenon. My short answer so far: Functional style is inherently difficult and goes against people's intuition about the real world and about calculations.

Programming language needs to be designed for people's understanding and not only for specific tasks and/or computer architectures. A very powerful language such as OCAML allows you to write very short code (they call it "expressive" code) but requires a lot of thought and concentration to write that code so that it works at all. (When it works, it usually works correctly.) There are perhaps too few programmers who are able to use OCAML (or Lisp) effectively, compared with a large number of programmers who have studied Pascal or Basic at school and subsequently C++ or Java at college. Functional languages are perhaps too complicated for the kind of workforce we have.

Why is C++ or Java easier? I think it's the imperative style that is easier. Yes, you can pass a function as an object in C++ or Java when it suits you, but most of the time you program in the imperative style because the language invites you to do so and makes purely functional style much more difficult.

It is easier to learn programming in the imperative style a direct manipulation of state accords with people's intuition. It's easy to imagine a variable as a box with a number in it (or perhaps with an object in it). It's easy to build a complete and adequate mental model for what is going to happen with the box when the computer runs your program; you just go through your program statement by statement and update the contents of the boxes.

The fact that a Fortran program becomes unmanageable for very large projects (regardless of the degree of object-orientedness, structured-ness, use of libraries, etc.) may be apparent but will not deter Fortran use until it's too late. This is similar to the fact that people will eat cakes despite knowing that it will possibly ruin their health.

Why is imperative style unmanageable? Unmanageable means that the functioning of the program as a whole becomes difficult or impossible for humans to understand in all detail. Understanding is roughly equivalent to being able to produce an (informal) proof of correctness for the program. Computer scientists have stidied formal languages and found that it is very difficult to prove correctness of programs in general, but this task becomes much easier for purely functional programs. This is so because manipulation of values in memory cannot be converted into a concise algebraic description of the logic of the program, but evaluation of constant expressions is equivalent to some reasonably straightforward algebraic constructions (lambda-calculus). When proving correctness of a program, it is helpful to construct invariants, that is, conditions that remain valid in a given place in the program. In the imperative style, a given place in the program can be visited many times (in a loop, or in different function calls), and then invariants are frequently broken because many memory locations have changed. This is the basic reason why it is so difficult to make sure a complicated program works correctly if one writes in the imperative style.

In the pure functional style, any expression is constant and so can be computed once and for all; it does not matter in which order the expressions are computed. So an invariant at a given place in the program is only a single condition to be checked, namely that a certain expression (made of constants) has a desired value. Also, optimization is much easier if everything is a constant. The compiler can then decide when to copy a object to a new memory location and when to keep the initial object and simply pass a pointer. The programmer never has to think about pointers, memory allocation, or out-of-bounds errors. In principle, the correctness of programs can be verified automatically (at least in some simpler languages).

However, there seems to be a hefty price to pay. The "constants" are not just numbers; they are sometimes quite complicated objects, e.g. functions that take a number of other functions and return objects containing other functions of the same kind. The programmer has to build an adequate and complete mental picture for a rather intricate computational environment, say, of OCAML. The computer is not anymore a simple collection of memory cells with numbers. OCAML (and most other functional languages) has many kinds of complicated objects not reducible to numbers; for example, polymorphic functors - functions that construct new functions out of old functions using complicated type-dependent rules. The programmer cannot simply say that "everything is an expression": the human brain does not work like that. To reason effectively about a program, the programmer will have to imagine each kind of OCAML object as a separate beast with different properties. The programmer will not be productive without knowing exactly how the language compiler will handle each particular (quite complicated) language construction (parameterized function types, partially parameterized function types, recursive functions, nonrecursive functions, lazy-evaluated functions, partially-matched function definitions, etc.).

For instance, I know that tail recursion is desirable even though it's sometimes more difficult to achieve (as in the float_sum example above) than non-tail recursion. My next thought is that I would like to compute the length of the list without traversing the list twice, because I know that it's not optimal and should be avoided (but only if the particular language implementation does not provide a O(1) function to get the length of the list!). I need to think about how to accomplish this new task, again with only tail-recursive calls. Well, OCAML supports also the imperative style - why not use it just for this? Maybe there is a reason to use the pure functional style; maybe the code will be parallelizable in that case; but I don't know that. The language does not tell me about that, but I need to know that to be productive.

How many other such tricks will I need, and how many other tricky situations do I have to learn about in order to produce efficient programs in OCAML? (I can just feel how I am losing interest for OCAML rather quickly...) In an imperative language such as Fortran or C++, the connection between the language and the machine is tight, and I know how to choose an efficient implementation of an algorithm, both in terms of memory and in terms of speed. In a functional language I can only guess which of the many possibilities is efficient. If the imperative-style function is as efficient as pure functional style, I will use the imperative style because it's easier. Maybe it would have been just as easy for me, had I studied Lisp at school instead of Pascal. But I doubt that learning about a very powerful and complex language such as Lisp is easier than learning about an easy language such as Pascal.

In fact, I learned about Lisp and Prolog at about the same time as I was going through PL/I and Pascal tortures at school. I was not impressed by any of these languages, functional or not. They all appeared equally artificial and mysterious because I didn't get an adequate mental model of their implementation. For me, a programmable calculator was fully understandable and hence useful; other programming languages were just for show. But maybe it's been different for other people.

So it seems that functional programming has largely failed to attract significant interest because it's hard. It's hard in the same way as abstract mathematics is hard, because functional programming emphasizes the same things as abstract mathematics: immutable, permanent functional relationships between objects, rather than the numerical data contained in the objects. Mathematics not only emphasizes deep logical connections between things, connections that are often counterintuitive and not at all apparent to superficial perception. Mathematics actually requires one to think only in terms of those abstract and counterintuitive connections. Mathematics requires you to develop a very specific mental capacity for abstract relationships and a specific, dynamically trainable intuition for those counterintuitive things. This is a formal kind of intuition that trusts a formalism more than it trusts a gut feeling, and such that the gut feeling is prepared to change when the formalism dictates so. Only very few people are mathematicians. Similarly, only very few people are comfortable with functional programming style and use it. (It seems that the people especially productive with a functional language are precisely the people who have implemented it and hence have a more or less complete mental picture of its inner workings.) Sure enough, those people are productive and get great results. But you can't have millions of college graduates with that kind of skills. Nobody knows how to teach those skills. MIT requires CoSci students to learn Scheme in its 6.001 course, but this does not seem to produce a massive number of students who are productive using functional languages.

That's why software companies list Java and C++ as requirements but not Lisp or OCAML. If there were some way of producing millions of abstract thinkers who can learn to be productive in a purely functional style with a minimum of imperative style, surely there would have been a job market for those people. Functional languages are perhaps more powerful than imperative-style languages, but functional languages are not better languages because functional languages are not suitable for ordinary people to learn and use.
Tags: computer, essay
  • Post a new comment

    Error

    Anonymous comments are disabled in this journal

    default userpic

    Your reply will be screened

  • 33 comments