May 28, 2009

Functional Programming is Not All or Nothing

I’ve noticed a flurry of discussion around the blogosphere lately about what is and is not a functional programming language and what is or is not stateful. After being interested in functional programming (FP) for many years, I’m glad that it is finally getting some mainstream attention. I think FP has some valuable lessons for all programmers, and that it provides a better default mindset for programming than traditional imperative programming.

However, a lot of the discussion seems to make the mistake of assuming that functional programming and statefulness are all-or-nothing properties, that either you have a total purity of statelessness or you have opened Pandora’s Box and let all the evils of state into your program or programming language. But the truth is that functional programming is best viewed as a style of programming with different default assumptions about state. And statefulness is best viewed as a continuum that varies along at least two dimensions — locality and monotonicity — both of which I will explain shortly.

What is state?


Before I do that, I had best describe in simple terms what state is. The easiest way to understand state is to imagine that I have given you a distinctive-looking box. If each time you open the box you find exactly the same contents, then the box can be said to be stateless. (This is also known as referential transparency). If each time you open the box the contents vary — the box could be empty sometimes, but full later, it could have different items in it or the same items with some new ones added — then it is stateful.

As you can see from this description, most boxes that we use in the real world are stateful, since it is very unusual for their contents to remain the same throughout their lifetime. However, if you are doing something complicated, say keeping financial records from past years in banker’s boxes, you might impose a discipline on your boxes, such as designating a particular box to contain only those financial records from April 2002, so that you can reliably find the records for a given month when you are looking for them.

Two styles of programming


Programming is like this as well. “Normal” imperative programming uses variables that can have changing values throughout the life of the program, which matches our understanding of boxes in the real world. FP takes the banker’s view of boxes. Since programs are very complex and you want to be assured of finding what you expect, the default assumption is that the contents of boxes will stay the same throughout the life of the program.

So the key difference between a functional and an imperative style of programming is not whether there is any state at all in the program but where you start from. With imperative programming you start by assuming total changeability of all values you work with and have to add disciplines to get that state under control, whereas with FP you start from a place of statelessness and strategically loosen the reins in small amounts when you need to. The great strength of the FP approach is that it is much easier to maintain control by letting the genie out of the bottle slowly than by starting with the genie out of the bottle and trying to jam it back in.
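
To make the contrast concrete, here is a small Python sketch of the two defaults (the function names and the payroll example are my own invention, purely for illustration). The imperative version mutates the caller's list in place; the functional version leaves its input alone and hands back a new value.

    # Imperative default: the "box" (the list) changes under you.
    def add_bonus_imperative(salaries, bonus):
        for i in range(len(salaries)):
            salaries[i] += bonus            # mutates the caller's list
        return salaries

    # Functional default: the original box keeps its contents;
    # any change shows up as a brand-new value instead.
    def add_bonus_functional(salaries, bonus):
        return [s + bonus for s in salaries]

    team = [50000, 60000]
    add_bonus_imperative(team, 1000)           # team is now [51000, 61000]
    raised = add_bonus_functional(team, 1000)  # team is untouched; raised is new

Both do the same job; the difference is where you start from, and how much discipline you need afterwards to keep track of who else might be looking at the same box.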

Locality of State


The first dimension of state I want to discuss, locality, should be the easiest for programmers to understand but, for some reason, often causes confusion. Locality for state is just like scope for names in that it ranges from totally private to global, depending on how widely in your program it can be seen. A well-known technique for making state more local is encapsulation, when it is used to limit the visibility of a particular bit of state to a small part of the program. For encapsulation to accomplish this, though, it must actually hide the effects of the state from the parts of the program outside its scope.

To explain this, it is helpful to consider a common objection to functional programming languages by people new to the subject: how can a stateless program be built on a computer, which is an inherently stateful machine? The answer is that so long as the state is completely localized “under the hood”, it isn’t state at the next level up.

If we go back to the box analogy, let’s say that you have a very high-tech box that has the same contents every time you open it, but that, instead of the contents just being there when the box is closed, it actually disassembles the item when you close it and re-assembles it when you open the box again. The box is stateless because its contents never change from your point of view, even though its contents change when you can’t see inside.

So you can reduce the state in your program by ensuring that its various components look to each other as if they are stateless, even if internally they are implemented with large amounts of state, say for efficiency.
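
As a rough Python sketch of that idea (my own example, not from any particular library): a caching wrapper keeps a mutable dictionary inside it, but every caller gets the same answer for the same argument, so from the outside the function behaves as if it were stateless.

    def make_cached(f):
        cache = {}                     # local, mutable state, invisible from outside
        def cached(x):
            if x not in cache:
                cache[x] = f(x)        # the only place the hidden state ever changes
            return cache[x]
        return cached

    @make_cached
    def slow_square(n):
        # stand-in for an expensive computation
        return n * n

    slow_square(4)    # computes and stores 16
    slow_square(4)    # same answer, this time straight from the hidden cache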

Monotonicity of State


Monotonicity of state is a little more subtle and probably less consciously familiar to programmers. Monotonic is a fancy way to say that once you add something, you can’t take it away. Using our boxes, the simplest example of monotonic state would be a box that might be either empty or full the first time you open it, but, once you have found something there, must contain that same something every time you look thereafter. Upon finding the box empty, you could stop what you are doing and wait for something to show up, after which you could be sure it was not going to change, or you could take the opportunity to fill the box yourself, confident that no one else would be able to change it later. This is how dataflow variables work.
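
Here is a rough sketch of such a write-once box in Python, using threading primitives (the class name and interface are mine, not a standard library feature, and I have ignored the race between two simultaneous binds for brevity).

    import threading

    class DataflowVar:
        """A box that starts empty and can be filled exactly once."""
        def __init__(self):
            self._filled = threading.Event()
            self._value = None

        def bind(self, value):
            if self._filled.is_set():
                raise ValueError("already bound")   # no taking back, no replacing
            self._value = value
            self._filled.set()

        def read(self):
            self._filled.wait()        # block until someone fills the box
            return self._value         # from then on the answer never changes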

A more complex example of monotonic state would be a stream. Once you have asked for the first value from the stream, it is as if you have filled the first box in the sequence, even if the box for the next item in the stream might still be empty.
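
One way to picture such a stream, again as an illustrative Python sketch: values are produced lazily, and each one, once produced, is remembered and never changes, while the boxes further along stay empty until you ask for them.

    import itertools

    class Stream:
        """A sequence of boxes that fill in monotonically as you read further."""
        def __init__(self, source):
            self._source = source
            self._seen = []            # boxes already filled; they never change

        def get(self, i):
            while len(self._seen) <= i:
                self._seen.append(next(self._source))   # fill the next empty box
            return self._seen[i]

    naturals = Stream(itertools.count())
    naturals.get(0)    # fills box 0 with 0
    naturals.get(2)    # fills boxes 1 and 2; box 0 still holds 0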

With non-monotonic state, items can be added and taken away from the box at will, so that the box might have one thing in it the first time you open it, nothing in it the second time, and something totally new the third time. This is the kind of state that we normally think about when we talk about default imperative statefulness.

You can see that, though monotonic state is still state, it is much more predictable and manageable than full non-monotonic state, since you can always tell how much has changed from the last time you checked and you can be assured that what you found before will still be there. Furthermore, there is an explicit order to the changes in state that allows you to understand the history of operations, and thereby to make better automated decisions about what to do.

This is why message-passing concurrency of the kind found in Erlang is less stateful, and relatively safer and more scalable, than full shared-state concurrency: an asynchronous message queue can be thought of as a monotonically stateful stream.
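
To suggest why a mailbox behaves that way, here is a small Python sketch (the standard queue module standing in, very loosely, for an Erlang mailbox): the receiver's view of the world only ever grows, and messages it has already received never change behind its back.

    import queue
    import threading

    mailbox = queue.Queue()            # stands in for an actor's mailbox
    received = []                      # grows monotonically; old entries never change

    def receiver():
        for _ in range(3):
            msg = mailbox.get()        # block until the next box is filled
            received.append(msg)       # once received, the message is fixed

    worker = threading.Thread(target=receiver)
    worker.start()
    for msg in ["hello", "world", "done"]:
        mailbox.put(msg)
    worker.join()
    # received is now ["hello", "world", "done"], in arrival order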

Conclusion


While it would be much easier to talk about functional programming if statefulness were a nice black-and-white property that could be assigned to a particular language or program, I hope it is clear now that things are not quite so simple, and that degrees of state and “functionalness” can be present in any language or program. Obviously some languages and programs make reduced state the default, and thus easier to achieve, but the functional perspective can be applied to any of them.

And making use of this perspective will help tame the complexity of software, especially in the face of concurrency.

May 13, 2009

Metaprogramming still needs programming discipline

This post will be a little bit more technical than previous posts, but I’ll try to keep it comprehensible to those who don’t already know what I’m talking about.

I want to talk about meta-programming. Meta-programming, in the most general sense, means making programs that produce programs or that change what existing programs do by altering the environment in which they operate. Writing an interpreter or a compiler for a programming language can be considered an example of meta-programming. Macros, monkey-patching and code generation are also examples. I would argue that the XML “configuration” files that haunt many Java frameworks, such as Spring, Hibernate, Struts, etc., are a form of meta-programming.
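
For a tiny, concrete taste of what I mean, here is a Python sketch (my own example): one function manufactures another function at run time, and a one-line monkey-patch changes what an existing class does everywhere.

    # A program that produces a program: build an accessor from a field name.
    def make_getter(field):
        def getter(record):
            return record[field]
        return getter

    get_name = make_getter("name")
    get_name({"name": "Ada"})               # -> "Ada"

    # Monkey-patching: altering the environment an existing program runs in.
    class Greeter:
        def greet(self):
            return "hello"

    Greeter.greet = lambda self: "bonjour"  # every Greeter, everywhere, now changes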

Now meta-programming is a very important idea, and I would go so far as to say that everyone who is at all serious about programming should learn about and understand meta-programming, even if only at a basic level. Meta-programming is often considered an advanced topic, and there certainly are advanced forms of it and advanced ideas that fall under its domain, but I think that anyone who is smart enough to program at all is smart enough to understand and perform basic meta-programming.

Now, since meta-programming is very important, has deep things to say about computation, and is intellectually stimulating, many people find it very exciting and even beautiful.

I will confess: I am one of those people. I have spent a non-trivial chunk of my leisure time throughout my adult life studying the semantics of programming languages, and all the supporting theories and math. And though knowing this stuff has been directly and indirectly useful in my software development career, I really did it because I enjoy it, and because I think it is beautiful and exciting.

Having said all these great things about meta-programming, whenever I see someone using meta-programming in a project, I get queasy. And that is because people often forget that, even though meta-programming seems to let you go beyond the rules of “mere” programming, it still requires all the same discipline you would apply to any other kind of programming. For example, you still need to remember source code discipline and the enforcement of locality principles, such as modularity, the single-responsibility principle, encapsulation, don’t-repeat-yourself (DRY), and many others.

Take DRY for example. I have seen many programmers who would never tolerate cut-and-paste boilerplate in source code blithely create megabytes’ worth of XML “configuration” files, or use a source code generator to do the same thing, even when a bit of creativity would have revealed more acceptable alternatives.
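
One flavor of "more acceptable alternative", sketched in Python with made-up wiring names (the classes used are standard library ones only so the sketch runs as written): state the wiring once, as ordinary data in the host language, instead of repeating a near-identical stanza per object.

    import importlib

    # One small table instead of one boilerplate stanza per object.
    WIRING = {
        "counter": ("collections", "Counter"),
        "orders":  ("collections", "deque"),
    }

    def build(name):
        module_name, class_name = WIRING[name]
        cls = getattr(importlib.import_module(module_name), class_name)
        return cls()                        # construct the object the table describes

    build("counter")    # -> an empty Counter, wired up from the table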

Moreover, I think there are two kinds of locality that meta-programming should have that wouldn’t apply to single-level programming. First, meta-programming level code should be modularized away from the programming level code, and second, any domain-specific-languages (DSLs) or language variants created by the meta-programming should be clearly demarcated from the “normal” programming language (in their own source files if possible).

The reason for this is simple. Imagine if I started writing this post by alternating between English and some other language, say French: some sentences or phrases in one language, some in the other. First of all, any member of the audience who is not fluent in both will probably be lost immediately. And for those who are bilingual, many words are spelled the same or very similarly in the two languages but differ in meaning, sometimes subtly and sometimes not. So even if you are fluent in both, aside from the difficulty I’m adding by making you code-switch, I may also cause you to misinterpret what I’m saying with similar or ambiguous words.

Just as the user of a programming library only wants to have to think about the interface to that library and not have to understand the gory details of how it actually works, the consumers of meta-programming “enhancements” need to be insulated from having to understand the details of how it works under the hood. This is where non-local meta-programming, such as reckless global monkey-patching, can really mess things up.
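
One concrete way to keep a patch local, sketched in Python: confine it to a clearly bounded scope, for example with the standard library's unittest.mock.patch used as a context manager, so nothing outside that block ever sees the altered behavior.

    import time
    from unittest.mock import patch

    def timestamped(msg):
        return f"{time.time():.0f} {msg}"

    # The patch exists only inside this block; callers elsewhere are untouched.
    with patch("time.time", return_value=0):
        assert timestamped("hi") == "0 hi"

    timestamped("hi")   # back on the real clock outside the block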

So anyone who wants to keep meta-programming beautiful should use the same judgment, taste and discipline that an experienced programmer would apply to keep any other type of programming beautiful. Otherwise, it can morph into something very, very ugly.

May 3, 2009

Good to Great

I finally got around to reading the book Good to Great. I first heard about it in early 2002 in the introductory speech of the incoming president at an organization I was in the process of leaving. I was quite impressed with what he had to say about leadership and organizational greatness and made a note to read the book as soon as it came out in paperback. (Hardcovers use up too much shelf space, which is at a premium in my home, so I tend to avoid buying them.) Since it is still in hardcover all these years later, I gave up and borrowed a copy from my brother.

There is a lot of criticism of the book online (here, here and here), but I think most of it misses the point. It doesn’t matter if you are happy with the experimental model the book was based on, or if the companies profiled didn’t stay good stock picks forever after the book came out, or if the advice Collins gives boils down to “obvious” suggestions.

The best business books tell you in an organized and compelling way what you already know to be true from your own experience. No business book can predict the future or give you a surefire recipe for success, and Good to Great doesn’t pretend to anyway, so why do the critics expect it to?

What I did find in this book was a well-written and thoughtful explanation of the only reliable strategy I know of to accomplish any undertaking: for a disciplined group of people to pursue a focused goal in a determined manner, while being willing to acknowledge failure.

Now one can say, “Hey, that’s obvious!” But unfortunately, the obvious is more often honored in the breach than in practice. The path of least resistance in many organizations is to let ineffective or obstructive people stick around way too long, to let the day-to-day crises and tempests-in-a-teapot derail their long-term goals, and to rationalize or ignore failures.

If Good to Great manages to inspire people by reminding them to struggle against these tendencies and to set a higher standard toward greatness, I think it deserves its place on the best-seller list.

May 2, 2009

Dancing Monkeys

Many years ago, before I started my software development career, I had a job as a call-center representative for a national automobile association. This included a twelve-hour shift by myself on Sundays.

The Sunday shift was a bit of a grind. It could be a long, silent, boring wait for calls that never came, or if the weather was inclement somewhere on the other side of the country, it could be very busy handling a spate of calls all alone.

During the normal workweek, when there were a bunch of us working, we would pass the slow times doing paperwork. By Sunday, when I was often the only person in the building, the paperwork was usually all done, so to pass the time I took to listening to the radio. As soon as the phone rang I would shut the radio off and pick up. There was no doubt in my mind that I was doing my job fully and faithfully, in spite of the discomfort and difficulty of the job.

One Sunday, the Vice President of the company decided to come to the office to catch up on some work. At some point, he wandered back to the call-center where it was a slow day, and I was sitting there by myself listening to the radio. He may have greeted me, or asked me how busy I was, but he didn’t stay long and he didn’t say anything significant.

However, on Monday I heard from my boss that he had complained that I was listening to the radio instead of working.

This was my first significant experience of a phenomenon that I call “Dancing Monkeys”: the tendency of arms-length leadership to prefer situations where everyone “looks busy” or is “hopping to”, even when the expected effort would have no additional effect or might even be counterproductive.

There are many reasons for this phenomenon, some pernicious and some benign. The pernicious one that we will all have seen at some time or another is pure ego-tripping: “I’m the top monkey here, so you lesser monkeys start dancing!”

A common, more benign reason is lack of understanding of the work domain under observation. Software development is particularly prone to this one. For example, more than once I’ve heard some executive complain that he stopped by the development team area and no one was typing, the implication being that no work was getting done, since, as every non-technical person knows, programming is all about lines of code typed per minute.

Another common scenario is that inevitable day when, under some unforeseen deadline pressure, some executive asks the development manager when he is going to institute overtime to help meet the deadline. Because of course, as any non-programmer knows, there is no degradation to the quality of software development when the team is exhausted, since typing lines of code is a purely mechanical process.

To come back to my call-center job, effective execution of my duties required that when I got a run of long, stressful calls, I had the energy and focus needed to solve the clients’ problems efficiently and effectively. Anyone who has ever made a service call to a droning zombie service rep knows what happens when someone doing that job lets their energy reserve run down.

Listening to the radio helped me recharge between calls, kept my morale up on quiet days and improved my performance. Doing mindless, unnecessary busy-work would have sapped that energy and morale. Asking my boss to find work for me to do that was meaningful but would not interrupt my real work would have just increased her workload without increasing the efficiency or effectiveness of our department.

So that Vice President had to ask himself: was making himself feel better by getting a dancing monkey really in the best interest of his company?