Archive for the 'programming' Category

Further reading for “Functional Programming for Object-Oriented Programmers”

I’ve been asked for things to read after my “Functional Programming for Object-Oriented Programmers” course. Here are some, concentrating on the ones that are more fun to read and less in love with difficulty for difficulty’s sake.

If you were more interested in the Clojure/Lisp part of the course:

  • The Little Schemer and other books by the same authors. These teach some deep ideas even though they start out very simply. You’ll find the way they’re written either charming or annoying.

  • Land of Lisp is a strange book on Lisp.

  • If you were totally fascinated by multimethods, you could try The Art of the Metaobject Protocol.

If you were more interested in the functional programming part of the course, the suggestions are harder. So much of the writing on functional programming is overly math-fetishistic.

  • Learn You a Haskell for Great Good is another quirky book (lots of odd drawings that have little to do with the text), but it covers many topics using the purest functional language. There’s both a paper version and a free online version. The online version has better drawings (color!).

  • I haven’t read Purely Functional Data Structures, though I think I have it somewhere. @SamirTalwar tweeted to me that “I’m enjoying Purely Functional Data Structures. It’s quite mathsy, but with a practical bent.” The book is derived from the author’s PhD thesis.

The approach I took in the course was inspired by the famous Structure and Interpretation of Computer Programs (free online version). Many people find the book more difficult than most of the ones above. Part of the reason is that the problems are targeted at MIT engineering students, so you get a lot of problems that involve resistors and such.

Functional Programming for the Object-Oriented Programmer

Here is a draft blurb for a course I’d like to teach, possibly in Valladolid, Spain, in September. I’d like to see if enough people are interested in it, either there or somewhere else. (There’s a link to a survey after the description.)

Short version

  • Target audience: Programmers who know Java, C#, C++, or another object-oriented language. Those who know languages like Ruby, Python, or JavaScript will already know some—but not all!—of the material.

  • The goal: To make you a better object-oriented programmer by giving you hands-on experience with the ideas behind functional languages. The course will use Clojure and JavaScript for its examples. You’ll learn enough of Clojure to work through exercises, but this is not a “Learn Clojure” course.

  • After the course: I expect you to return to your job programming in Java or Ruby or whatever, not to switch to Clojure. To help put the ideas into practice, you’ll get a copy of Dean Wampler’s Functional Programming for Java Developers.

Syllabus

  • Just enough Clojure
  • Multimethods as a generalization of methods attached to objects
  • Protocols as a different approach to interfaces
  • Functions as objects, objects as functions
  • Mapping and folding in contrast to iteration
  • Functions that create functions
  • Laziness and infinite sequences
  • Time as a sequence of events
  • Generic datatypes with many functions versus many small classes
  • Immutable data structures and programs as truth-statements about functions
Interested?

Longer version

Many, many of the legendary programmers know many programming languages. What they know from one language helps them write better code in another one. But it’s not really the language that matters: adding knowledge of C# to your knowledge of Java doesn’t make you much better. The languages are too similar—they encourage you to look at problems in pretty much the same way. You need to know languages that conceptualize problems and solutions in substantially different ways.

Once upon a time, object-oriented programming was a radical departure from what most programmers knew. So learning it was both hard and mind-expanding. Nowadays, the OO style is the dominant one, so ambitious people need to proactively seek out different styles.

The functional programming style is nicely different from the OO style, but there are many interesting points of comparison between them. This course aims to teach you key elements of the functional style, helping you take them back to your OO language (probably Java or C#).

There’s a bit more, though: although the functional style has been around for many years, it’s recently become trendy, partly because language implementations keep improving, and partly because functional languages are better suited for solving the multicore problem* than are other languages. Some trends with a lot of excitement behind them wither, but others (like object-oriented programming) succeed immensely. If the functional style becomes commonplace, this course will position you to be on the leading edge of that wave.

There are many functional languages. There are arguments for learning the purest of them (Haskell, probably). But it’s also worthwhile to learn a slightly-less-pure language if there are more jobs available for it or more opportunities to fold it into your existing projects. According to that standard, Clojure and Scala, both hosted on the JVM, stand out. This course will be taught mainly in Clojure. Where appropriate, I’ll also illustrate the ideas in JavaScript, a language whose good parts were strongly influenced by the functional style.

While this particular course will concentrate on what you can do in Clojure, it’d be unfair to make you do all the work of translating these ideas into your everyday language. To help you with that, every participant will get a copy of Dean Wampler’s Functional Programming for Java Developers, which shows how functional ideas work in Java.

Your instructor

Brian Marick was first exposed to the functional style in 1983, when the accident of knowing a little bit of Lisp tossed him into the job of technical lead on a project to port Common Lisp to a now-defunct computer architecture. That led him to a reading spree about all things Lisp, the language from which the functional style arguably originated. He’s been a language geek ever since, despite making most of his living as a software process consultant. He’s the author of the popular Midje testing library for Clojure and has written two books (Everyday Scripting with Ruby and Programming Cocoa with Ruby).

For examples of Marick’s teaching style, read his explanations of the Zipper and Enlive libraries or watch his videos on Top-down TDD in Clojure and Monads.

Interested?

 

*The multicore problem is that chip manufacturers have hit the wall making single CPUs faster. The days of steadily increasing clock rates are over. Cache sizes are no longer the bottleneck. The big gains from clever hardware tricks have been gotten. So we programmers are getting more cores instead of faster cores, and we have to figure out how to use multiple cores in a single program. That’s a tough problem.

Monad tutorial, Part 4

The State monad is a big jump for the monad-learner. It jumps up a level of abstraction by using functions as values. That makes it confusing because you have functions that work on functions, and it can be hard to keep track of whether a particular function is a this-kind-of-function or a that-kind-of-function. I try to tease apart the Gordian Knot thusly:

  1. I start by creating a logging monad that simply logs the value of each step. This is fairly straightforward.

  2. Next, I decide I want only particular steps to be logged. Steps to be logged use a log function which returns either a plain value or a wrapped value. That leads to an if statement in the “decider”.

  3. Then I raise the question: can we get rid of the if statement by pushing its work down into the log function? In this section of the tutorial, I do something like follow Beck’s rules of design: make a series of undirected local changes that arrive at something globally coherent. Specifically, that first simple decision of pushing work into one place forces us to implement the full State monad (though I only show it used for building a log).

  4. Finally, I use that solution to illustrate some key concepts: base values, monadic functions, monadic values, and how they’re put together to make up a monad.

I have my doubts about how well this works, particularly the third step. I’d value your opinion.

http://www.vimeo.com/21307543

Looking for contract work

As I mentioned earlier this year, I’m looking to make one of my decadal career shifts. Since that decision, I’ve been doing part-time contract work on a RubyCocoa application, and I’ve found it satisfying to deliver working software to people who are happy to get it. It’s also helped with the nagging dread that—while I can talk the talk about programming, testing, refactoring, and all that—I wouldn’t be able to walk the walk. It turns out I can. Although I’m slower than I’d like, I do respectable work.

In my ideal contract, I’d:

  • … code in Clojure, Ruby, or JavaScript. Other than that, I don’t require super-advanced or cool technology, but I do have a hankering to work on something that could somehow be the inspiration for another book. I don’t care about the domain.

  • … devote 1/2 to 3/4 of my time to a single project, working with a single team, over a period of months. Some portion of that—a week or two a month—would be spent onsite. (Chicago would be the best place because it’s easily accessible by train. I live in Central Illinois.)

  • … work at a sustainable pace, and be given the leeway to do a good job by my standards. I’m trying to be artisanal about my code.

    (“I want to be artisanal” might raise red flags: will I decide I know what’s needed better than those who are paying for it? My saving grace is that I have a Labrador-like eagerness to please. I want product owners to smile when they think of me.)

  • … be able to stretch by occasionally going slower while I experiment with techniques. (My work on outside-in TDD in Clojure is an example.) I’m willing to be paid less in order to improve faster.

  • … be in frequent contact with the people who’ll viscerally appreciate the features they get for the money they spend. That given, I don’t care whether I am working directly for a product company or as a subcontractor on behalf of a contract programming company.

  • … work in an Agile style. (I almost didn’t think to include this, since I assume anyone interested in hiring me would expect or accept that. I’m not interested in a job teaching the glories of continuous integration or TDD or refactoring. I’m interested in learning how to do them ever better, and in working with people who have the same interests.)

However, there may not be an ideal, and I don’t intend to be rigid about opportunities. I could see, for example, working with several teams at once, being someone who helps convert a daily grind into an exploration of new techniques. That’d be more like my consulting past, but I’d be more hands-on than I have been in the past, involved for longer, and feel more responsible for the product.

Also: although I listed Chicago as my desired location, it has drawbacks when it comes to (1) winter and (2) helping me with my (currently somewhat faltering) attempt to learn Spanish. I wouldn’t mind working in Costa Rica, Argentina, or elsewhere in Latin America (probably for a longer continuous chunk of time onsite).

I don’t have a huge portfolio of code to show you. What I have is on Github. Critter4Us shows my Cappuccino and Ruby code. My Clojure code is limited to Midje, which is a programmer’s tool rather than an end-user project.

My email address is marick@exampler.com.

TDD in Clojure, part 3 (one wafer-thin function; conclusions)

Part 1
Part 2

All that remains is to add the locations of bordering cells to a given sequence of locations. A small wrinkle is that a border location may be next to more than one of the input locations, so duplicates need to be prevented. I could write the test this way:

I don’t like that. When the test fails, it’ll likely be hard to discover how the expected and actual values differ. And I bet that a failure would more likely be due to a typo in the expected value than to an actual bug. The test is just awkward.

I like this version better:

It is a clearer statement of the relationship between two concepts: a location’s neighborhood and the border of a set of locations. More pragmatically, it’s less typing. (When I first started coding in this style, it surprised me how much test setup clutter went away. I had much less need for “factories” or “fixtures” or “object mothers”.)

Given either test, the code is straightforward:
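The original listing is not reproduced here. As a minimal sketch, assuming locations are [x y] vectors and using a stand-in neighbors function (both are assumptions for illustration, not necessarily the post's code), the border can be added and the duplicates dropped with distinct:

```clojure
;; Stand-in for the neighbors function from Part 2 (assumed shape:
;; takes an [x y] location, returns its eight surrounding locations).
(defn neighbors [[x y]]
  (for [dx [-1 0 1]
        dy [-1 0 1]
        :when (not= [dx dy] [0 0])]
    [(+ x dx) (+ y dy)]))

;; Hypothetical sketch of add-border-to: keep the input locations, append
;; every neighboring location, and let distinct drop the duplicates.
(defn add-border-to [locations]
  (distinct (concat locations (mapcat neighbors locations))))
```

Two adjacent input locations share several neighbors, which is the case the duplicate check exists for.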

Onward

No matter how satisfying the individually-tested pieces, the whole has to work, which is why everything ends by running at least one end-to-end test. A test like the one we started with:

Because I’m making up my test notation as I go, I’ve been running all the tests manually in the REPL. Now I can run the whole file:

If you look carefully, you’ll see that the test would fail because the locations are in the wrong order. But that’s a quick fix:

Done. Ship it!

Development order - Test-down or REPL-up?

It’s often said that the Lisps are bottom-up languages, that you test out expressions in the REPL, discover good functions, and compose them into programs. A lot of people do work that way. A lot of people who use TDD to write object-oriented code also work that way: when implementing a new feature, they start at low-level objects, add whatever new code the top-level feature seems to demand of them, then use those augmented objects in the testing of next-higher-level objects.

For I guess about a year now, I’ve been experimenting with being strictly top-down in some projects. I find that leads to less churn. Too often, when I go bottom-up, I end up discovering that those low-level changes are not in fact what the feature needs, so I have to revisit and redo what I did.

I get less disrupted by rework when I go from the top down (or from user interface in). It’s not that I don’t blunder—we saw one of those in the previous installment—but those blunders seem easier to recover from.

That’s not to say that I don’t use the REPL. We saw that a little bit in this program, when I was writing neighbors. It’s perfectly sensible to do even more in the REPL. I think of the REPL as a handy tool for what XP calls a spike solution and the Pragmatic Programmers call tracer bullets. When I’m uncertain what to do next, the REPL is a tool to let me try out possibilities. So I might be stuck in a certain function and go to the REPL to see how it feels to build up parts of what might lie under it. After I’m more confident, I can continue on with the original function, test-driven, reusing REPL snippets when they seem useful.

I don’t claim everyone should work that way. I do claim it’s a valid style that you’d be wise to try.

Notation

I’m something of an obsessive about test notation, and I’ve been endlessly fiddling with a Clojure mock notation. I implemented its first version as a facade on top of clojure.contrib.mock. As I experimented, though, I found that keeping my facade up to date with my notational variants slowed me down too much, so I put the code aside until the notation settled down.

I’m pretty happy with what you’ve seen here. Are you? If so, I may start on another mock package. I’ve got the most important parts: a name and sketch of a logo. (“Midje” and someone flying safely between the sun [of abstraction without examples] and the sea [of overwhelming detail].)

Tests and code together

You can see the completed program here. It mixes up tests and code. I’ve tried that on-and-off over the years and always reverted to separate test files. This time, it’s seemed to work better. I probably want an Emacs keystroke that lets me hide all tests, though. I’d also want alternate definitions of the test macros so that I can compile them out of the production system.

What next?

I’ll write a web app in this style, using Compojure.

TDD in Clojure, part 2 (in which I recover fairly gracefully from a stupid decision)

Part 1
Part 3

I ended Part 1 saying that my next step would be to implement a function that counts the number of living neighbors a cell has. Given that we’re already pretending (through stubbing) that a living? function exists, living-neighbor-count is pretty trivial if we also pretend we’ve got a neighbors function:
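The listing is missing from this copy. A minimal sketch of the idea, assuming living? is a predicate on a cell and neighbors returns a sequence of cells; the stubbing below uses Clojure's built-in with-redefs rather than the notation developed in these posts:

```clojure
;; Both helpers are still unwritten at this point, so they are only declared.
(declare living? neighbors)

;; Count the neighbors for which living? answers truthy.
(defn living-neighbor-count [cell]
  (count (filter living? (neighbors cell))))

;; Pretend the helpers exist by temporarily redefining them.
(with-redefs [neighbors (fn [_] [:a :b :c])
              living?   #{:a :c}]
  (living-neighbor-count :any-cell))   ; => 2
```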

Following my “mapping, like accessors, is too simple to test” guideline, I almost didn’t write a test. But what the heck:

Once the test passes, we need to write neighbors. To implement it, we’re going to have to take cells apart (to get x and y coordinates) and put them back together (to create neighbors). So I don’t see any point to using stubs and dummy variables like ...cell... in this test:

Boldly, I will here use one test to define both cell-at and neighbors (as well as the test helper have-coordinates that checks a list of cells against a list of coordinates).

(If I were more sensitive to that small voice in my head that warns I’m going astray, I would have heard something around now, but I ignored it. So we will too.)

Enter the REPL

My thought about how to implement neighbors has three steps, so I’ll try them out in the REPL. First, I’ll make (x,y) pairs to add and subtract from the original cell’s coordinates:
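The REPL transcript is not reproduced here. One way to get those pairs, sketched with a for comprehension (the original may well have used a library function instead):

```clojure
;; Every combination of -1, 0, and 1 for the x and y offsets.
(def product
  (for [dx [-1 0 1]
        dy [-1 0 1]]
    [dx dy]))

(count product)   ; => 9, including the [0 0] pair
```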

That’s good, except (0, 0) shouldn’t be in there. (A cell can’t be its own neighbor.) So I need to delete that:

(remove #{[0 0]} product) is a Clojure idiom. remove returns its second (sequence) argument, omitting any element that the first argument (a function) returns truthy for. #{x} is the set containing x. In Clojure, sets act as functions that return something truthy iff their single argument is in the set. That is:
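A small illustration of both halves of the idiom, using a hand-written list of pairs rather than the real product:

```clojure
;; A set called as a function returns the element if present, else nil.
(#{[0 0]} [0 0])   ; => [0 0], which is truthy
(#{[0 0]} [1 0])   ; => nil, which is falsey

;; So a set works as the predicate argument to remove.
(remove #{[0 0]} [[-1 0] [0 0] [1 0]])   ; => ([-1 0] [1 0])
```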

Finally, I need a function that shifts a cell by an offset. For the REPL, I’ll pretend the cell is just an [x y] vector. (We have yet to define what it really is.)

I can build neighbors from what I’ve tried out. To make the test pass, I’ll continue to use vectors for cells, hiding them behind a simple functional interface of cell-at, x, and y.
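Putting the three REPL steps together might look like the sketch below. The names offsets and shift are hypothetical, and the vector representation behind cell-at, x, and y is the temporary one described above:

```clojure
;; Cells are vectors for now, hidden behind three small functions.
(defn cell-at [x y] [x y])
(defn x [cell] (first cell))
(defn y [cell] (second cell))

;; The eight offsets around a cell: every [dx dy] pair except [0 0].
(def offsets
  (remove #{[0 0]}
          (for [dx [-1 0 1], dy [-1 0 1]] [dx dy])))

;; Move a cell by one offset, going through the accessors only.
(defn shift [cell [dx dy]]
  (cell-at (+ (x cell) dx) (+ (y cell) dy)))

(defn neighbors [cell]
  (map #(shift cell %) offsets))
```

Because neighbors only touches cells through cell-at, x, and y, the representation can change later without rewriting it.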

The concrete representation of the cell — and disaster

Here are the functions as yet undefined:

There’s no more escaping it. I’m going to have to decide what kind of thing border produces. That thing has to be a sequence for tick to map over:

border's result is also stored in world where living? will use it to decide whether a given cell is alive or dead.

My first thought was that I could use the set idiom I used above—the bordered world could just be the set of all living coordinates. Sneakily, any location not in the set would represent a dead cell. That would be great for implementing living?, but it wouldn’t work for tick, which has to process not only living cells, but also the dead cells that make up the border.

So my fallback was for border to produce a map, something like this:

Maps are sequences, so you can map over them. But I don’t think I’ve ever actually tried it. What happens?…
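A quick check of what map hands to its function when given a Clojure map (a minimal sketch; the {[0 1] :dead} shape follows the living/dead map described above):

```clojure
;; Each element produced from a map is a [key value] entry.
(map identity {[0 1] :dead})   ; => ([[0 1] :dead])

;; So a mapped function has to pick the entry apart itself.
(map (fn [[location state]] location) {[0 1] :dead})   ; => ([0 1])
```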

OH GREAT. If I go down this route, we’ll have three different ways of representing cells:

  • as the original location in inputs like *vertical-blinker*: [0 1]
  • as part of a living/dead map: {... [0 1] :dead ...}
  • as a living/dead vector: [ [0 1] :dead ]

That’s intolerable. And yes, I bet at least half of my two readers thought I was mistaken not to think about data structures at the very beginning. However, my strategy with Clojure TDD has been to put off thinking about data structure as long as I can, and I’ve been surprised and pleased by how often I ended up with simpler data than it seemed I would. I’ve found that, given the use of globally-available immutable “background” data, much of what might have been explicit data structure–vectors of maps of vectors of…–ends up in the implicit structure of the computation. More about that, though, will have to wait for another post.

A recovery plan

The problem is here:

When I wrote that, I remember that the still small voice of conscience objected to the way I was both stashing the bordered-world away as background and simultaneously picking it apart with map. That just felt weird, but I argued myself into thinking it was harmless. It was not.

Really, since my whole program takes input [x y] pairs (such as *vertical-blinker*) and turns them into a different set of [x y] pairs, most of my work ought to be done with those pairs. I should be thinking about locations of cells, not cells themselves. In that way of thinking, border shouldn’t produce “cells”. It should take locations of living cells and produce locations that point to both living cells and adjacent dead cells.

Further, I shouldn’t repeat those locations in a world function. Instead, I need something that can answer questions about cells, given their locations. It should be a… (I’m bad with names)… an oracle about cells. I first imagined this:

using-cell-oracles-from should produce any wise and oracular functions we need. So far, that’s just living?.
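The real using-cell-oracles-from is a macro whose source is not shown here. The idea behind it can be sketched with a plain function instead; cell-oracles-from and its map-of-functions return value are hypothetical stand-ins, not the post's API:

```clojure
;; Hypothetical helper: builds the oracle functions from the living locations.
(defn cell-oracles-from [living-locations]
  (let [alive (set living-locations)]
    {:living? (fn [location] (contains? alive location))}))

(let [{:keys [living?]} (cell-oracles-from [[0 1] [1 1] [2 1]])]
  [(living? [1 1]) (living? [5 5])])   ; => [true false]
```

Any location that is not in the set counts as dead, so dead border locations need no representation of their own.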

I realized something more. Locations are flowing into the pipeline, locations are flowing out, and in this version, locations won’t be transformed into cells anywhere within the pipeline. That makes unborder, which was originally supposed to convert a mixture of living and dead cells into only living locations, seem kind of stupid. If tick produces only living locations, unborder can go away. (The name unborder always bugged me, because it didn’t really describe what the function would have to do. Once again, I should have paid attention.)

That leads to this top-level function:

That wasn’t so bad…

As it turns out, changing my mind about such a fundamental decision was easy.

What did I have to do to the code? I had to write using-cell-oracles-from. Here’s a test.

I won’t show the code that passes this test—it’s a somewhat grotty macro (but a simple transformation of the earlier against-background). You can see it in the complete source for this post.

I did a quick global-replace of “cell” with “location” and tweaked a couple of the resulting names. Although both you and I know that locations are just pairs, I retained the functions make-location (formerly cell-at), x, and y to keep the code insulated from the potential of another change of mind.

I had to convert the successor function to dead-in-next-generation?. That was pretty simple. I had to change two lines in the test. Here’s one:

To make that test pass, I had to rewrite successor. It used to be this:

Now it’s this:

That was just a matter of inverting the logic and deleting killed and vivified. (Before I ever got around to writing them!)

The ease of this change makes me happy. Even though I blundered at the very beginning of my design, the way stub-heavy TDD lets me defer decisions—and forces me to encapsulate them so that I have something to stub—made the blunder a not-catastrophe. I wish I could say that I blundered deliberately to demonstrate that property of this style of TDD, but that would be a lie.

Enough for today

Only one function remains: add-border-to. That’ll be pretty easy, but this post is already too long. The next one will finish up the implementation and add whatever grand summary I can come up with.

TDD in Clojure: a sketch (part 1)

Part 2

I continue to use little experiments to help me think through TDD in Clojure. (I plan to begin a realistic experiment soon.) Right now, I’m mainly focused on three questions:

  • What would mocking or stubbing mean in a strict(ish) functional language?

  • What’d be a good mocking notation for Clojure?

  • How do you balance the outside-in style associated with mocks and the bottom-up style that the REPL (interpreter) encourages?

Here’s an example from Conway’s Game of Life. It begins with an implementation suggestion from Paul Blair and Michael Nicholaides at the Philly Code Retreat. Instead of thinking of the board as a two-dimensional array of cells, with some of them dead and some alive, think instead only of living cells, each of which knows its coordinates. Here’s an example that shows how “blinkers” blink from generation to generation.

A couple of things have happened here:

  • This is my notation for a straightforward non-stubbing test. The value on the left is executed and it’s compared (for equality) to the value on the right.

  • I’ve started coding outside-in, and I’ve named the first function I need: next-world.

The Blair/Nicholaides approach advances the “world” to the next generation by (conceptually) adding dead cells around the edge of all the living cells, running the normal life rules that govern how cells change because of their neighbors, and then throwing away all the cells that end up dead. In other words:

  • The pending bit is just there because (sadly) Clojure makes you declare functions before mentioning them. pending just creates functions that print that they’ve not yet been implemented.

  • The rest of the code flows the world argument through a pipeline of three functions. If you’re not familiar with the -> macro, the result is the same as this:

    I don’t feel the need to test this code now because it’s really declarative—it says what it means to produce a next world under this approach. (It will be tested in the very end by the “integration test” that shows a blinker working.)
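For readers new to the -> macro, the threaded and nested forms can be compared directly. The three definitions below are hypothetical stand-ins that only record the order in which they were called; they are not the real border, tick, and unborder:

```clojure
;; Stand-ins that append their own name to the value passing through.
(defn border   [world] (conj world :border))
(defn tick     [world] (conj world :tick))
(defn unborder [world] (conj world :unborder))

;; Threaded form: the value flows through each function in turn.
(-> [:world] border tick unborder)

;; Equivalent nested form, read from the inside out.
(unborder (tick (border [:world])))

;; Both return [:world :border :tick :unborder]
```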

I can now implement any of the three new functions. I’ll pick tick because it seems to be the heart of the matter. Here’s a first implementation:

There are two odd things going on here.

First, stubbing function calls.

In object-oriented languages, I think of mock-driven-design as a way of teasing out collaborators for the object I’m building. I push responsibilities for work onto objects that I’ll implement later. Mocking lets me defer the implementation of those objects until I’m ready, and creating some examples of the API teaches me the (implicit) specification for the new object.

I’ve found that with pure functional programs that don’t modify state, it makes more sense to think of a function like (f 2) => 4 as a fact. What I’m doing as I test-drive a function is describing how facts about its inputs and outputs depend on other facts, in an almost Prolog-like way. For example, consider this code:

That says that, for any cell you care to provide, f of that cell will be 10, provided g of that cell is true and h is 2. If either of those latter two facts doesn’t apply to the cell, I’m not saying what f’s value is.

I use the funny ...cell... notation in the way that mathematicians use n to talk about any integer. (They call that universal quantification.) I don’t want to create a particular cell because I might need to specify properties that have nothing to do with the function I’m working on. This notation says that nothing about the cell is relevant except for what comes after the provided.

Here’s one way to write a Life rule in this notation:

The falsey bit in the first line is because Clojure has two distinct values that can mean “false”. falsey is a function that takes the result of the left-hand side and fails the test if that result is anything other than one of the two false values. I’m using it because I don’t want to overspecify living?. There’s no reason to care which of the two “false” values it returns.
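The two values are nil and false. A predicate in the same spirit might look like this (falsey? is a hypothetical name for illustration, not the checker from the test notation):

```clojure
;; nil and false are the only values Clojure treats as false.
(defn falsey? [value]
  (or (nil? value) (false? value)))

(falsey? nil)     ; => true
(falsey? false)   ; => true
(falsey? 0)       ; => false, because 0 is truthy in Clojure
(falsey? [])      ; => false, because an empty vector is truthy too
```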

There’s a problem with this test, though. Remember what I said above: the left-hand side gets evaluated and handed to falsey. That means living? has to have a definition—which means I’d have to settle on how the code knows whether a cell is alive or dead. I like doing one thing at a time and putting off decisions as long as I can, and right now I’d rather be focused on successor instead of cell representations.

Here’s a way to defer that decision:

Here I’m saying something subtly different than before. I’m saying that the result of successor is specifically that cell produced by calling killed on the original cell. The =means=> notation tells the framework to create a mock instead of evaluating the right-hand side for its value. In a more familiar mocking syntax (for Ruby), the whole test is equivalent to:

OK. The next figure gives the whole set of Life rules, expressed as executable tests. (Well, executable as soon as I implement the testing framework.) Notice that I called the outer wrapper know (a fact) instead of example. know seems more appropriate for rules. The two forms mean the same thing.
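The figure is missing from this copy. For reference, the standard Life rules can be stated as a plain predicate; this is a sketch of the rules themselves, not of the know notation or of the successor function, and alive-next? is a hypothetical name:

```clojure
;; Standard Conway rules: a cell is alive in the next generation if it has
;; exactly three living neighbors, or if it is alive now and has exactly two.
(defn alive-next? [alive-now? neighbor-count]
  (or (= neighbor-count 3)
      (and alive-now? (= neighbor-count 2))))

(alive-next? true 2)    ; => true  (survives)
(alive-next? false 3)   ; => true  (comes to life)
(alive-next? true 4)    ; => false (dies of overcrowding)
```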

Notice also that I implemented a notation for saying “run this test for each value in a sequence”. The use of commas, as in [4,,,8], indicates that, conceptually, the fact is true for all values four through eight. Only the ones listed are actually tried. (Commas count as whitespace in Clojure.)
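The comma point is easy to confirm at the REPL. The reader discards the commas, so only the two listed values are in the vector:

```clojure
;; Commas are whitespace to the Clojure reader, so these are the same vector.
(= [4,,,8] [4 8])   ; => true
(count [4,,,8])     ; => 2
```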

This isn’t the tersest possible format—a table would be better—but it’ll do. I think it’s reasonably readable. Do you?

Here, for reference, is code that passes the test:

We now have an expanded choice of functions to write:

I could go breadth-first—with border and unborder—or go depth-first with one of the functions on the second line. In this particular case, I’d rather go depth first. I’ve avoided deciding on a representation, so I don’t know yet what border should do.

If this installment meets your approval, I’ll add another one that begins work on—oh—probably living-neighbor-count is the most complicated, so it’s a good one to chip away at.

Old programmers

A correspondent writes:

How does one continue to build a career in software development, when there are younger, hungrier people (i.e., people who can, and will, work 16-hour days and can learn things at a ridiculous pace) joining the field?

I’m at the ripe, old age of 33 and am already feeling like it’s a challenge to keep up with the 23-, 24-, 25-year-olds. :/

Also — and I know this is partly a function of the field’s explosive growth over the years — but I just don’t see that many software devs in their 40’s and up, so I don’t have much in the way of precedent to observe, in terms of a career path, other than going into management

Since I’ve pretty much decided to devote this next decade to programming, part of it for hire, this is an important topic to me. (I’m 50.) I don’t know that I have much useful to say, though. Nevertheless…

  • In a team, some people serve as catalysts for other people’s abilities. For example, ever since my 20’s, I’ve been hung up on friction-free work. So I was more likely than other people to make the build work better, write and share Emacs functions to automate semi-frequent tasks, or to work on testing. Those are not glory tasks (“I’m a rock star build-fixer!”), but they help the team. As the team’s codger, I might emphasize that bent even more, freeing the team’s whippersnappers to concentrate on the most prodigious feats of coding.

    A typical way in which an older programmer can catalyze is by paying attention to human relationships. If you can avoid the damnable tendency old people have to pontificate at people or tell only marginally relevant stories about how they did it 20 years ago, you can be a person who “jiggles” teams into a better configuration. (Term is due to Weinberg’s Secrets of Consulting, I think.) The ideal role is that of player coach (but one gradually recognized by the team rather than appointed to the role).

  • Another reason for the patriarch to work in a tight-knit team is that a young programmer’s advantages are not uniform. For example, what makes me feel most like a doddering oldster is the sheer amount of stuff kids these days know about: tool upon tool upon tool, gem upon gem upon gem, the 83 new things in the newest point release of Rails, etc. But if you have one of those people on the team, the advantage accrues to everyone, and the centenarian’s loss of a pack-rat mind is not such a disadvantage.

    When what matters is the team’s capability, balance is more important than each individual’s uniform excellence. So when fogy-dom looms, focus on being complementary rather than unique.

  • A traditional way for the older programmer to cope is by being one of the dwindling number of experts in a has-been technology (Cobol being an example, Smalltalk being another). That technology doesn’t necessarily have to be boring. Sometimes, as has sort of happened with Smalltalk, Lisp, and maybe the purer functional languages, the has-been becomes hot again.

    A perhaps-related route is to become an expert in a very specialized and difficult technology like, say, virtual machines or security–something that’s difficult to pick up quickly and requires continuous learning.

  • Now that we’ve learned that legacy code doesn’t have to suck, perhaps the graybeard should angle to attach himself to a particular large and long-lived code base. There could be a lot of pleasure in watching your well-tended garden improve year after year.

  • It’s also useful not to act old. For example, I have to fight the urge to be sort of smug about not knowing CSS. In my case, of course, I don’t because, well,… CSS. But it’s easily interpreted as my saying “Oh, another passing fad, *Yawn*, give me RPG any day”. Similarly, I should be careful of saying things to Clojure programmers like, “Well the way we did that in my Lisp days was…” As a final example: this website looks like I’ve learned nothing about web technologies since 1994.

    People are sensitive to old people acting old. The flip side is that it’s easy to subvert expectations. I think it’s a good strategy to be able to talk in modest depth about two or three technologies that are sort of new or even faddish. So, for example, I’m pleased that I can talk about Clojure, Cappuccino, and, oh, Sinatra. You want to both present the appearance of being able to–and actually be able to–synthesize the past and the present.

  • Finally: if programming is indeed one of those fields where you do your best work young, older people should be paid less. Older programmers can compete by somehow (credibly) asking for less money.

    That “credibly” is an issue though, since programming is something of a macho, boastful field. In such a field, a declining salary is easily taken as a big red flag rather than a realistic acknowledgement that, say, short-term memory and concentration are important to programming and both get worse with age.

Other thoughts that might be of help to my fellow coffin-dodgers?

TDD & Functional Testing: from collections to scalars

I’ve been fiddling around with top-down (mock-style) TDD of functional programs off-and-on for a few months. I’ve gotten obsessed with deferring the choice of data structures as long as possible. That seems appropriate in a functional language, where we should be talking about functions more than data. (And especially appropriate in Clojure, my language of choice, since Clojure lets you treat maps/dictionaries as if they were functions from keys to values.)

That is, I like to write these kinds of tests:

(example-of "saturating a terrain"
   (saturated? (... terrain ...)) => true
   (because
      (span-between-markers (... terrain ...)) => (... sub-span ...)
      (saturated? (... sub-span ...)) => true))

… instead of committing to what a terrain or sub-span looks like. That’s been working reasonably well for me.

I’ve also been saying that “maps are getters”. By that, I mean that—given that you’ve test-driven raise-position—it really makes no more sense to test-drive this:

(defn raise [terrain]
   (map raise-position terrain))

… than it does to test a getter: it’s too obvious. That leads to a nice flow of testing: I’m always testing the transformation of things to other things. I don’t have to worry, until the very end of test-driving, that the “things” are actually complex data.

The problem I’ve been running into recently, though, is handling cases where complex data structures are converted into single values. For example, I’ve been trying to show a top-down TDD of Conway’s Life. In that case, I have to reduce a set of facts about the neighborhood of a cell into a single yes-or-no decision: should that cell be alive or dead in the next iteration? But expressing that fact is rather awkward when you don’t want to say precisely what a “cell” is or how you know it’s “alive” or “dead” (other than that there’s some function from a cell and its environment to a boolean).

To be concrete, here’s something I want to claim: a cell is alive in the next iteration if (1) it is alive now and (2) exactly two of the cells in its neighborhood are alive. How do you say that while being not-specific? I’ve not found a way that makes me happy.
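For reference, here is the claim itself written as code, sketched in Ruby (the language of the test examples later on this page) rather than Clojure. The names alive_next?, alive, and neighbors are hypothetical; the two callables are passed in so that the rule never says what a cell is. It covers only the single claim above, not the full rules of Life, and it states the rule as an implementation rather than as a test example, so it doesn't remove the awkwardness described here.

```ruby
# Hypothetical sketch: `alive` and `neighbors` are callables supplied by the
# caller, so the rule itself never says what a cell actually is.
def alive_next?(cell, alive:, neighbors:)
  living_neighbors = neighbors.call(cell).count { |n| alive.call(n) }
  alive.call(cell) && living_neighbors == 2
end

# Usage with cells as plain symbols (an arbitrary choice for this example).
living    = [:a, :b, :c]
adjacency = { a: [:b, :c, :d], d: [:a, :b, :c] }
alive     = ->(cell) { living.include?(cell) }
neighbors = ->(cell) { adjacency.fetch(cell, []) }

alive_next?(:a, alive: alive, neighbors: neighbors)  # => true
alive_next?(:d, alive: alive, neighbors: neighbors)  # => false, :d is not alive now
```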

Part of the problem, I think, is that when you start talking about individual elements of collections, you’re moving from the Land of TDD, which is a land of functions-of-constants to a Land of Quantified Variables (like “there exists an element of the collection such that…”). That way lies madness.

A sort of thought about interaction (and perhaps state-based) tests

This here post is about making tests terse by specifying what has happened instead of (as in interaction tests) who did it or (as in state-based tests) the different kinds of things-it-has-happened-to.

I have a test that says that the Availability object should use the TupleCache object to get particular values for: all animals, animals that are still working, and animals that have been removed from service. If one wants to show animals that can be removed from service, it’s this:

all animals - animals still working - animals already removed from service
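In Ruby, that subtraction can be written directly with Array#-. A minimal sketch, assuming the tuples are plain hashes keyed by :animal_name; the animal "fred" and the variable names are made up for illustration:

```ruby
# Hypothetical tuples; only the :animal_name key is taken from the post.
all_animals    = [{:animal_name => 'out-of-service jake'},
                  {:animal_name => 'working betsy'},
                  {:animal_name => 'fred'}]
still_working  = [{:animal_name => 'working betsy'}]
out_of_service = [{:animal_name => 'out-of-service jake'}]

# Array#- compares elements by value, so equal hashes cancel out.
removable = all_animals - still_working - out_of_service
# => [{:animal_name => 'fred'}]
```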

Here’s a mock-style test that describes how the Availability uses the TupleCache:

    should "use tuple cache to produce a list of animals" do
      @availability.override(mocks(:tuple_cache))
      during {
        @availability.animals_that_can_be_removed_from_service
      }.behold! {
        @tuple_cache.should_receive(:all_animals).once.
                     and_return([{:animal_name => 'out-of-service jake'},
                                 {:animal_name => 'working betsy'},
                                 {:animal_name => 'some...'},
                                 {:animal_name => '...other...'},
                                 {:animal_name => '...animals'}])
        @tuple_cache.should_receive(:animals_still_working_hard_on).once.
                     with(@timeslice.first_date).
                     and_return([{:animal_name => 'working betsy'}])
        @tuple_cache.should_receive(:animals_out_of_service).once.
                     and_return([{:animal_name => 'out-of-service jake'}])
      }
      assert_equal(["...animals", "...other...", "some..."], @result)
    end

I’m not wild about the amount of detail in the test, but let’s leave that to the side. Notice that the results of the test imply that the Availability is turning the tuples (think of them as hashes or dictionaries) into a simple list of strings. Notice also that the list of strings is sorted. Noticing that brings a couple of questions to mind:

  • That sorting - does it use ASCII sorting, which sorts all uppercase characters in front of lowercase? or is it the kind of sorting the users expect (where case is irrelevant)?

  • Are duplicates stripped out of the result?
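Both questions come down to small differences in plain Ruby. A quick sketch with made-up names (nothing here is the Availability code itself):

```ruby
names = ["working betsy", "Zed", "alf", "working betsy"]

# Default String comparison is by character code, so uppercase comes first.
ascii_sorted = names.sort
# => ["Zed", "alf", "working betsy", "working betsy"]

# Sorting on a downcased key gives the order users usually expect.
case_blind = names.sort_by { |name| name.downcase }
# => ["alf", "working betsy", "working betsy", "Zed"]

# Duplicates survive sorting unless they are removed explicitly.
deduped = names.uniq.sort_by { |name| name.downcase }
# => ["alf", "working betsy", "Zed"]
```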

As it happens, I want the responsibility of converting tuples into lists to belong to another object. I’d prefer Availability to have only the responsibility of asking the right questions of the persistent data, not also of massaging the results. I’d like to put that responsibility into a Reshaper object. Here’s an expanded test that does that:

    should "use tuple cache to produce a list of animals" do
      @availability.override(mocks(:tuple_cache, :reshaper))
      during {
        @availability.animals_that_can_be_removed_from_service
      }.behold! {
        @tuple_cache.should_receive(:all_animals).once.
                     and_return(["...tuples-all..."])
        @tuple_cache.should_receive(:animals_still_working_hard_on).once.
                     with(@timeslice.first_date).
                     and_return(["...tuples-work..."])
        @tuple_cache.should_receive(:animals_out_of_service).once.
                     and_return(["...tuples-os..."])
        # New lines
        @reshaper.should_receive(:extract_to_values).once.
                  with(:animal_name, ['...tuples-work...'], ["...tuples-os..."], ["...tuples-all..."]).
                  and_return([["working betsy"], ['out-of-service jake'],
                              ['working betsy', 'out-of-service jake',
                               'some...', '...other...', '...animals']])
        @reshaper.should_receive(:alphasort).once.
                  with(['some...', '...other...', '...animals']).
                  and_return(["...animals", "...other...", "some..."])
      }
      assert_equal(["...animals", "...other...", "some..."], @result)
    end

It shows that the Availability method calls Reshaper methods which we could see (if we looked) guarantee the properties that we want. But I don’t like this test. The relationship between Availability and Reshaper doesn’t seem to me nearly as fundamental as that between Availability and TupleCache. And I hate that the general idea of “convert a pile of tuples into a sensible list” is made so specific: it will make maintenance harder. And I’m not thrilled (throughout this test) with the way that the human reader must infer claims about the code from the examples.

So how about this?:

    should "use tuple cache to produce a list of animals" do
      @availability.override(mocks(:tuple_cache))
      during {
        @availability.animals_that_can_be_removed_from_service
      }.behold! {
        @tuple_cache.should_receive(:all_animals).once.
                     and_return([{:animal_name => 'out-of-service jake'},
                                 {:animal_name => 'working betsy'},
                                 {:animal_name => 'some...'},
                                 {:animal_name => '...other...'},
                                 {:animal_name => '...animals'}])
        @tuple_cache.should_receive(:animals_still_working_hard_on).once.
                     with(@timeslice.first_date).
                     and_return([{:animal_name => 'working betsy'}])
        @tuple_cache.should_receive(:animals_out_of_service).once.
                     and_return([{:animal_name => 'out-of-service jake'}])
      }
      assert_equal(["...animals", "...other...", "some..."], @result)
      assert { @result.history.alphasorted }
    end

The last line of the test claims that—at some point in the past—the result list has been “alphasorted”. A list that’s been alphasorted has the properties we want, which we can check by looking at the tests for the Reshaper#alphasort method.

In essence, we check whether at some point in the past the object we’re looking at has been “stamped” with an appropriate description of its properties. Therefore, we don’t have to construct test input that checks the various ways that description can become true - we simply trust earlier tests of what the stamp means.

Here’s code that adds the stamp:

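    require 'ostruct'   # OpenStruct comes from the standard library

    # Give this one object a lazily created "history" record.
    def result.history()
      @history ||= OpenStruct.new
    end
    result.history.alphasorted = true   # stamp it before freezing
    result.freeze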

(Notice that I “freeze” the object. In Ruby, that makes the object immutable. That’s in keeping with my growing conviction that maybe programs should consist of functional code sandwiched between carefully-delimited bits of state-setting code.)
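To make the stamping idea concrete, here is a self-contained sketch. The Reshaper name and its alphasort method come from the tests above, but this body is a guess: it assumes case-insensitive ordering, and it attaches the history with define_singleton_method rather than an instance variable, so nothing has to be set on the array itself.

```ruby
require 'ostruct'

# Hypothetical Reshaper; only the names are taken from the post.
module Reshaper
  # Return a new array sorted without regard to case, stamped and frozen.
  def self.alphasort(names)
    result  = names.sort_by { |name| name.downcase }
    history = OpenStruct.new(:alphasorted => true)
    # The singleton method closes over the history record.
    result.define_singleton_method(:history) { history }
    result.freeze
  end
end

sorted = Reshaper.alphasort(['some...', '...other...', '...animals'])
sorted                      # => ["...animals", "...other...", "some..."]
sorted.history.alphasorted  # => true
sorted.frozen?              # => true
```

A test of Availability could then assert only on result.history.alphasorted and leave the questions about case and ordering to whatever tests cover the sorting method.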

Having said all that, I suspect that the original awkwardness in the tests is a sign that I need a different factoring of responsibilities, rather than making up this elaborate solution. But I haven’t figured out what that factoring should be, so I offer the alternative for consideration.