Two phase release planning

A twitter conversation between Zee Spencer and Ron Jeffries makes me think I’ve never written down my two-phase approach to release planning.

Phase 1: Someone wants something. They have a problem to be solved by a software system. There are probably a few key features that are needed, perhaps buried in a mass of requirements that pretend to more authority than they’ll turn out to have. If the key features aren’t delivered, the system isn’t worth having. (Or the new release of an existing system isn’t worth having - it doesn’t matter.)

The person who wants the system (or wants to sell it) might have a release date in mind. It might be based on real constraints, like the beginning of a school year or the Christmas sales season. Or it might be arbitrary. Doesn’t matter to me.

At this point, the development team’s job is to answer the question: can this team produce a respectable implementation of that system by that date?

“Respectable” is a somewhat tricky idea. One conceptually easy part of it is “provides the key features”. Another is, embarrassingly, fuzzier: more about whether something can be provided that feels like a “whole” rather than a collection of parts. This is the kind of thing that Jeff Patton’s story mapping is about. It also includes Jeff’s idea of the distinction between incremental development (where you develop by bolting on features one at a time) and iterative development (where you get something complete-for-some-real-purpose done quickly but crudely, then add flesh to the walking skeleton.)

The answer to the “can it be done by that date?” question is going to be something of a leap of faith. The estimate that “we can do that by then” isn’t going to be as reliable as, say, the use of yesterday’s weather in iteration planning.

One way to get a better answer is to just go ahead and start the project. As Cem Kaner has said, you know less about the project right now than you ever will again. Spend a month building the product, then ask: can it be built by the desired date? You’ll get a much better answer. [Note: I’d favor really starting to build the project, just as if you’d made a two-year commitment, rather than doing some pilot study. I’d be suspicious that a pilot study would ignore a lot of the grinding details that take up a lot of project time.] The risk here, though, is that an answer of “well, it turns out we won’t be able to do it” will be unacceptable.

(Whether estimating at the beginning or after a month, I’d personally err on the conservative side because my experience is (1) the development team is more likely to be optimistic than pessimistic, and (2) I favor pleasant surprises - “We can do more! Or release sooner!” - over unpleasant ones.)

Phase 2: At this point, there’s a commitment: a respectable product will be released on a particular date. Now those paying for the product have to accept a brute fact: they will not know, until close to that date, just what that product will look like (its feature list). What they do know is that it will be the best product this development team can produce by that date. It’ll be the best product because the team commits to being flexible enough to put features into the product in any order. Therefore, the most valuable features can go in first, then the less valuable ones, then…

So, in this phase, you can stop worrying about anything but the near horizon. For much of the project, all that’s required is occasional stock-taking. (”Do we still think we’ll have a respectable product by the release date?”) Toward the end, there might be some need to predict more precisely than “best possible, given this team and that date”. Not everything that needs to be done before release might be as changeable as the code. (It’s harder to un-train a salesperson than to turn off the code for a feature that won’t be done in time.) But you’ll be in a good position to make good predictions by then.

——–

My success selling this approach has been mixed. People really like the feeling of certainty, even if it’s based on nothing more than a grand collective pretending.

Some thoughts on classes after 18 months of Clojure

I had some thoughts about classes that wouldn’t fit into a talk I’m building about functional programming in Ruby, so I recorded them as a video.

Topics:

  • Using hashes instead of classes.

  • Classes as a documentation tool—specifically, as a way of making functions easy to find.

  • Preferring module inclusion to subclassing (which is akin to preferring adjectives to nouns as a way of organizing the documentation of verbs). (Vaguely similar to duck-typing in Haskell.)

  • Object dot notation as a more readable way of writing function composition. (Similar to the motivation for the -> macro in Clojure or type-directed name resolution in Haskell.)

On mutable state

I’m working on a talk called “Functional Programming for Ruby Programmers”. While doing so, I’ve somewhat changed my opinion about mutable state. Here’s my argument, for your commentary. Being new, it’s probably at best half-baked.

  1. Back in the early days of AIDS (the disease), I remember a blood-supply advocate saying “You’ve slept with everyone that everyone you’ve slept with ever slept with.”

    The nice thing about immutable state is that it stays virginal and knowable. It is what it was when it was created. With mutable state, it’s more like “Your code might be infected by any code that ever touched your data before (or after!) you first hooked up with it.”

    However, I don’t personally find that a huge problem in programming, debugging, or understanding code. There are other problems I’d rather see fixed first.

  2. It’s often said that “code with immutable state is easier to reason about”. I realized a long time ago that there’s some sleight-of-common-language going on there, like the way people who wanted code to be less branchy got rhetorical leverage by renaming branchiness “complexity“.

    In the claim, “reason about” is (I believe) being taken to be synonymous with “prove theorems about”. Pace a whole long trend in artificial intelligence, I don’t believe theorem proving is the basis for, nor a good analogy to, reasoning. Thus I took the “reason about” statement to be a sort of solipsistic wankery from people with roots in the theorem-proving community, not an argument that should sway me, Pragmatic Man!, who left the world of proofs-of-design-correctness around 1984.

    What I’d forgotten is that optimization is theorem proving. One can only optimize if there’s a proof that the optimized code computes the same result as the original. If you consider the desire to implement, say, lazy sequences efficiently, you see how the guarantees that immutability gives are important. Ditto for automatically spreading work to multiple cores.

  3. So:

    My previous attitude was pretty much “Yeah, I have a preference for avoiding mutable state. Being immutable isn’t a huge win, but as long as I have people providing me with data structures like zippers, it’s not really any harder than fooling with mutable state. Still, I don’t see why I need an immutable language. If I don’t want to mutate my hashmaps, I just won’t mutate my hashmaps.”

    My new attitude is: “Oh! I see why I need an immutable language.”

I still don’t make a huge deal about immutability. Its benefit is greater than its cost, sure, but it’s not the Thing That Will Save Us.

Here are three Ruby functions…

Here are three Ruby functions. Each solves this problem: “You are given a starting and ending date and an increment in days. Produce all incremental dates that don’t include the starting date but may include the ending date. More formally: produce a list of all the dates such that for some n >= 1, date = starting_date + (increment * n) && date < = ending_date.

Solution 1:

Solution 2:

Solution 3 depends on a lazily function that produces an unbounded list from a starting element and a next-element function. Here’s a use of lazily:

As a lazy sequence, integers both (1) doesn’t do the work of calculating the ith element unless a client of integers asks for it, and also (2) doesn’t waste effort calculationg any intermediate values more than once.

Solution 3:

The third solution seems “intuitively” better to me, but I’m having difficulty explaining why.

The first solution fails on three aesthetic grounds:

  • It lacks decomposability. There’s no piece that can be ripped out and used in isolation. (For example, the body of the loop both creates a new element and updates an intermediate value.)

  • It lacks flow. It’s pleasing when you can view a computation as flowing a data structure through a series of functions, each of which changes its “shape” to convert a lump of coal into a diamond.

  • It has wasted motion: it puts an element at the front of the array, then throws it away. (Note: you can eliminate that by having current start out as exclusive+increment but that code duplicates the later +=increment. Arguably, that duplicated increment-date action is wasted (programmer) motion, in the sense that the same action is done twice. (Or: don’t repeat yourself / Eliminate duplication.))

The second solution has flow of values through functions, but it wastes a lot of motion. A bunch of dates are created, only to be thrown away in the next step of the computation. Also, in some way I cannot clearly express, it seems wrong to stick the inclusive_end between the exclusive_start and the increment, given that the latter two are what was originally presented to the user and the inclusive_end is a user choice. (Therefore shouldn’t the exclusive_start and increment be more visually bound together than this solution does?)

The third solution …

  • … is decomposable: the sequence of dates is distinct from the decision about which subset to use. (You could, for example, pass in the whole lazy sequence instead of a exclusive_start/increment pair, something that couldn’t be done with the other solutions.)

  • … eliminates wasted work, in that only the dates that are required are generated. (Well, it does store away a first date — excluded_start — that is then dropped. But it doesn’t create an excess new date.)

  • … has the same feel of a data structure flowing through functions that #2 has.

So: #3 seems best to me, but the advantages over the other two seem unconvincing (especially given that programmers of my generation are likely to see closure-evaluation-requiring-heap-allocation-because-of-who-knows-what-bound-variables as scarily expensive).

Have you better arguments? Can you refute my arguments?

I’m trying to show the virtues of a lazy functional style. Perhaps this is a bad example? [It’s a real one, though, where I really do prefer the third solution.]

TDD Workflow (Sinatra / Haml / jQuery) Part 1

Introduction

This is a draft. Worth continuing the series? Let me know.

Critter4Us is a webapp used to schedule teaching animals at the University of Illinois Vet School. Its original version was a Ruby/Sinatra application with a Cappuccino front end. Cappuccino lets you write desktop-like webapps using a framework modeled after Apple’s Cocoa. I chose it for two reasons: it made it easy to test front-end code headlessly (which was harder back then than it is now), and it let me reuse my RubyCocoa experience.

Earlier this year, I decided it was time for another bout of experimentation. I decided to switch from Cappuccino to jQuery, Coffeescript, Haml because I thought they were technologies I should know and because they’d force me to develop a new TDD workflow. I’d never gotten comfortable with the testing—or for that matter, any of the design—of “traditional” front-end webapp code and the parts of the backend from the controller layer up.

I now think I’ve reached a plateau at which I’m temporarily comfortable, so this is a good time to report. Other people might find the approach and the tooling useful. And other people might explain places where my inexperience has led me astray.
Read the rest of this entry »

Anarchy of Evasion

“The anarchy of evasion” is a term that (I think) I read a few years ago. I occasionally search for it, but I’ve never been able to find anything about it again. As I recall, the idea is a reaction to the fact that an anarchist movement that forms an independent new society or transforms an existing society is… hard. So an alternative is to live within an existing society but try to make as many as possible of your activities and interactions run according to the principles you wished the whole society ran with. You behave as you wish, but in a way and context that increases the chance you won’t be noticed and therefore stopped.

The idea appeals to me, perhaps because I’ve always liked the idea of complex and interesting things happening unnoticed in the cracks. Probably a result of reading The Littles as a kid and books like You’re All Alone and Neverwhere later.

Parallels with things like open source, intentional communities, squatting, and freaks and geeks in US high schools are left as an exercise for the reader.

If anyone has links to the idea, let me know.

Using functional style in a Ruby webapp

Motivation

Consider a Ruby backend that communicates with its frontend via JSON. It sends (and perhaps receives) strings like this:

Let’s suppose it also communicates with a relational database. A simple translation of query results into Ruby looks like this:

(I’m using the Sequel gem to talk to Postgres.)

On the face of it, it seems odd for our code to receive dumb hashes and arrays, laboriously turn them into model objects with rich behavior, fling some messages at them to transform their state, and then convert the resulting object graph back into dumb hashes and arrays. There are strong historical reasons for that choice—see Fowler’s Patterns of Enterprise Application Architecture—but I’m starting to wonder if it’s as clear a default choice as it used to be. Perhaps a functional approach could work well:

  • Functional programs focus on the flow of data through code, rather than on objects with changing state. The former seems more of a match for a typical webapp.

  • It’s common in functional languages to lean toward a few core datatypes—like hashes and arrays—that are operated on by a wealth of functions. We could skip the conversion step into objects. Rather than having to deal with the leaky abstraction of an object-relational mapping layer, we’d embrace the nature of our data.

Seems plausible, I’ve been thinking. However, I’ve never been wildly good at understanding the problems of an approach just by thinking about it. It’s more efficient for me to learn by doing. So I’ve decided to strangle an application whose communication with its database is, um, labored.

I’m going to concentrate on two things:

  • Structuring the code. More than a year of work on Midje has left me still unhappy about the organization of its code, despite my using Kevin Lawrence’s guideline: if you have trouble finding a piece of code, move it to where you first looked. I have some hope that Ruby’s structuring tools (classes, modules, include, etc.) will be useful.

  • Dependencies. As you’ll see, I’ll be writing code with a lot of temporal coupling. Is that and other kinds of coupling dooming me to a deeply intertwingled mess that I can’t change safely or quickly?

This blog post is about where I stand so far, after adding just one new feature.
Read the rest of this entry »

Top-down design in “functional classic” programming

While waiting for my product owner to get back to me, I was going through open browser tabs. I read this from Swizec Teller:

The problem is one of turning a list of values, say, [A, B, C, D] into a list of pairs with itself, like so [[A,B], [A,C], [A,D], [B, C], [B,D], [C,D]].

He had problems coming up with a good solution. “I can do that!” I said, launched a Clojure REPL, and started typing the whole function out. I quickly got bogged down.

This, I think, is a problem with a common functional approach. Even with this kind of problem—the sort of list manipulation that functional programming lives for—there’s a temptation to build a single function bottom up. “I need to create the tails of the sequence,” I thought, “because I know roughly how I’ll use them.”

For me (and let’s not get into all that again), it usually works better to go top down, mainly because it lets me think of discrete, meaningful functions, give them names, and write facts about how they relate to one another. So that’s what I did.

First, what am I trying to do? Not create all the pairs, but only ones in which an element is combined with another element further down the list. Like this:

As I often do when doing list-manipulation problems, I lay things out to visually emphasize the “shape” of the solution. That helps me see more clearly what has to be done to create that solution. There’s one set of pairs, each headed by the first element of the list, then another set, each headed by the second element. That is, I can say the above fact is true provided two other facts are true:

It’s easy to see how I get the heads—just map down the list (maybe I have to worry about the last element, maybe not—I’ll worry about it if it comes up). What are those heads combined with? Just the successive tails of the original list: [2 3], [3]. I’ll assume there’s a function that does that for me. That gives me this entire fact-about-the-world-of-this-program:

Given all that, downward-pairs is easy enough to write:

It’s important to me that I have reached a stable point here. When building up a complicated function from snippets in a REPL, I more often create the wrong snippets than when I define what those snippets need to do via a function that uses them. (I’m not saying that I never backtrack: I might find a subfunction too hard to write, or I may have carved up the world in a way that leads to gross inefficiency, etc. I’m saying that I seem to end up with correct answers faster this way.)

Now I just have to solve two simpler problems. Tails I’ve done before, and here’s a solution I like:

For those who don’t know Clojure well, this produces this sequence of calls to drop:

This does roughly the same work as an iteration would do because of how laziness is implemented.

That leaves me headed-pairs, which is a pretty straightforward map:

This seems like a reasonable solution, doesn’t strike me as being terribly inefficient (given laziness), has a readability that I like, doubtless took me less time than developing it bottom-up would have, and comes with tests (including an end-to-end test that I won’t bother showing).

The whole solution is here.

UPDATE: Yes, I could have used do or even gotten explicit with the sequence-m monad, but that wouldn’t have addressed the original poster’s point, which I took to be how to think about functional problems.

How mocks can cut down on test maintenance

After around 11 months of not working on it, I needed to make a change to Critter4us, an app I wrote for the University of Illinois vet school. The change was simple. When I tried to push it to Heroku, though, I discovered that my Ruby gems were too out of date. So, I ended up upgrading from Ruby 1.8 to 1.9, to Sinatra 1.3 from a Sinatra less than 1.0, to a modern version of Rack, etc. etc. In essence, I replaced all the turtles upon which my code-world was resting. There were some backwards-compatibility problems.

One incompatibility was that I was sending an incorrectly formatted URI to fetch some JSON data. The old version of Rack accepted it, but the new one rejected it. The easy fix was to split what had been a single `timeslice` parameter up into multiple parameters. [Update: I later did something more sensible, but it doesn’t affect the point of this post.] “Crap!”, I thought. “I’m going to have to convert who knows how much test data.” But I was pleased when I looked at the first test and saw this:

The key point here is that neither the format of the URI parameters nor the resulting timeslice object is given in its real form. Instead, they’re represented by strings that basically name their type. (In my Clojure testing framework, Midje, I refer to these things as “metaconstants“.)

The only responsibility this code has toward timeslices is to pass them to another object. That object, the `internalizer`, has the responsibility for understanding formats. The test (and code) change is trivial:

The test is even (and appropriately) less specific than before. It says only that the GET parameters (a hash) will contain some key/value pairs of use to the internalizer. It’s up to the internalizer to know which those are and do the right thing.

The point here is that the size of the test change is in keeping with the size of the code change. It is unaffected by the nature of the change to the data—which is as it should be.

This application is the one where I finally made the important decision to use mocks heavily in the Freeman and Pryce “London” style and—most importantly—to not fall into the trap of thinking “Mocks are stupid!” when I ran into problems. Instead, I said “I’m stupid!” and, working on that assumption, figured out what I was doing wrong.

I made that decision halfway through writing the app. One of the happy results of the mocking that followed was that a vast amount of test setup code devoted to constructing complete data structures went away. No more “fixtures” or “object mothers” or “test factories.”

A postscript on expressiveness and performance

In some way, my original post makes it seem as if normal programmers are like the drug dealer D’Angelo Barksdale eating in a fancy restaurant in “The Wire“—everyone stared, the waiter rubbed D’Angelo’s ignorance of fine-restaurant customs in his face, and D’Angelo’s introspective attitude doesn’t hide a desperate desire to leave. Well, yes (though nothing in a programmer’s life is as bad as anything in a “The Wire” character’s life), but there’s more to it.

I think Clojure already suffers from being a language for hardcore programmers to solve hardcore problems. For example, that hardcore attitude makes things like nice stack traces seem unimportant. More: fishing through stack traces can almost become a perverse right of passage. Do you have what it takes to program in Clojure, punk?

The history of programming, as I’ve lived it, has been a march where expressiveness led the way and performance followed. I came of age when the big debate was whether us new-fangled C programmers and our limited compilers could ever replace the skilled assembly-language programmer. C won the debate. When Java came out, the debate was whether a language with byte codes and garbage collection could be fast enough to be a justifiable choice over C/C++. Java won the debate. Ruby is still being held up as too slow in comparison with Java, but even the pokey MRI runtime is fast enough for a whole lot of apps. Perhaps the emerging consensus is that Twitter-esque success will require migrating from Ruby to JVM infrastructure—and that’s exactly the kind of problem you want to have.

Against that background, it’s notable that Clojure 1.3 is taking a step backward in expressiveness in the service of performance. (By this, I mostly mean that performance—calling through vars, arithmetic operations—is the default and people who are worried about rebinding and bignums have to act specially. So my super-cool introduction-to-Clojure “Let’s take the sum of the first 3000 elements of an infinite sequence of factorials” example will become less impressive because arithmetic overflow is now in the programmer’s face.) That’s worrisome. There’s not much new in 1.3 to tempt the hitherto-reluctant Java or Ruby programmer. And unless I care about performance or am a utility library maintainer, I don’t see very much in 1.3 to tempt me away from 1.2.

Now, 1.3 is not a huge hit to expressiveness, and Clojure already has pretty much all the expressive language constructs that have stood the test of time (except, I suppose, Haskell-style pattern matching). So what else is left for someone of Rich Hickey’s talents to work on but performance? Documentation?

That’s a good point, but I worry about how performance will drive the constructs available to Clojure users. For example, multimethods feel a bit orphaned because of the drive to take advantage of how good the JVM is at type dispatch. There’s a whole parallel structure of genericity that’s getting the attention. And conj is a historical example: it’s hard to explain to someone how these two things both make sense:

user=> (conj [1 2 3] 4)
[1 2 3 4]
user=> (conj (map identity [1 2 3]) 4)
(4 1 2 3)

As with arithmetic overflow, it strikes those not performance-minded (those not hardcore) as odd to have lots of generic operations on sequences except for the case of putting something on the end. “So I have to keep track of what’s a vector? In a language that creates lazy sequences whenever it gets an excuse to?”

Catering to performance increases the cognitive load a language puts on its users. Again, the hardcore programmer will take it in stride. But Lisp and functional languages are different enough that I think Clojure needs more focused attention on the not-hardcore programmers.

Now, the quick retort is that Clojure isn’t my language. It’s a gift from Rich Hickie to the world. Who am I to demand more from him? That’s a good retort. But my counter-retort is that I demand nothing. I only suggest, based on my experience with the adoption of ideas and languages. It’s hard making something. More information (from users like me) is probably better than less.