“Editing” trees in Clojure with clojure.zip
Clojure.zip is a library that lets you move around freely within trees and also create changed copies of them. This is a tutorial I wish I’d had when I started using it.
In this tutorial, I’ll use sequences as trees. You can create your own kind of trees if you want, but I won’t cover that.
Here’s what we’ll be working with:
user=> (def original [1 '(a b c) 2]) #'user/original
It’s a tree whose root is a vector with three children: 1, the subtree (a,b,c)
and 2. We need to convert it into some sort of data structure that allows free movement. That’s done like this:
user=> (require '[clojure.zip :as zip]) nil user=> (def root-loc (zip/seq-zip (seq original))) #'user/root-loc
Notice that I used the alias zip
. If you refer
or use
clojure.zip, you’ll find yourself overwriting useful functions like clojure.core/next
.
Notice also that I explicitly wrapped the tree in a seq. If you use seq-zip on an unwrapped vector, you’ll get confusing results.
At this point, I’d describe root-loc
as “the location (or loc) of the root of the tomorrow tree.” I say “tomorrow tree” because it’s not a tree itself, but something that can later be converted into a tree. In reality, root-loc
names both the loc and the tomorrow tree, bundled up together, but I think it most straightforward to leave the tomorrow tree implicit.
(The actual data structure is called a “zipper”, which is a decent analogy for the actual implementation but didn’t help me understand how to use the library.)
Moving around inside a tomorrow tree
With the loc, we can move around:
user=> original [1 (a b c) 2] user=> (zip/node (zip/down root-loc)) 1
zip/down
moves to the leftmost child of the current loc and returns that child’s loc. zip/node
gives you the subtree of the original tree corresponding to its loc argument. It’s one of the main ways you get parts of a tree—regular lists, vectors, and the like—”out of” a tomorrow tree.
The ->
macro makes movement in the tomorrow tree much easier to understand:
user=> (-> root-loc zip/down zip/right zip/node) (a b c) user=> (-> root-loc zip/down zip/right zip/down zip/right zip/node) b
Nevertheless, I always wrap anything but the simplest traversals in their own functions.
Here are some other movement functions for you to try: up
, left
, rightmost
, and leftmost
. Beware of (arguable) inconsistencies in the handling of edge cases. For example, right
and rightmost
behave differently when moving “off the end” of a list of siblings:
user=> original [1 (a b c) 2] user=> (def last-one (-> root-loc zip/down zip/right zip/right)) #'user/last-one user=> (zip/node last-one) 2 user=> (-> last-one zip/right) nil ; off into nothingness user=> (-> last-one zip/rightmost zip/node) 2 ; stays put
Parts of a tree
In addition to zip/node
, there are other functions for recovering parts of the original tree. All work relative to the current loc.
user=> (def b (-> root-loc zip/down zip/right zip/down zip/right)) #'user/b user=> (zip/node b) b user=> (zip/lefts b) (a) user=> (zip/rights b) (c)
An interesting one shows all subtrees from the root of the tree down to just above the current loc:
user=> (zip/path b) [ (1 (a b c) 2) (a b c) ]
Changing the tree
A number of functions take a current loc (and associated tomorrow tree) and produce a new loc inside a different tomorrow tree. For example, let’s delete the ‘(a b c) subtree:
user=> (def loc-in-new-tree (zip/remove (zip/up b))) #'user/loc-in-new-tree
How does the tree represented by this new tomorrow tree differ from the original
tree? We can see that with the root
function, which applies zip/node
to the root of the tomorrow tree:
user=> (zip/root loc-in-new-tree) (1 2) user=> original [1 (a b c) 2]
Where, exactly, is the new location?
user=> (zip/node loc-in-new-tree) 1
The new loc has “backed up” from its previous version. (That’s not an exact enough description, but it’ll do for a few more paragraphs.) The other editing functions return an “unchanged” loc (except in the sense that it’s pointing into a new tomorrow tree with a changed structure): insert-left
, insert-right
, replace
, edit
, insert-child
, and append-child
. Try them out.
In which I reveal I’m slow
At first, I easily forgot that these functions create a new tomorrow tree and don’t really “replace” or “insert” or “edit” parts of the old one. That is:
user=> (zip/root loc-in-new-tree) (1 2) ; I see that I've edited the tree. user=> (zip/root b) (1 (a b c) 2) ; Wait - I thought I changed the tree?!
“Well, duh”, you might say, “it’s a functional language with immutable state, so of course it doesn’t change the old tree.” You’re absolutely right, but it was surprisingly easy for old habits to sneak up on me. So, two rules:
-
If you ever fail to use the return value of one of these functions, you’re doing it wrong.
-
If you ever write code like this:
(let [stashed-location (zip/whatever ...)] ... make "changes" ... ... use stashed location ...)
you might be doing it wrong. Make sure that you’re not unthinkingly assuming that later changes to “the” tree are reflected in your
stashed-location
.
Whole-tree editing
Here’s an example of printing out a whole tree, one node at a time:
(defn print-tree [original] (loop [loc (zip/seq-zip (seq original))] (if (zip/end? loc) (zip/root loc) (recur (zip/next (do (println (zip/node loc)) loc))))))
This is an ordinary recursive loop. It visits each location in the tomorrow tree, stopping when zip/end?
is true. zip/next
returns a new current loc that is the next one in the tomorrow tree, where “next” means “in preorder depth-first order”. To see that, here’s what one run of the function prints:
user=> (print-tree [1 '(a (i ii iii) c) 2]) (1 (a (i ii iii) c) 2) 1 (a (i ii iii) c) a (i ii iii) i ii iii c 2
To make changes to the tree, add a cond
. The default case should hand the current loc to zip/next
. The other cases should yield a loc pointing into a changed copy of the tomorrow tree:
(loop [loc (zip/seq-zip original-tree)] (if (zip/end?> loc) (zip/root loc) (recur (zip/next (cond (subtree-to-change? loc) (modify-subtree loc) … :else loc)))))
The tricky bit is making sure that modify-subtree
returns a loc just before the next loc of interest (in a depth-first traversal). (It has to be before so that zip/next
takes you to the interesting loc.) To get to that loc, you can use any of the movement functions (zip/next
, zip/up
, zip/rightmost
, and so on). There’s also a zip/prev
that returns the loc just before the current one.
To keep from confusing myself, I write little helper functions, each named by what it does and what loc it returns. So, for example, I have one function that glories in this name:
(defn wrap-with-expect__at-rightmost-wrapped-location [loc] (assert (start-of-arrow-sequence? loc)) (let [right-hand (-> loc zip/right zip/right) edited-loc (zip/edit loc (fn [loc] `(expect ~loc => ~(zip/node right-hand))))] (-> edited-loc zip/right zip/right zip/remove zip/remove)))
It takes a form like ... (f 1) => (+ 2 3) :next
, with the current loc being (f 1)
, and turns it into this:
... (expect (f 1) => (+ 2 3)) :next
… with the current loc being at the 3 so that the zip/next
returns a loc at :next
. This positioning works because I use zip/remove
, which returns a loc that’s “backed up” to the previous loc in a depth-first traversal. (That’s the fix to my earlier imprecision about what zip/remove
returns. It’s not the previous loc at the same level, which—for the sake of a simpler explanation—I earlier allowed you to assume.)
By building and testing these little functions first, my main cond-loop is easier to get right. You can see some more examples in my test package, Midje. You can look at both the tests and the code.
September 1st, 2010 at 3:03 pm
[…] “Editing” trees in Clojure with clojure.zip (tags: clojure zip tree) […]
September 2nd, 2010 at 2:02 pm
This is great and very helpful. One nitpick that tripped me up though…in the function passed to zip/edit, that function takes a *node*, not a *loc* so I would change that from
(zip/edit loc (fn [loc] ….))
to
(zip/edit loc (fn [node] …))
If you look at the source for zip/edit, you can see that it does (node loc) before calling the f you pass it. It also expects you to *return* a node from f, not a loc. zip/edit will return a loc though.
October 26th, 2010 at 12:24 am
[…] zipper tutorial (here, via […]
July 29th, 2011 at 12:11 pm
[…] examples of Marick’s teaching style, read his explanations of the Zipper and Enlive libraries or watch his videos on Top-down TDD in Clojure and […]
September 14th, 2011 at 6:03 pm
[…] “Editing” trees in Clojure with clojure.zip by Brian Marick. […]
September 29th, 2011 at 7:12 pm
[…] response will prompt me to reduce my effort around Clojure—to resist the temptation to write another tutorial, to cut back on new Midje features (but not on fixes and the 1.3 port, I hasten to […]
December 29th, 2011 at 10:39 am
[…] isn’t a huge win, but as long as I have people providing me with data structures like zippers, it’s not really any harder than fooling with mutable state. But I don’t see why I need […]