Tuesday, May 25, 2010

Huet's zipper

In this post, rather than pushing the boundaries, i'm going to take a step back and review Huet's original proposal for the Zipper. His idea was published in this functional pearl. My motivations for doing this are manifold. Firstly, i don't think anyone has written up a transliteration into Scala, yet (though, the ScalaZ guys do have an implementation). Secondly, i'm curious regarding how much interest in and understanding of the Zipper there is in the Scala community. Thirdly, this is in preparation for a radical idea in which nominal access (access by name) and positional access (access by location) are put into correspondence by construction. This leads to some pretty wild stuff that i've been contemplating for nearly two years, now.

The origin of the zipper rests in the desire to provide an efficient functional representation of a "structure" editor. For example, we might consider navigation and destructive modification of a tree. In a functional representation destructive operations need to be replaced by copying. Done naively, this can be very expensive. In his functional pearl Huet describes a generic approach to the problem of an applicative structure editor. He dubbed it the zipper. The key idea is to denote the location of a position in a tree by splitting it into two pieces: the subtree of focus and the context in which it appears.


To render this idea in Scala suppose that we have modeled the
type of a tree as






 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
trait Tree[A]
class TreeItem[A]( val item : A ) extends Tree[A]
object TreeItem {
def apply[A]( item : A ) = { new TreeItem( item ) }
def unapply[A]( tree : TreeItem[A] )
: Option[( A )] = {
Some( ( tree.item ) )
}
}
class TreeSection[A](
val section: List[Tree[A]]
) extends Tree[A]
object TreeSection {
def apply[A]( section : List[Tree[A]] ) = { new TreeSection( section ) }
def unapply[A]( tree : TreeSection[A] )
: Option[( List[Tree[A]] )] = {
Some( ( tree.section ) )
}
}




Then we might model the context by hand as



 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
trait Context[A]
case class Top[A]( ) extends Context[A]
class TreeContext[A](
val left : List[Tree[A]],
val ctxt : Context[A],
val right : List[Tree[A]]
) extends Context[A]
object TreeContext {
def apply[A](
left : List[Tree[A]],
ctxt : Context[A],
right : List[Tree[A]] ) = {
new TreeContext( left, ctxt, right )
}
def unapply[A]( ctxt : TreeContext[A] )
: Option[( List[Tree[A]], Context[A], List[Tree[A]] )] = {
Some( ( ctxt.left, ctxt.ctxt, ctxt.right ) )
}
}



The context is "tree to the left", "path to a place where we might put a subtree", "tree to the right". Coding the intuitive idea of location is now straightforward. It's a pair: the context together with the subtree that "plugs into it".



 1
2
3
4
5
6
7
8
9
10
11
12
13
class Location[A](
val tree : Tree[A],
val ctxt : Context[A]
)
object Location {
def apply[A]( tree : Tree[A], ctxt : Context[A] ) = {
new Location( tree, ctxt )
}
def unapply[A]( loc : Location[A] )
: Option[( Tree[A], Context[A] )] = {
Some( ( loc.tree, loc.ctxt ) )
}
}



This encoding allows us some generic navigation code.



 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
trait ZipperNavigation[A] {
def left( location : Location[A] ) : Location[A] = {
location match {
case Location( _, Top( ) ) => {
throw new Exception( "left of top" )
}
case Location( t, TreeContext( l :: left, up, right ) ) => {
Location( l, TreeContext( left, up, t :: right ) )
}
case Location( t, TreeContext( Nil, up, right ) ) => {
throw new Exception( "left of first" )
}
}
}
def right( location : Location[A] ) : Location[A] = {
location match {
case Location( _, Top( ) ) => {
throw new Exception( "right of top" )
}
case Location( t, TreeContext( left, up, r :: right ) ) => {
Location( r, TreeContext( t :: left, up, right ) )
}
case Location( t, _ ) => {
throw new Exception( "right of last" )
}
}
}
def up( location : Location[A] ) : Location[A] = {
location match {
case Location( _, Top( ) ) => {
throw new Exception( "up of top" )
}
case Location( t, TreeContext( left, up, right ) ) => {
Location( TreeSection( left.reverse ::: ( t :: right ) ),
up )
}
}
}
def down( location : Location[A] ) : Location[A] = {
location match {
case Location( TreeItem( _ ), _ ) => {
throw new Exception( "down of item" )
}
case Location( TreeSection( Nil ), ctxt ) => {
throw new Exception( "down of empty" )
}
case Location( TreeSection( u :: trees ), ctxt ) => {
Location( u, TreeContext( Nil, ctxt, trees ) )
}
}
}
}



Given the ubiquity of tree-like data structures in computing we can put this to work in lots of situations. Consider, for example, a simple abstract syntax tree representation.




 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
case class Token[A](
override item : A
) extends TreeItem[A]( item )
case class AST[A](
override section : List[Tree[A]]
) extends TreeSection[A]( section )

object Exercise extends ZipperNavigation[String] {
val arithmeticExpr1 =
AST[String](
List(
AST[String](
List(
Token[String]( "a" ),
Token[String]( "*" ),
Token[String]( "b" )
)
),
Token[String]( "+" ),
AST[String](
List(
Token[String]( "c" ),
Token[String]( "*" ),
Token[String]( "d" )
)
)
)
)
val locationOf2ndMult =
Location[String](
Token[String]( "*" ),
TreeContext[String](
List( Token[String]( "c" ) ),
TreeContext[String](
List(
Token[String]( "+" ),
AST[String](
List(
Token[String]( "a" ),
Token[String]( "*" ),
Token[String]( "b" )
)
)
),
Top( ),
List( )
),
List( Token[String]( "d" ) )
)
)
def show( depth : Int )( tree : Tree[String] ) : Unit = {
tree match {
case TreeItem( item : String ) => {
val indent =
( "" /: (1 to depth) )( { ( acc, d ) => acc + " " } )
println( indent + "Leaf : " + item )
}
case TreeSection( section : List[Tree[String]] ) => {
for( t <- section ){ show( depth + 2 )( t ) }
}
}
}
}



An instance of an AST for the arithmetic expression a*b + c*d would be given by the first value in the object above; and the location of the second multiplication sign, "*", is represented by the second value.


Now, we can navigate around this example.



 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
scala> import Exercise._
import Exercise._
import Exercise._

scala> show( 0 )( arithmeticExpr1 )
show( 0 )( arithmeticExpr1 )
Leaf : a
Leaf : *
Leaf : b
Leaf : +
Leaf : c
Leaf : *
Leaf : d

scala> show( 0 )( locationOf2ndMult.tree )
show( 0 )( locationOf2ndMult.tree )
Leaf : *

scala> show( 0 )( up( locationOf2ndMult ).tree )
show( 0 )( up( locationOf2ndMult ).tree )
Leaf : c
Leaf : *
Leaf : d

scala> show( 0 )( up( up( locationOf2ndMult ) ).tree )
show( 0 )( up( up( locationOf2ndMult ) ).tree )
Leaf : a
Leaf : *
Leaf : b
Leaf : +
Leaf : c
Leaf : *
Leaf : d

scala> show( 0 )( up( up( up( locationOf2ndMult ) ) ).tree )
show( 0 )( up( up( up( locationOf2ndMult ) ) ).tree )
java.lang.Exception: up of top
...
scala>




Of course, the real desiderata are the mutation functions.


 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
trait ZipperMutation[A] {
def update(
location : Location[A],
tree : Tree[A]
) : Location[A] = {
location match {
case Location( _, ctxt ) =>
Location( tree, ctxt )
}
}
def insertRight(
location : Location[A],
tree : Tree[A]
) : Location[A] = {
location match {
case Location( _, Top( ) ) => {
throw new Exception( "insert of top" )
}
case Location(
curr,
TreeContext( left, up, right )
) => {
Location(
curr,
TreeContext( left, up, tree :: right )
)
}
}
}
def insertLeft(
location : Location[A], tree : Tree[A]
) : Location[A] = {
location match {
case Location( _, Top( ) ) => {
throw new Exception( "insert of top" )
}
case Location(
curr,
TreeContext( left, up, right )
) => {
Location(
curr,
TreeContext( tree :: left, up, right )
)
}
}
}
def insertDown(
location : Location[A], tree : Tree[A]
) : Location[A] = {
location match {
case Location( TreeItem( _ ), _ ) => {
throw new Exception( "down of item" )
}
case Location(
TreeSection( progeny ),
ctxt
) => {
Location(
tree,
TreeContext( Nil, ctxt, progeny )
)
}
}
}
def delete(
location : Location[A], tree : Tree[A]
) : Location[A] = {
location match {
case Location( _, Top( ) ) => {
throw new Exception( "delete of top" )
}
case Location(
_,
TreeContext( left, up, r :: right )
) => {
Location(
r,
TreeContext( left, up, right )
)
}
case Location(
_,
TreeContext( l :: left, up, Nil )
) => {
Location(
l,
TreeContext( left, up, Nil )
)
}
case Location(
_,
TreeContext( Nil, up, Nil )
) => {
Location( TreeSection( Nil ), up )
}
}
}
}



In the next post we'll show how we don't have to design the Context and Location types by hand. They can be derived from the container type -- in this case, Tree.

13 comments:

Jesper Nordenberg said...

Interesting concept, although I don't fully grasp it yet. Could it, for example, be used for trees of type safe record-like objects with support for nominal access and generic update methods (with the help of HList and other Scala type hackage)?

I'm thinking something like this (in pseudo code):

type Trunk { contents : Any }
type Car { trunk : Trunk }
c = Car { trunk = Trunk { contents = "dog" } }
c2 = c.trunk.contents <- "cat"

In Scala today you have to write something like this (assuming that corresponding case classes are defined):

c2 = c.copy(trunk = c.trunk.copy(contents = "cat"))

which becomes rather tiresome and error prone for larger data structures.

Maybe I'm out swimming on deep water here ;)

leithaus said...

Dear Jesper, That's exactly the situation zipper is intended to address. Best, --greg

Luc Duponcheel said...

great post

the idea can probably be generalized to any polynomial functor

(a binary tree corresponds to the polynomial 1 + a * x^2)

encoding that generalization in Scala is probably more challenging (I did it once in Haskell)

Luc

leithaus said...

Dear Luc, Thanks for the feedback! i implemented the notion of polynomial functor and differentiation here. The next post explores this idea. Best, --greg

Simon Chemouil said...

Great post.

For the record, the Zipper was already presented on Planet Scala blog at least once by Stefan Zeiger proposing an alternative immutable design for the Scala XML library.

http://szeiger.de/blog/2009/12/27/a-zipper-for-scala-xml/

leithaus said...

Dear Simon, Thanks! i didn't know about the Zeiger post on planet Scala. i'll go give it a read. Best wishes, --greg

Rich said...

I'm a bit late to the party, but I'm curious about how to handle the example Jesper suggested.

I love the idea of using a Zipper to update an object graph like that, but I'm just coming up short on exactly how to implement it when the graph is heterogenous, like many object graphs tend to be.

Marius Danciu said...

Greg,

I don't know what took me so long in reading your post in depth. So far I find it to be the best explanation of zippers out there.

Many thanks,
Marius

Marius Danciu said...

Greg,

I don't know what took me so long in reading your post in depth. So far I find it to be the best explanation of zippers out there.

Many thanks,
Marius

Marius Danciu said...

Hi Greg,

I was also playing with Zipper on N-ary trees using a slightly different tree representation. However both my implementation and yours suffers from an interesting issue: lack of persistence after mutations in certain cases.

If you do an update of a node then navigate up and then back down the update is lost. Same for insert ans delete. If you navigate left/right the mutation is correctly reflected since the Context reflects those.

The only solution that I see is to persist mutation when moving up and then back down, is to either create new nodes till the root on each up operation (which can be expensive) or create new parent nodes on the fly as we move up. However it is not enough to do this when moving up, because one can use other routes than moving directly up: such as left -> up -> right. The 'up' here is a sibling on the current node's parent. the 'right' is the current node's parent.

I'd love to see your thoughts on this.

Thanks,
Marius

Marius Danciu said...

Oh another potential solution is to memoize locations.

Marius Danciu said...

Here is an update version that persists the change:

def updateAndPersist(t: Tree[A]): Option[Location[A]] = this match {
case Location(_, NodeContext(l, ploc @ Location(Node(a, _), ctx), r)) =>
for (newp <- ploc.updateAll(Node(a, l ::: List(t) ::: r))) yield {
Location(t, NodeContext(l, newp, r))
}
case Location(_, ctx) => Some(Location(t, ctx))
}


moving up and down would reflect the new node as we essentially build a new tree up to the root.


Similarly we can implement inserts and delete. These new mutators that persists the changes in new trees could be represented as companions for the original Huet's mutators.



Marius

My Notes said...

why the up method written like:
Location( TreeSection( left.reverse ::: ( t :: right ) ),up )

if you move the subtree focus to it's parent, then it's parent got the focus, instead of passing the parent node to the Location, you passed a TreeSection(), it's a group of children of the parent node, not the parent node. Can you explain to me why?