Monday, November 9, 2009

A bridge between nominal and positional information

Many developers have been faced with the design trade-offs around nominal versus positional access. Nominal access allows one to be noncommittal about shape information of data types. For example, if the only access to data is via a collection of functions called cons, head, rest, ... many, many shapes support valid and useful interpretations. On the other hand, this sort of specification doesn't play well with traditional pattern-matching. That is, pattern-matching -- which is traditionally a form of positional access -- nails down the shape of data.

Recently, i found myself writing Scala code in the following pattern.

trait T[P1,...,Pm] {
def location1 : Type1[P1,...,Pm]
def locationN : TypeN[P1,...,Pm]

abstract class C[P1,...,Pm](
location1 : Type1
locationN : TypeN
) extends T

case class CC(
location1 : Type1
locationN : TypeN
) extends C

This spells out a clear relationship between nominal and positional access, but it made me wonder if one could have a tighter, clearer relationship. Could you insist, for example, that all nominal types (the types of things that represent names, variables, etc) encoded positional information?

Before showing how simply this can be done, let me provide a little additional motivation through chemistry -- literally. Walter Fontana once remarked to me that one of the real dividing lines between the practice of alchemy and chemistry was the development of the periodic table. Before the periodic table the names of the elements didn't (obviously) encode structural information about the elements and hence nominal information about mixtures of elements had no predictive power over the properties or behavior of said mixture. After the periodic table the names of elements effectively became their positions in the periodic table which reflected structural information about the elements they named. This enabled the practitioner to write equations of the form

Na + Cl => NaCl

2H + O => H2O

and make a formal theory of their meaning that had predictive properties of the physical world. That was a big step.

In my mind software development suffers because so much rests in the informal or contextual information encoded in nominal access to information -- such as variable names or function names. In point of fact many pieces of code are behaviorally equivalent and not recognized as such because they are encoded via nominal access that obscures this fact. On the other hand, purely positional access can be prohibitively inexpressive. Higher-level languages are clearly more understandable to humans than machine code precisely because of the ability to add contextual information in nominal access -- such as variable names or function names. Therefore, finding better relationships between nominal and positional access -- in particular understanding what the possible set of relations are -- seems like a worthy goal. Is there the equivalent of a transition from software-as-alchemy to software-as-chemistry?

McBride, et al's, recent work on the derivative of regular data types turns out to provide a step in that direction. Let me illustrate the idea via the lambda calculus. We can write down a domain equation for the terms in lambda calculus.

T[N] = N + N x T[N] + T[N] x T[N]

This domain equation can be viewed as a transcription of the standard syntactical presentation of the lambda calculus

T[N] ::= variable N | abstraction N T[N] | application T[N] T[N]

each summand in the top equation corresponds to it's labeled choice in the second. Ironically, reflecting the topic at hand, the first equation is purely structural while the second allows some nominal information to sneak in to help the human understand the contextual relevance of each structural option: the first choice of terms that are just names are variables; the second choice of terms that are functions from names to terms are lambda abstractions; the third choice of pairs of terms are applications.

The corresponding Scala code might look like

// Lambda terms are parametric in a notion of name
trait LambdaTerm[N]

// Variables are a kind of lambda term
case class Variable[N]( n : N )
extends LambdaTerm[N]

// Abstractions are a kind of lambda term
case class Abstraction[N](
formal : N, body : LambdaTerm[N]
) extends LambdaTerm[N]

// Applications are a kind of lambda term
case class Application[N](
fn : LambdaTerm[N], actual : LambdaTerm[N]
) extends LambdaTerm[N]

Note that we have acknowledged the data type is parametric in the notion of name. Whatever type supports the constraints of a nominal theory for the variables in the lambda calculus can be supplied. Now, given this parametric type, T[N], you can calculate the data type of contexts. This data type provides context and path information to allow "plugging in" of term to context. We write this ∂T[N] and refer to McBride's paper as an excellent resource for the reader who wishes to calculate this type for themselves. The key point is that this type represents positional information in type of lambda terms. So, to make a bridge between nominal and positional information all we have to do is to tie the recursive knot.

T[N] = N + N x T[N] + T[N] x T[N] 1
N = ∂T[N]

Note this step is not the end of the story, but rather the beginning. In particular, there are several desiderata.
  • What ties the particular use of a position as a name to something that is coherent or consistent? Specifically, what additional constraints are necessary to ensure that a nominal points to a location in the structure it uses?
  • Without specific additional term structure to make use of the positional information available in the nominal entity this is little more than another method of generating tokens. However, the recursive relationship makes nominal structure sensitive to modification of term structure.
In the rho-calculus, a π-calculus-like theory of processes in which the names of channels are quotations of processes, Radestock and i added the de-quotation of a name as a term. This form addresses the second concern and is easily adapted to the setting here. However, the first concern is really quite new. Of the many directions this question leads i'm captivated by the following.
  • How do DeBruijn indices relate to the solution to these recursive equations?
  • Can the contextual positional information be used to construct a better encoding of ambient calculus into π-calculi?
  • Can the correlation of nominal and positional information give a better way to correlate gravitation to metric?
[1] The reason we consider the data type

T[N] = N + NxT[N] + T[N]xT[N]

where we take a strictly syntactic view of lambda terms as opposed to say

T[N] = N + [N => T[N]] + T[N]xT[N]

is to keep clearly in the regular types differentiable in McBride's original paper.


Jesper Nordenberg said...

Nominal access doesn't just enhance readability (actually it doesn't enhance readability any more than defining index aliases), but also extensibility and re-use. Position access relies on a specific element order, and this is in many cases not only unnecessary, but also hindering.

Or maybe I'm completely misunderstanding your blog post. :)

leithaus said...

Thanks for your comment! There are two complicating factors. The first is that strictly nominal access eliminates pattern matching. If you don't do a lot of pattern-matching in your coding, it might not matter to you, but it does to me. Beyond that, we need to consider what a position is. That's the trick of using the derivative. This gives you a much more sophisticated notion of location in a data structure.

Jesper Nordenberg said...

Is there really any reason why pattern matching can't support nominal access? For example (using Scala syntax):

trait Point {
def x : Int
def y : Int

trait Line {
def start : Point
def end : Point

def f(l : Line) = l match {
case Line { start = Point { x = 10 } } => true
// Or using something similar to structural types:
case { start = { x = 10 } } => true
case _ => false

leithaus said...

It depends on where you want to draw the line. Is this still in the game?

trait Point {
def x( polar : ( Float, Float ) ) : Float
def y( polar : ( Float, Float ) ) : Float

This also touches on so-called active patterns. i've not seen a really clear proposal regarding how all this works. What constitutes reasonable nominal access + pattern-matching, what the semantics are, etc. If you've got a proposal or a reference, i'd love to look at it.

On the other side, we have some pretty cool notions about what a position is that lead to very generic data traversal capability. Further, this ties into behavioral theory in a very nice manner. Have you seen McBride's Clowns to the Left of Me Jokers to the right?

leithaus said...

BTW, you hit on an important point: nominal access and structural types go hand in hand.

Jesper Nordenberg said...

I haven't though or read much about how to implement nominal pattern matching, but at least some basic support seems simple to achieve in Scala. It's probably reasonable (but not necessary) to limit the matching to vals, vars and parameterless methods (which would rule out your example I suppose). There would be no need for unapply methods if matching is limited to public members only. The syntax for variable binding requires some thought, maybe @ can be reused:

case Line { s @ start : Point { x @ x } } if x < 10 => /* use s and x */

Support for structural typing would be nice, but it would lose some type safety compared to specifying type names (as multiple types can have the same member names).

Anyway, just an idea, don't know how useful it would be in practice.

gelisam said...

You seem to think that a value of type ∂T[N] represents a position within a term, but in fact it represents terms with one of the values of type N missing. The location of the hole reveals a position, but the value also reveals all the information around the hole, such as the shape of the term, and the surrounding values of type N.

If you just want to represent positions somehow, there is a much simpler way to do it. In fact, I refer you right back to the Derivatives of Containers paper you have linked. A container value consists of a shape, equipped with a mapping from positions within that shape to values of type X, the element type. That paper already assumes position types; it does not show how to construct them.

A position is like a key in a trie: it describes all the edges you need to follow to arrive at the position. According to this analogy, the type of trees is that of terms with unlabeled variables:

data Term : Set where
var : Term
lam : Term → Term
app : Term → Term → Term

You might have written this as (Term = 1 + Term + Term x Term).

Now, we want to define a type Pos whose values point to the occurrences of the var constructor. But occurrences in which term? You ask "what additional constraints are necessary to ensure that a nominal points to a location in the structure it uses", and that really isn't a new question at all. To ensure that a value of type Pos points within a specific term, you need to define Pos as a dependent type.

data Pos : Term → Set where
here : Pos var
within-body : (body : Term) → Pos body → Pos (lam body)
within-fun : (fun arg : Term) → Pos fun → Pos (app fun arg)
within-arg : (fun arg : Term) → Pos arg → Pos (app fun arg)

This makes a type like "Pos (lam var)" distinct from a type like "Pos (app var var)", preventing you from mistakingly giving a position pointing into "lam var" when you wanted a value pointing within the term "app var var".

But really, you don't want to represent variables using positions. Even less using terms with one hole.

leithaus said...

Hey! Thanks for your comment! i haven't had a chance to read it in detail except for the last point where you claim that i don't want to use positions for vars. In fact, i do! Perhaps you are not familiar with the rho-calculus. It('s syntax) has the following equational characterization

P,Q ::= x?(y)P | x!(P) | P|Q | *x
x,y ::= @P

In this calculus we achieve higher-order capability and get rid of replication and the new operator all for the cost of having names be the quotations of processes. This is a significant bang for buck.

There are several similar benefits from having positions be variables. As i said in the blog post, the idea is merely the beginning of the story, not the end.

gelisam said...

Using quotations for names? What a great idea! Seems inspired by Smullyan's robot names, which I liked a lot.

With this comparison in mind, I think I understand what you want a bit better. Here's part II of my trying to follow your thoughts, and then extending them to their logical conclusion.

You had a lot of success getting rid of variable names of the pi calculus, and now you want to do the same thing with the lambda calculus.

Now of course quotations aren't going to work this time. With the pi calculus, channel names allows a process in an input state to interact with a process in an output state, whereas the lambda calculus allows a term in a value position to interact with a term in a binding position. The critical difference is that the former interaction is between two distinct expressions, whereas the latter is between two parts of the same expression.

So you want something like quotations, but pointing inside an expression, specifically at a binding site within the expression in which the variable is being used. I think that why you think you want positions.

gelisam said...

Here's part III of me trying to follow and extend your thoughts.

Now, let me repeat that taking the derivative of a functor is not a good way to obtain a position type. When you take the derivative of T[N], you poke a hole in one of the Ns. That is, you're taking one of the variables out. So the holes in the expression "lam x (app (lam y y) x)", for example, are as follows (placing "@" in the hole).

1) lam @ (app (lam y y) x)
2) lam x (app (lam @ y) x)
3) lam x (app (lam y @) x)
4) lam x (app (lam y y) @)

Holes (1) and (2) are at binding sites, but holes (3) and (4) are not. Moreover, the type of one-hole contexts is too flexible, as it doesn't allow you to concentrate on the holes of "lam x (app (lam y y) x)", but on the contrary allows any and all expression to by used, as long as it has exactly one hole in it. You correctly pointed out the problem when you wrote "Specifically, what additional constraints are necessary to ensure that a nominal points to a location in the structure it uses?".

Again, the answer is to use a dependent type. You want to use a type like "Hole (lam x (app (lam y y) x))", which specifically states which expressions you are supposed to take ones into. Only values (1) through (4) would be values of this type, while values (5) through (7) below would have type "Hole (lam x (app x x))".

5) lam @ (app x x)
6) lam x (app @ x)
7) lam x (app x @)

Let's stop using variable names for now. We don't want to use names at all, but quotations or positions or something, so let's drop variable names for now and replace them with something else later, when we will have figured out what.

Values of type "Hole (lam . (app (lam . .) .))"
1) lam @ (app (lam . .) .)
2) lam . (app (lam @ .) .)
3) lam . (app (lam . @) .)

Values of type "Hole (lam . (app . .))"
4) lam . (app (lam . .) @)
5) lam @ (app . .)
6) lam . (app @ .)
7) lam . (app . @)

The first thing we want to do is to eliminate all values except for (1), (2) and (5), which are the only values representing binding positions. A natural way to do this is to split variables and holes into binding and referencing variants. To avoid confusion later on, let's use "@" for !-holes and "@@" for ?-holes.

Values of type "Hole! (lam ! (app (lam ! ?) ?))"
1) lam @ (app (lam ! ?) ?)
2) lam ! (app (lam @ ?) ?)

Values of type "Hole? (lam ! (app (lam ! ?) ?))"
3) lam ! (app (lam ! @@) ?)
4) lam ! (app (lam ! ?) @@)

Values of type "Hole! (lam ! (app ? ?))"
5) lam @ (app ? ?)

Values of type "Hole? (lam ! (app ? ?))"
6) lam ! (app @@ ?)
7) lam ! (app ? @@)

Now that we have a proper type for binding type positions, we can change our representation for variable references, keeping our representation of binding sites as is. Let's use the notation "@(...)" to use a value of type "Hole!" as a variable.

gelisam said...

Here's part IV.

Let's try to represent the term "lam x (app (lam y y) x)". Clearly, the variables "x" and "y" stand for the holes (1) and (2), respectively.

lam ! (app (lam ! @2) @1)

Here I have used @1 as a shorthand for "@(lam @ (app (lam ! ?) ?))", the value (1). Without the shorthand, the full representation is the following unwieldy beast.

lam ! (app (lam ! @(lam ! (app (lam @ ?) ?))) @(lam @ (app (lam ! ?) ?)))

We're now really close to our goal, except that we still accept nonsensical expressions like the following, which uses a position outside of its scope.

lam ! (app (lam ! @2) @2)

In order to fix this, we can refine our dependent types even more by parameterizing both on the expression we want to take our holes in, and on the current variable position. How do we represent the current variable position? With values of type "Hole?", of course!

Values of type "Hole! (lam ! (app (lam ! @@) ?))"
8) lam @ (app (lam ! @@) ?)
9) lam ! (app (lam @ @@) ?)

Values of type "Hole! (lam ! (app (lam ! ?) @@))"
10) lam @ (app (lam ! ?) @@)

Values of type "Hole! (lam ! (app @@ ?))"
11) lam @ (app @@ ?)

Values of type "Hole! (lam ! (app ? @@))"
12) lam @ (app ? @@)

Now let's make a series of simplifying transformations to the above solution. First, let's drop "app" completely, keeping only the branch in which the "@@" hole is.

Values of type "Hole! (lam ! (lam ! @@))"
8) lam @ (lam ! @@)
9) lam ! (lam @ @@)

Values of type "Hole! (lam ! @@)"
10) lam @ @@

Values of type "Hole! (lam ! @@)"
11) lam @ @@

Values of type "Hole! (lam ! @@)"
12) lam @ @@

Next, let's replace the series of nested (lam ! ...) by a single number keeping track of the number of levels of nesting.

Values of type "Hole! 2 @@"
8) lam @ (lam ! @@)
9) lam ! (lam @ @@)

Values of type "Hole! 1 @@"
10) lam @ @@

Values of type "Hole! 1 @@"
11) lam @ @@

Values of type "Hole! 1 @@"
12) lam @ @@

It is now apparent that the type "Hole! n @@" has exactly n values. Therefore we can simplify the values by using a simple number between 1 and n to represent the number of nesting layers between the "@" and the "@@".

Values of type "Hole! 2 @@"
8) 2
9) 1

Values of type "Hole! 1 @@"
10) 1

Values of type "Hole! 1 @@"
11) 1

Values of type "Hole! 1 @@"
12) 1

Finally, we have reached a stage where holes and positions do not play any role anymore. We simply have numbers counting the number of surrounding lambdas which lie between the variable and the position it references. This, it turns out, is precisely DeBruijn's notation.

Representation of "lam x (app (lam y y) x)"
13) lam ! (app (lam ! 1) 1)
(note that the two 1s refer to distinct positions)

Representation of "lam x (app x x)"
14) lam ! (app 1 1)

In conclusion, DeBruijn notation is both equivalent and simpler than the (properly restricted) notion of positions. That's why you should want to represent variables using deBruijn indices, not positions nor terms with holes.

leithaus said...

Way cool! Thank you for these posts! i'm going to have to study them carefully.

i surely agree with you that if i just want position, then i can do with somewhat less than the context type ∂T[N]. This was really just a placeholder in my thinking to spur me to come back and do further analysis.

The design direction is not to merely to find a way to tie the recursive knot on lambda variables in terms of lambda term structure. Once i found rho, i found plenty of ways to do that.

The vector of my thinking is genuinely about context and position. Here's a potentially motivating question. It turns out that there is no fully abstract encoding of ambient calculus into pi-calculus. Let us, however, consider a pi-calculus in which names are positions and/or contexts in pi-terms. If we then add term constructor to turn names back into locations we have a means of implementing the ambient in, out and open.

i suspect that we can use this as an intermediate language into which we can facilitate an encoding from pi-calculus to ambient calculus.

There are much more general desiderata i am pursuing; however, i'm a bit reluctant to explain the programme in this venue as it's very ambitious.

leithaus said...

Another quick point: the idea is quite generic. It is not restricted to lambda or pi-calculus. It works for any monad (that is "concrete" enough to essentially be a term algebra).

T[N] = [some term algebra]
N = ∂T[N]

This has more applications than the ones i thought of. A geometric topologist friend of mine was wondering if there was a homology theory hiding in this.

gelisam said...

I thought I understood what you wanted to do, but now I'm completely lost. I don't think I have the expertise you need after all, so, good luck on your ambitious project.