[Agda] Re: [Coq-Club] Re: Agda beats Coq (was: Termination proof in partiality monad)
Dan Doel
dan.doel at gmail.com
Sun Nov 23 04:25:51 CET 2008
On Saturday 22 November 2008 1:23:52 pm Aaron Bohannon wrote:
> I have been trying to work with coinductive data and definitions
> recently and have found that Coq has been very helpful in developing
> intuition about this sort of thing. So I found your remarks below
> very interesting. Specifically, you seem to imply that a
> CoInductive/codata definition in Coq/Agda permits some operations that
> an abstract type with an "observe" function would not permit. Can you
> (or anyone else) be more specific about what these would be?
(It should be noted that I have experience only with Agda's codata, so I'm
simply assuming that Coq's CoInductive is about the same.)
I wouldn't say that one permits operations that the other doesn't. I'd rather
say that the intuitions promoted by one are better or worse for helping write
correct proofs/reasoning about your programs. Specifically, the codata syntax leads one (me at
least) to think that various expressions involving codata should behave like
their syntactically-almost-identical analogues with data/Inductive, *but they
don't*. Perhaps the biggest problem is using the codata coconstructors (for
lack of a better term) as if they were data constructors.
Consider the type of natural numbers:
data Nat : Set where
  zero : Nat
  suc  : Nat -> Nat
And suppose we have:
i, j, k : Nat
j = suc i
k = suc i
And we want to know if j == k. Well, this is easy, because we see that j and k
were both built using i and the suc constructor. That's how inductive data
works, two values are the same if they were *built* the same way.
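For instance (a sketch, assuming the standard library's propositional equality _≡_ and its constructor refl), the proof is immediate, because both expressions are literally the same application of suc:

  j≡k : j ≡ k
  j≡k = refl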
So now we turn to the codata analogue:
codata Conat : Set where
  cozero : Conat
  cosuc  : Conat -> Conat
And we consider:
i, j : Conat
i ~ cosuc i -- ~ is used in Agda instead of = for corecursive definitions
j ~ cosuc i
And we want to know if i and j are the same. Now, the syntax here might lead
one (me certainly) to think that this is just trivial. i and j are *built* the
same way, so they must be the same, just like inductive data. However, that's
*not* how codata works. Two coinductive values are the same if *they produce
the same series of observations*. So, proving i and j are the same is no
longer trivial: you have to exhibit a bisimulation relating them and argue by
coinduction, as in the sketch below.
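To make that concrete, here is a sketch of such a proof in the same old-style codata syntax, assuming a coinductively defined bisimilarity relation (the relation and its constructor names are mine):

  codata _≈_ : Conat -> Conat -> Set where
    ≈zero : cozero ≈ cozero
    ≈suc  : {m n : Conat} -> m ≈ n -> cosuc m ≈ cosuc n

  i≈i : i ≈ i
  i≈i ~ ≈suc i≈i   -- i unfolds to cosuc i on both sides; corecursively relate i to i

  i≈j : i ≈ j
  i≈j ~ ≈suc i≈i   -- i and j both unfold to cosuc i; it remains to relate i to i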
If we look at the categorical semantics of the two types, we may figure out
where things are going wrong. Inductive types can be given initial algebra
semantics, where Nat above goes something like:
N X = 1 + X is a functor
(A, a : N A -> A) is an algebra for that functor
Nat is an initial object in the category of such algebras, which means that
for any other algebra that fits the above form, there is a unique morphism
from the Nat algebra to it. This looks like (using Agda for notation):
cataNat : {A : Set} -> A -> (A -> A) -> Nat -> A
And "cataNat" should be a pretty familiar function:
cataNat z s zero = z
cataNat z s (suc n) = s (cataNat z s n)
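Just to see it in action, familiar recursive functions fall out of cataNat; here is a sketch of addition (my example, not part of the semantics):

  plus : Nat -> Nat -> Nat
  plus m n = cataNat n suc m  -- replace m's zero by n, and each suc by suc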
So, just to be explicit, using inductive constructors to build inductive
values corresponds to applying the morphism of the algebra, and pattern
matching corresponds to using the unique morphism whose existence was proven
by initiality. Now, consider coinductive types, which are given by terminal
coalgebra semantics:
C X = 1 + X is a functor
(A, a : A -> C A) is a coalgebra for that functor
Conat is a terminal object in the category of such coalgebras, which means
that for any other coalgebra that fits the above form, there is a unique
morphism from it to Conat. This looks like:
anaConat : {A : Set} -> (A -> 1 + A) -> A -> Conat
Its definition looks like:
anaConat f a with f a
... | inl () ~ cozero
... | inr a' ~ cosuc (anaConat f a')
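To see anaConat in action, here are two sketches of my own (in the same informal 1 + notation): an embedding of Nat into Conat, and the wholly infinite conat, whose seed never runs out:

  fromNat : Nat -> Conat
  fromNat = anaConat step
    where
    step : Nat -> 1 + Nat
    step zero    = inl ()  -- no more observations to produce
    step (suc n) = inr n   -- one successor observation, then continue from n

  infinity : Conat
  infinity = anaConat inr zero  -- every unfolding step observes a successor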
However, the way this corresponds to the categorical semantics differs subtly
from the inductive case. While in the
inductive case, application of the algebra morphism corresponded to building a
value with a constructor, in the coinductive case, *it corresponds to
destructing a coinductive value and looking at it*. In other words, pattern
matching. In short, one might say that for inductive data, building is simple,
and destroying is complex, whereas the opposite is true for coinductive
codata.
And I think that the coinductive syntax is misleading on exactly this front.
When you see:
j ~ cosuc i
You may think, "j is built from an i and a cosuc constructor", but *this is
not the case*. Rather, the "cosuc" used as a constructor is better thought of
as something like:
cosuc : Conat -> Conat
cosuc n = anaConat f (Just n)

f : Maybe Conat -> 1 + Maybe Conat
f (Just n) with observe n
... | inl () = inr Nothing    -- n is exhausted: emit the one extra successor
... | inr n' = inr (Just n')  -- pass along one of n's successor observations
f Nothing = inl ()            -- the extra successor has been emitted: stop
Which has the net effect of producing one additional 'inr' observation beyond
whatever the original Conat produced. Of course, this is probably overly
elaborate, but the point is that the value in question cannot simply be
thought of as the previous value with another constructor box wrapped around
it.
(I realize this is probably confusing, since I just defined anaConat in terms
of cozero and cosuc. It might help to look at Anton Setzer's proposed
coalgebraic syntax for corecursive functions, where instead of working with
pseudo-constructors like cosuc, you simply describe the observations you see.
Then you can define the pseudo-constructors and corecursion in terms of this
primitive:
cozero * = inl ()
(cosuc n) * = inr n
(anaConat f a) * = bimap id (anaConat f) (f a)
where:
bimap : {a b c d : Set} -> (a -> c) -> (b -> d) -> a + b -> c + d
bimap f _ (inl a) = inl (f a)
bimap _ g (inr b) = inr (g b)
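In that syntax, corecursive values are written directly by their observations; the infinite conat from a moment ago, for instance, becomes a one-liner (a sketch, in the proposed rather than any implemented syntax):

  infinity : Conat
  infinity * = inr infinity  -- observing infinity always yields a successor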
Hopefully that's clear.)
And, of course, this explains why indexing with specialized codata
'constructors' in a family doesn't seem to work very well, because:
data Foo : Conat -> Set where
  bar : {c : Conat} -> Foo c -> Foo (cosuc c)
Is not simply working with constructors. It'd be akin to writing something
like:
data Baz : Nat -> Set where
  blorf : {n : Nat} -> Baz n -> Baz (quux n)
Where quux is some arbitrary function. Except the situation is probably even
worse, because in the inductive case normalization of quux n probably helps
(I'm not exactly sure what the dual situation with inductive data is. Perhaps
Baz (quux n) -> Baz n, where you'd have to compute the inverse of quux, or
otherwise prove something onerous merely to construct a value), whereas you
can't simply turn a coinductive value into some normal form consisting of only
constructors (since it's potentially infinite).
So, to sum up, it seems to me that codata/CoInductive encourages one to think
about coinductive values in the same way as inductive values, even though that
leads to mistakes. The coalgebraic formulation with explicit observation (or
not; Anton Setzer explained how to think of the current syntax in terms of
his, and Nils Anders Danielsson showed here earlier how you can use pattern
matching on coinductive values in lieu of a destruction operation, since
pattern matching is the codata operation that corresponds to the coalgebra
morphism) is closer to the actual semantics, and seems to stop these think-os
from happening (rather than fixing up an inductive-like use with some
explicitly placed evaluation, like you see in some of the original proofs for
the partiality monad).
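To illustrate that last parenthetical, the "observe" destructor I used above never got a definition; as a sketch (the name is mine), it is exactly pattern matching read as the coalgebra morphism:

  observe : Conat -> 1 + Conat
  observe cozero    = inl ()  -- no observations left
  observe (cosuc n) = inr n   -- one successor observation, with n remaining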
Anyhow, I think that's about the end of my rambling. Hopefully that was useful
to you or someone else, but even if not, it helped me to get my thoughts on
the situation in order. As a caveat, I should note that the categorical
semantics above are only good for inductive and coinductive types, not
inductive and coinductive families and such that you'll find in Coq and Agda.
The relevant literature for the latter is a bit above my head at the moment.
Nevertheless, I think it's good enough to elucidate the source of the problems
people are having currently.
Also, if anything I said above was incorrect, I hope someone will correct me.
:)
Cheers,
-- Dan