[Agda] Termination and efficiency of higher order functions over nested datatypes

Wed Sep 28 04:34:37 CEST 2011

Hi everyone,

Something that's been niggling me for a while now is trying to prove 
termination of higher-order functions over nested datatypes in Agda. The 
simplest example is unranked trees:

  data Tree (A : Set) : Set where
    _∷_ : A → List (Tree A) → Tree A

which come with their mapping function:

  mapt : ∀ {A B} → (A → B) → Tree A → Tree B
  mapt f (a ∷ ts) = f a ∷ map (mapt f) ts

But this fails the termination checker, and instead you have to write:

  mutual

   mapt : ∀ {A B} → (A → B) → Tree A → Tree B
   mapt f (a ∷ ts) = f a ∷ mapts f ts

   mapts : ∀ {A B} → (A → B) → List (Tree A) → List (Tree B)
   mapts f [] = []
   mapts f (t ∷ ts) = mapt f t ∷ mapts f ts

In the past, this has just been a problem with code bloat, but now I'm 
hitting a nasty efficiency issue. In JavaScript, the native map function 
on arrays is much more efficient than anything you can write by hand 
(since it's in the ECMAScript 1.6 language spec, so is implemented in 
native code). So the definition of mapt in terms of map isn't just 
simpler, it's quite a bit faster too. (This happens in GHC too, due to 
the presence of fusions for map, fold, etc.)

Now, I could program round this by a two-pass algorithm which first 
finds the depth of the tree, then proceed by induction on depth, but my 
word that's a lot of work, and it turns a one-pass algorithm into a 
two-pass algorithm (and so can have a horrible impact on space usage of 
lazy data structures).

In an ideal world, there'd be some way to persuade the termination 
checker that map is a non-increasing function. Is such a thing feasible?

In a less ideal world, we could allow pragmas for optimizations, in this 
case something like:

   {-# REWRITE mapts (λ f → map (mapt f)) #-}

Of course such rewrites may be semantics-destroying. We could tame this 
by requiring proof of equivalence between the lhs and rhs, something like:

   mapt-rewr : ∀ {A B} f ts → (mapts {A} {B} f ts ≡ map (mapt f) ts)
   mapt-rewr = ...
   {-# REWRITE mapt-rewr #-}

This is much better behaved: it could introduce divergence, but 
otherwise (modulo postulates) should preserve semantics.

Currently, I'm reduced to writing raw JavaScript, which is not much fun.

Any thoughts?

A.