[Agda] Sharing --- Q-combinators

Thu Oct 11 18:38:51 CEST 2012

I tried Q-combinators for the implementation of substitution in Agda. 
Unfortunately, it makes performance worse.  Here are results on the std-lib.

Without Q-comb:
---------------

  124,694,669,672 bytes allocated in the heap
   32,858,769,616 bytes copied during GC
      393,808,808 bytes maximum residency (116 sample(s))
       11,293,480 bytes maximum slop
             1096 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max 
pause
   Gen  0     239318 colls,     0 par   50.15s   50.23s     0.0002s 
0.0058s
   Gen  1       116 colls,     0 par   29.04s   29.11s     0.2509s 
0.8877s

   INIT    time    0.00s  (  0.00s elapsed)
   MUT     time  146.12s  (147.23s elapsed)
   GC      time   79.19s  ( 79.33s elapsed)
   EXIT    time    0.00s  (  0.00s elapsed)
   Total   time  225.32s  (226.56s elapsed)

   %GC     time      35.1%  (35.0% elapsed)

   Alloc rate    853,361,479 bytes per MUT second

   Productivity  64.9% of total user, 64.5% of total elapsed

With Q-combinators"
-------------------

  137,749,742,112 bytes allocated in the heap
   39,262,668,112 bytes copied during GC
      392,507,192 bytes maximum residency (117 sample(s))
       13,044,776 bytes maximum slop
             1061 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max 
pause
   Gen  0     264398 colls,     0 par   57.97s   58.02s     0.0002s 
0.0035s
   Gen  1       117 colls,     0 par   30.96s   31.01s     0.2650s 
0.9263s

   INIT    time    0.00s  (  0.00s elapsed)
   MUT     time  152.62s  (152.81s elapsed)
   GC      time   88.93s  ( 89.03s elapsed)
   EXIT    time    0.00s  (  0.00s elapsed)
   Total   time  241.55s  (241.84s elapsed)

   %GC     time      36.8%  (36.8% elapsed)

   Alloc rate    902,563,883 bytes per MUT second

   Productivity  63.2% of total user, 63.1% of total elapsed

The use of Q-combinator destroys lazyness, since because learning 
whether something is changed by an operation requires me to carry out 
the operation in full.

Also, I do not save on allocation, since basically I still copy the 
whole data structure I am traversing, even though I might discard it in 
the end.

In ML it might yield improvement, since ML is strict anyway.

But for Agda, it might only improve performance if the information that 
"nothing has changed" also allows us to skip some operations completely.

For instance, if we maintain the information about free variables in the 
term structure, we can skip recomputation of the free variables if 
nothing was changed by a substitution.

The question is also how often we get a "nothing changed".  Each de 
Bruijn shifting changes all of a term, substitutions that change nothing 
might be very rare.

I recorded and rolled back my changes.  Only in Agda.Utils.Update you 
can find my implementation of Q-combinators.

Cheers,
Andreas

On 11.10.2012 10:19, Andreas Abel wrote:
> On 11.10.12 10:12 AM, Peter Divianszky wrote:
>> On 11/10/2012 00:11, Andreas Abel wrote:
>>>
>>>
>>> On 10.10.12 10:02 PM, Peter Divianszky wrote:
>>>> On 10/10/2012 17:15, Andreas Abel wrote:
>>>>> On 10.10.2012 16:42, Wolfram Kahl wrote:
>>>
>>>>>> Central idea: Change |f : a -> a| to |f' : a -> Maybe a| returning
>>>>>>
>>>>>>    Nothing     for ``no change''
>>>>>>    Just x      for ``new value |x|''
>>>>>>
>>>>>> Then |qtry f' = f|, but |qtry f'| preserves more sharing, with:
>>>>>>
>>>>>> \begin{code}
>>>>>> type Q a = a -> Maybe a
>>>>>>
>>>>>> qtry :: Q a -> a -> a
>>>>>> qtry f x = maybe x id (f x)
>>>>>> \end{code}
>>>>>>
>>>>>>
>>>>>> I have put an old Haskell implementation of mine up at:
>>>>>>
>>>>>>     http://relmics.mcmaster.ca/~kahl/Haskell/QCombinators.lhs
>>>>>>
>>>>>> (and would be willing to contribute this to Agda if you find it
>>>>>> useful).
>>>>>
>>>>> I'd do this with a writer monad that writes to a "dirty" bit.  If
>>>>> dirty
>>>>> has not been set, we can discard the function result and use the
>>>>> original value.
>>>>>
>>>>> Use of Maybe seems a bit awkward since it is more naturally an error
>>>>> monad...
>>>>>
>>>>> Andreas
>>>>>
>>>>
>>>> These techniques would not be necessary if GHC would compile code which
>>>> returns
>>>>
>>>> r { x = y }
>>>>
>>>> as a reference to r if x and y points to the same memory location.
>>>
>>> Peter, what do you mean by this syntax "r { x = y }"? Looks like a
>>> record update to me, but then you should have said "if (x r) and y point
>>> to the same...location".
>>>
>>
>> Yes, you are right.
>> Given this optimization, lots of sharing would be saved and the
>> programmer should not use any trick just use more record updates.
>> Maybe one should ask for this at the GHC mailing list.
>
> Depending on how record updates are implemented, this could be easy or
> hard to change in GHC.  Looking at the emitted core should give a hint...
>
> If you ask at the GHC list, please CC me.
>
> Andreas
>

-- 
Andreas Abel  <><      Du bist der geliebte Mensch.

Theoretical Computer Science, University of Munich
Oettingenstr. 67, D-80538 Munich, GERMANY

andreas.abel at ifi.lmu.de
http://www2.tcs.ifi.lmu.de/~abel/