[Agda-dev] agda: <<loop>>

Wolfram Kahl kahl at cas.mcmaster.ca
Thu Mar 12 23:41:50 CET 2015


Yesterday morning, Ulf declared Issue 1396 as fixed

   https://code.google.com/p/agda/issues/detail?id=1396

--- thanks a lot, Ulf!

That motivated me to get a current development version,
and try it on my developments.

(Before I actually get to try it on my developments,
 it appears that I need to find out exactly how
 ``the treatment of operators has changed'',
 and probably switch to the now steroids,
 but that is not what I am writing about now.)


For a long time, I have routinely linked Agda with -threaded,
and used -N5 to get decent speed-ups on garbage collection
on my six-core machines --- for the standard library,
it still gave about 5 seconds speedup on the four-minute run,
but for my larger developments, it was even more noticeable.

This has now changed quite drastically:
Checking the standard library with -N1 still takes close to four minutes:

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time  164.94s  (165.21s elapsed)
  GC      time   62.08s  ( 62.17s elapsed)
  EXIT    time    0.05s  (  0.05s elapsed)
  Total   time  227.10s  (227.45s elapsed)

  Alloc rate    1,136,580,628 bytes per MUT second

  Productivity  72.7% of total user, 72.5% of total elapsed

real    3m47.536s
user    3m46.050s
sys     0m1.120s

But adding cores makes it take more than twice as long:

-N2:

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time  459.35s  (459.98s elapsed)
  GC      time  113.89s  ( 61.40s elapsed)
  EXIT    time    0.05s  (  0.05s elapsed)
  Total   time  573.31s  (521.46s elapsed)

  Alloc rate    408,122,565 bytes per MUT second

  Productivity  80.1% of total user, 88.1% of total elapsed

real    8m41.544s
user    9m6.380s
sys     0m27.010s

---------------

  -N4:

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time  455.09s  (455.01s elapsed)
  GC      time  207.99s  ( 58.24s elapsed)
  EXIT    time    0.06s  (  0.06s elapsed)
  Total   time  663.16s  (513.33s elapsed)

  Alloc rate    411,948,428 bytes per MUT second

  Productivity  68.6% of total user, 88.7% of total elapsed

real    8m33.420s
user    9m38.720s
sys     1m24.530s



This looks a bit worse than the worst examples I have seen
in my experiments with parallelising interaction nets
(
  http://www.cas.mcmaster.ca/~kahl/Haskell/HINet/
  http://eptcs.web.cse.unsw.edu.au/paper.cgi?DCM2014:8
),
which makes me curious what is going on there.

I also tried adding

 -feager-blackholing

to the ghc-options in a different install ---
if actual parallelisation is happening,
this should reduce duplication of computation,
and is recommended for things that are intended to run in parallel.
(I haven't seen speed-ups with that yet,
 but I have seen a <<loop>> reported more precisely
 in my interaction net experiments.)
With Agda compiled with -feager-blackholing,
while checking the standard library with -N1, I get:

     Finished Data.Bool.Base.
 50374120   8877328   9098816  0.03  0.03    0.75    1.08    0    0  (Gen:  1)
agda: <<loop>>


Now this makes me really curious what is going on there!

(Some lack of laziness in the sharing implementation?)

Since I have not been able to closely follow Agda development recently:
What is the general context of current developments for these effects?



Wolfram




More information about the Agda-dev mailing list