[Agda] result code performance

Sergei Meshveliani mechvel at botik.ru
Tue Sep 20 15:02:38 CEST 2016


Hi,

I have tested Development Agda of September 16, 2016
on my DoCon-A library 0.04.
Everything works. 

It remains to test the minimal memory expense for type-checking
(it is desirable to fit into 7 Gb).

Also I have for the first time directly compared the performance of the
result code for a real algebraic example to the performance of a similar
program written in Haskell and compiled by ghc-7.10.2.

{- Timing for `forPerformance'.   
   Agda of September 16, 2016, ghc-7.10,2, 3 GHz computer.

forPerformance  processes b^5 quintuples.

b  |       time [sec]
   |  Agda    |  DoCon-2.12.1 under ghc-7.10.2 
--------------|-------------------------------                          
10 |   1.2    |  0.16
12 |   3.0    |  0.3
14 |   7.3    |  1.0
16 |  14.8    |  2.0
18 |  27.5    |  3.0
20 |  48.3    |  5.0
-}

This is for arithmetic of  +, *, and partial division
in the domain  Z/(b)  of Integer modulo b,
b = 10, 12 ... 20.

Compared are  DoCon-A (in Agda, with all proofs) 
and  
DoCon-2.12.1 written in Haskell and compiled under ghc-7.10.2 under -O.

DoCon-2.12.1 uses the same algorithms, only without proofs.

Both methods are as follows.

1) The arithmetic in the residue domain Z/(m) is implemented for a
general EuclideanRing E, and Integer is supplied with the instance of
EuclideanRing.

2) Partial division in E/(b) is implemented via the extended GCD
algorithm.

3) For all quintuples (x , y , z , u , v)  over Z/(b)
   (b ^ 5 quintuples),
   the test program finds the quotient   

       rDecDiv ((x * x) + ((y * y) + (z * z))) (u + v).

   Then, all q in the results of kind  (yes q)  are summed up.
   Summing a list is by  foldr _+_. 

4) Arithmetic in E/(b) is so that all intermediate results are reduced
modulo b. So that there does not appear elements with norms greater than
|b|^2.

It is chosen  b < 21,  because DoCon-A uses unary coding.

The above benchmark shows that the cost order is the same: 
from b = 12 to b = 20 the time grows about 16 times.

And GHC is faster in a constant ratio of about 10. 
I think this is due to the difference between unary coding arithmetic in
Agda's Standard library and n-ary one of GHC.
For example,  12 * 12  costs 12 additions in DoCon-A and, probably, 3-4
machine instructions in GHC
(do I understand right of what is happening there ?).

Regards,

------
Sergei




More information about the Agda mailing list