[Agda] Nat arithmetic performance
Ulf Norell
ulf.norell at gmail.com
Sat Sep 24 16:36:18 CEST 2016
On Sat, Sep 24, 2016 at 12:52 PM, Sergei Meshveliani <mechvel at botik.ru>
wrote:
> People,
>
> Please, find attached a certain test for performance of arithmetic in
> Agda, as compared to ghc-7.10.2.
>
> Shortly saying, it is
> sum [x*x - y*y | x, y <- [n .. 0]],
> n = 1000, ..., 6000, ...
>
> The Hasell program is also there.
>
> {-
> --------------------------------------------------------------------
> Timing for Agda of September 16, 2016, ghc-7.10.2, 3 GHz computer
>
> n | s (control) | time [sec]
> | | Agda ghc-7.10.2 -O
> --------------------------------------------------
> 1000 | 166499833500 | 0.3 | 0.04
> 2000 | 2665332667000 | 1.3 | 0.2
> 3000 | 13495498500500 | 3.0 | 0.4
> 4000 | 42655997334000 | 5.1 | 0.8
> 5000 | 104145829167500 | 8.0 | 1.6
> 6000 | 215963994001000 | 11.9 | 2.5
> -}
>
> The cost order of n^2 is as needed.
> Bug GHC is somewhat 5 times faster.
> I think, this factor of 5 is not so important.
> But still, what may be its source?
>
You get a clue that something is wrong if you run with +RTS -s:
$ ./NatArithPerformance +RTS -s
n = 5000
s = 104145829167500
8,616,354,344 bytes allocated in the heap
6,531,203,072 bytes copied during GC
1,593,171,592 bytes maximum residency (13 sample(s))
4,362,336 bytes maximum slop
2772 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max
pause
Gen 0 15734 colls, 0 par 3.300s 3.427s 0.0002s
0.0016s
Gen 1 13 colls, 0 par 2.329s 3.367s 0.2590s
1.5709s
INIT time 0.000s ( 0.003s elapsed)
MUT time 3.664s ( 3.460s elapsed)
GC time 5.629s ( 6.794s elapsed)
EXIT time 0.029s ( 0.275s elapsed)
Total time 9.324s ( 10.531s elapsed)
%GC time 60.4% (64.5% elapsed)
Alloc rate 2,351,612,899 bytes per MUT second
Productivity 39.6% of total user, 35.1% of total elapsed
The key number here is total memory in use: 2772MB. You'll find that the
Haskell program
runs in constant space. In this case your problem is sum:
sum : List ℕ → ℕ
sum [] = 0
sum (n ∷ ns) = n + (sum ns)
For long lists this will build up a huge thunk of +s in memory before it
can start evaluating. If you change it to
open import Agda.Builtin.Strict
foldl' : ∀ {a b} {A : Set a} {B : Set b} → (B → A → B) → B → List A → B
foldl' f z [] = z
foldl' f z (x ∷ xs) = primForce (f z x) λ z' → foldl' f z' xs
sum : List ℕ → ℕ
sum = foldl' _+_ 0
we get instead
n = 5000
s = 104145829167500
8,201,151,016 bytes allocated in the heap
3,231,768 bytes copied during GC
195,104 bytes maximum residency (2 sample(s))
48,648 bytes maximum slop
2 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max
pause
Gen 0 15848 colls, 0 par 0.039s 0.049s 0.0000s
0.0009s
Gen 1 2 colls, 0 par 0.001s 0.010s 0.0049s
0.0096s
INIT time 0.000s ( 0.002s elapsed)
MUT time 3.335s ( 3.393s elapsed)
GC time 0.039s ( 0.059s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 3.376s ( 3.454s elapsed)
%GC time 1.2% (1.7% elapsed)
Alloc rate 2,459,180,636 bytes per MUT second
Productivity 98.8% of total user, 96.6% of total elapsed
On my machine that's within 25% of the corresponding Haskell program (which
runs in 2.7s).
/ Ulf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.chalmers.se/pipermail/agda/attachments/20160924/78b85cca/attachment.html
More information about the Agda
mailing list