[Agda] Does Agda2 have fast arrays?

Alan Jeffrey ajeffrey at bell-labs.com
Mon Oct 4 23:53:19 CEST 2010


On 10/04/2010 12:15 PM, Daniel Peebles wrote:
> Things like stream fusion, that vector (and lots of high-performance
> Haskell) relies on, rely on rewrite rules firing, which depend on the
> syntactic structure of terms. So throwing unsafeCoerces in, in the wrong
> places, will prevent the rules from firing, and would thus prevent
> deforestation.

Fair enough, Haskell's rules mechanism is just based on syntactic 
pattern-matching, so anything that introduces additional syntax is going 
to block the rewrites.  For some applications (e.g. the kinds of 
high-performance computing way back in the original post) this will be a 
deal-killer.

I've been trying to replicate this experimentally, just to see how bad 
the problem is, and what can be done about it, but I've hit a newbie 
roadblock.

A simple example is the one in the Stream Fusion paper:

module Main ( main ) where

import Prelude ( Int, (<=), (*), putStr, show )
import Data.Stream ( Stream, sum, concatMap, map, unfoldr,
   enumFromToInt )

f :: Int -> Int
f n = sum (concatMap (\ k -> map (\m -> k * m) (enumFromToInt 1 k))
   (enumFromToInt 1 n))

main = putStr (show (f 10000))

Problem is I can't get this to compile to code that's just in registers. 
  The output of -O3 -ddump-simpl includes:

Main.main_$s$wloop_sum1 =
   \ (sc_sDe :: Data.Stream.L GHC.Types.Int)
     (sc1_sDf :: Data.Stream.L GHC.Types.Int
                 -> Data.Stream.Step GHC.Types.Int (Data.Stream.L 
GHC.Types.Int))
     (sc2_sDg :: Data.Stream.Unlifted (Data.Stream.L GHC.Types.Int))
     (sc3_sDh :: GHC.Prim.Int#)
     (sc4_sDi :: GHC.Prim.Int#) ->
     case sc1_sDf sc_sDe of _ {
       Data.Stream.Yield x_art t'_aru ->
         case x_art of _ { GHC.Types.I# y_azj ->
         Main.main_$s$wloop_sum1
           t'_aru sc1_sDf sc2_sDg sc3_sDh (GHC.Prim.+# sc4_sDi y_azj)
         };
       Data.Stream.Skip t'_arx ->
         Main.main_$s$wloop_sum1 t'_arx sc1_sDf sc2_sDg sc3_sDh sc4_sDi;
       Data.Stream.Done -> Main.main_$s$wloop_sum sc3_sDh sc4_sDi
     }
end Rec }

which shows some unfused Step objects, and sure enough:

   Generation 0:  6976 collections,     0 parallel,  0.22s,  0.22s elapsed
   Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

Are there some compiler flags I'm missing?

A.

>
> On Mon, Oct 4, 2010 at 7:11 PM, Alan Jeffrey <ajeffrey at bell-labs.com
> <mailto:ajeffrey at bell-labs.com>> wrote:
>
>     On 10/04/2010 11:52 AM, Don Stewart wrote:
>
>         ajeffrey:
>
>             On 10/03/2010 12:59 AM, Don Stewart wrote:
>
>                 What can we do about the coercions? I suspect they make
>                 GHC sad.
>
>
>             Sorry, I'm being dense here -- what's the runtime penalty for
>             unsafeCoerce?  I'd expect that most compilers implement it
>             as a no-op.
>
>
>         It's a no-op, that generates a constraint type:
>
>              ActualType ~ CoerceType
>
>         And doing so creates a bit of an optimization boundary.
>
>             There may be cases where ghc's optimizer can't do as good a
>             job because
>             of unsafeCoerce, due to lack of type information, but I
>             doubt that would
>             have much of an impact in practice.
>
>
>     OK, so we're agreeing on what the problem is -- it's coercions
>     causing problems for the ghc optimizer, where we're differing is how
>     much impact it has in practice.  Do you have an example in mind of a
>     (preferably simple!) program where ghc relies heavily on type
>     information for optimzing?
>
>
>             This is a deliberately (honest!) not-very-clever sorting
>             algorithm which
>             generates lots of cons cells.  I ran it with 2G heap to
>             minimize the
>             impact of the GC (about 13% GC time).
>
>
>         But we'd like to use arrays (e.g.
>         http://hackage.haskell.org/package/vector) not lists, right?
>
>
>     Sigh, this is the problem with benchmarks, the response is always
>     "you picked the wrong example" :-)  I'll try reimplementing it using
>     vectors and see what happens.
>
>
>         I'm mostly interested in embeddings that take advantage of fast
>         "backends" from Hackage.
>
>
>     It sounds like we're looking for a particular class of backend
>     examples, though.  If the backend spends most of its time just
>     running Haskell code, with just a thin Agda coordination layer, then
>     Agda efficiency won't have much impact.  So we're worried about
>     cases where there's a lot of callbacks from the Haskell layer back
>     to the Agda layer?
>
>     A.
>
>     _______________________________________________
>     Agda mailing list
>     Agda at lists.chalmers.se <mailto:Agda at lists.chalmers.se>
>     https://lists.chalmers.se/mailman/listinfo/agda
>
>



More information about the Agda mailing list