[Agda] about serialization
Sergei Meshveliani
mechvel at botik.ru
Wed Apr 5 15:35:50 CEST 2017
Ulf, thank you for explanation.
On Wed, 2017-04-05 at 10:43 +0200, Ulf Norell wrote:
> [..]
>
> Note that deserialisation also takes a lot of time, i.e. loading the
> already checked modules is expensive, so your problems are not
> restricted to M.agda.
For the current state of my DoCon-A project, loading modules checked
earlier does not present a bottleneck. Because this small example is
reduced, and real expensive examples need > 12 Gb, and loading takes may
be 5% of the total cost.
> As you noticed from the profiling information almost all time is spent
> on
> serialisation and deserialisation. Given that you mostly have module
> application this likely means that your types are very big. You can
> inspect the types of everything in a module with the command C-c C-o.
> If you first turn on showing implicit arguments (C-c C-x C-h) you
> should be able to spot the problematic types.
>
This is interesting.
For example,
OfCCMonoid-1
has
pairValue : C × ℕ → C
pairValue = uncurry _^_
And I set
f = FtMonoid1.pairValue
to the head module M.
And I command C-c C-x C-h, C-c C-o M.f
to see the type obtained by substitution of the parts of the *upCMon-NZ
parameter into
OfCCMonoid-1.pairValue
This type can be defined as
--------------------------------------------------------
(let C =
.Relation.Binary.DecSetoid.Carrier (.Structures0.UpDSet.decSetoid
(.Structures0.UpMagma.upDSet
(.Structures0a.UpSemigroup.upMagma
(.Structures1.UpMonoid.upSemigroup
(.Structures1.UpCommutativeMonoid.upMonoid
(OverIntegralRing-0.*upCMon-NZ {α} {α=} upIR∣?)))))
)
in
{α α= : .Agda.Primitive.Level}
(upIR∣? : UpIntegralRing-with∣? α α=) →
Data.Product._×_ {α .Agda.Primitive.⊔ α=} {.Agda.Primitive.lzero}
C
.Agda.Builtin.Nat.Nat
→
C
)
-------------------------------------------------------
But in the report the expression of C is printed twice.
How the checker stores it internally?
There are probably more complex items in OfCCMonoid-1, and if things are
really copied this way, this indeed may lead to a great cost.
Regards,
------
Sergei
> / Ulf
>
> On Tue, Apr 4, 2017 at 4:29 PM, Sergei Meshveliani <mechvel at botik.ru>
> wrote:
> On Tue, 2017-04-04 at 12:02 +0200, Ulf Norell wrote:
> > On Sat, Apr 1, 2017 at 2:45 PM, Sergei Meshveliani
> <mechvel at botik.ru>
> > wrote:
> >
> > But anyway I need to understand: what generally the
> type
> > checker does in
> > the examples of this kind.
> > Here is a simplified example.
> >
> > module OfSemigroup is given an arbitrary semigroup
> H as a
> > parameter,
> > and it defines certain items using H.
> >
> > module OfRing is given an arbitrary ring R as a
> parameter.
> > It defines *-semigroup-2 -- a direct product of
> the
> > multiplcative
> > semigroup of R with itself.
> >
> > module M is given an arbitrary ring R as a
> parameter.
> > And its only implementation is
> > where
> > open OfRing R using (*-semigroup-2)
> > module OfMulSemigroup = OfSemigroup
> *-semigroup-2
> >
> > Suppose that M1 and M2 have been type-checked
> earlier.
> > When type-checking M,
> > * what is substituted to where?
> > * what is being normalized?
> > * what part can be expensive?
> >
> >
> > Nothing is normalised and the only type checking that
> happens is
> > to check that the given module arguments match the types
> expected
> > by the module. The arguments are substituted into the types
> of the
> > module definitions, but not the bodies. Nevertheless, this
> may create
> > very big types that take a long time to serialise. What does
> the
> > profile information tell you about where the time is spent?
> My money
> > would be on serialisation.
> >
>
>
> Thank you for explanation.
>
> In the real example, it reports for -M8G and 3 GHz machine:
>
> Total 122,391ms
> Miscellaneous 51,607ms
> UFtRing0 70,784ms
>
> Serialization 49,883ms (61,239ms)
> Deserialization 51,303ms,
>
> where M.agda = UFtRing0.agda,
> the first 3 lines are of profile.modules:10,
> the last 2 lines are of profile:7.
>
> I am trying to understand what is happening there so that
> takes a great
> cost.
> Let us, so far, consider only the cost of Serialization.
>
> For the above M.agda, I have in reality quite a small source:
>
> -------------------------------------------------------------------
> module UFtRing0 where
> open import Data.Product using (_,_)
>
> -- of application-----
> open import Structures3 using (UpIntegralRing;
> UpIntegralRing-with∣?)
> open import Structures4 using (module OverIntegralRing-0)
> open import Structures6 using (module OverIntegralRing-2)
> open import FtMonoid1 using (module OfCCMonoid-1)
> open import FtRing using (module OverIntegralRing-5)
>
> module UniqueFtRing-0 {α α=}
> (upIR∣? : UpIntegralRing-with∣? α α=)
> (open OverIntegralRing-2 upIR∣? using
> (Prime∣split))
> (open OverIntegralRing-5 upIR∣? using
> (FactorizationRing))
> (ftRing : FactorizationRing)
> (prime∣split : Prime∣split)
> where
> open OverIntegralRing-0 upIR∣? using (_∣'?_; *upCMon-NZ;
> *cancel-nz)
>
> module NZPack1 = OfCCMonoid-1 (*upCMon-NZ , _∣'?_)
> *cancel-nz
> ----------------------------------------------------------------------
>
> The last line "module NZPack1 = ..."
> takes 93 % of the type check cost. This is visible when we
> comment out
> this line.
>
> It remains to see what happens when type checking this last
> line.
>
> OfCCMonoid-1 is defined in previous .agda file:
>
> module OfCCMonoid-1 {α α=}
> (upCMon∣? : UpCommutativeMonoid-with∣? α α=)
> (let upCMon = proj₁ upCMon∣?)
> (open UpCommutativeMonoid upCMon using
> (setoid; _∙_))
> (cancel : LCancellation setoid _∙_)
> where ...
>
>
> (1) So, the type checker checks that *upCMon-NZ :
> UpCommutativeMonoid.
> This has been checked earlier, in the module
> OverIntegralRing-0.agda,
> so its Serialization cost here is zero
> Right?
>
> (2) The type checker processes all the signatures in the
> module
> OfCCMonoid-1, with substituting there the parameter
> values.
>
> The main parameter value is *upCMon-NZ,
> (the parameters _∣'?_ and *cancel-nz give a smaller cost,
> also they
> are considered similarly).
>
> *upCMon-NZ is imported from OverIntegralRing-0 as ready.
> This (nested record structure) is the instance of a generic
> structure of
> CommutativeMonoid for the case of the monoid of nonzero
> elements in the
> given IR : IntegralRing, for multiplication in IR. It may be
> a large
> term.
>
> And *upCMon-NZ is substituted into all signatures in
> OfCCMonoid-1.
> C = Carrier is extracted from upCMon∣?; the latter is
> replaced
> according to *upCMon-NZ, so that C is being replaced in
> with
> Nonzero in OfCCMonoid-1.
>
> For example, in
> pairValue : C × ℕ → C
> pairValue = uncurry _^_
>
> the signature is replaced with
> pairValue : Nonzero × ℕ → Nonzero,
>
> and the second line is not touched.
> Right?
>
> The signature
> pairToAsd : C × ℕ → Associated × ℕ
>
> is replaced with
> pairToAsd : Nonzero × ℕ → Associated-NZ × ℕ
>
> Here a notion of Associated is imported by
> open OfAssociated upCMon∣? using (Associated; ...),
>
> upCMon∣? is replaced there (in a previous .agda module) with
> *upCMon-NZ,
> and Associated is replaced there with a certain
> Associated-NZ,
> and this has been done when type checking previous .agda
> modules.
>
> In the definition
>
> record Factorization (a : C) : Set (α ⊔ α=)
> where
> constructor factorization′
>
> field multiset : Multiset
> allPrime : All IsPrime $ msKeys multiset
> productEq : a ~ msValue multiset
>
> Multiset is replaced with certain Multiset-NZ,
> IsPrime with IsPrime-NZ,
> Both these values are checked and ready in previous .agda
> modules.
>
> OfCCMonoid-1 is so that (according to your explanation) only
> the above
> trivial substitutions are done for the values prepared and
> checked in
> previous .agda files.
>
> Only the size of terms being substituted may be large.
> For example, *upCMon-NZ is a structure which contains a
> proof for that
> nonzero elements Nonzero in upIR : IntegralRing form a
> commutative
> monoid for multiplication, and also implementation for several
> operations for Monomial.
>
> These construct and proof are simple conceptually, but take
> about 150 nonempty lines of the source in the module devoted
> to
> substructures (sub-semigroup, sub-monoid, etc).
> I do not know: how the term for this *upCMon-NZ instance is
> stored in
> the memory.
> In the .agdai file, it may take much size (agdai files are
> large).
> But this refers to Deserialization, right?
> And symbolic substitutions with this term refer to
> Serialization.
> Right?
> I do not know how to find:
> which substitution is the most expensive here.
>
> Intuitively, noone can (probably) be expensive for
> serialization.
> Because a thing there is unfolded once in the memory, and then
> many
> references are copied (?).
> But I do not know how Agda performs these substitutions.
> May be,
> profile.definitions:10
> will help?
> The declaration
> module NZPack1 = OfCCMonoid-1 (*upCMon-NZ , _∣'?_)
> *cancel-nz
>
> leads to substituting of the parts of *upCMon-NZ into the
> definitions in
> the module OfCCMonoid-1 which is type-checked in
> previous .agda files.
> How to measure the cost of any of these substitutions?
>
> Thanks,
>
> ------
> Sergei
>
>
>
>
>
More information about the Agda
mailing list