[Agda] about serialization
Sergei Meshveliani
mechvel at botik.ru
Tue Apr 4 16:29:37 CEST 2017
On Tue, 2017-04-04 at 12:02 +0200, Ulf Norell wrote:
> On Sat, Apr 1, 2017 at 2:45 PM, Sergei Meshveliani <mechvel at botik.ru>
> wrote:
>
> But anyway I need to understand: what generally the type
> checker does in
> the examples of this kind.
> Here is a simplified example.
>
> module OfSemigroup is given an arbitrary semigroup H as a
> parameter,
> and it defines certain items using H.
>
> module OfRing is given an arbitrary ring R as a parameter.
> It defines *-semigroup-2 -- a direct product of the
> multiplcative
> semigroup of R with itself.
>
> module M is given an arbitrary ring R as a parameter.
> And its only implementation is
> where
> open OfRing R using (*-semigroup-2)
> module OfMulSemigroup = OfSemigroup *-semigroup-2
>
> Suppose that M1 and M2 have been type-checked earlier.
> When type-checking M,
> * what is substituted to where?
> * what is being normalized?
> * what part can be expensive?
>
>
> Nothing is normalised and the only type checking that happens is
> to check that the given module arguments match the types expected
> by the module. The arguments are substituted into the types of the
> module definitions, but not the bodies. Nevertheless, this may create
> very big types that take a long time to serialise. What does the
> profile information tell you about where the time is spent? My money
> would be on serialisation.
>
Thank you for explanation.
In the real example, it reports for -M8G and 3 GHz machine:
Total 122,391ms
Miscellaneous 51,607ms
UFtRing0 70,784ms
Serialization 49,883ms (61,239ms)
Deserialization 51,303ms,
where M.agda = UFtRing0.agda,
the first 3 lines are of profile.modules:10,
the last 2 lines are of profile:7.
I am trying to understand what is happening there so that takes a great
cost.
Let us, so far, consider only the cost of Serialization.
For the above M.agda, I have in reality quite a small source:
-------------------------------------------------------------------
module UFtRing0 where
open import Data.Product using (_,_)
-- of application-----
open import Structures3 using (UpIntegralRing; UpIntegralRing-with∣?)
open import Structures4 using (module OverIntegralRing-0)
open import Structures6 using (module OverIntegralRing-2)
open import FtMonoid1 using (module OfCCMonoid-1)
open import FtRing using (module OverIntegralRing-5)
module UniqueFtRing-0 {α α=}
(upIR∣? : UpIntegralRing-with∣? α α=)
(open OverIntegralRing-2 upIR∣? using (Prime∣split))
(open OverIntegralRing-5 upIR∣? using (FactorizationRing))
(ftRing : FactorizationRing)
(prime∣split : Prime∣split)
where
open OverIntegralRing-0 upIR∣? using (_∣'?_; *upCMon-NZ; *cancel-nz)
module NZPack1 = OfCCMonoid-1 (*upCMon-NZ , _∣'?_) *cancel-nz
----------------------------------------------------------------------
The last line "module NZPack1 = ..."
takes 93 % of the type check cost. This is visible when we comment out
this line.
It remains to see what happens when type checking this last line.
OfCCMonoid-1 is defined in previous .agda file:
module OfCCMonoid-1 {α α=}
(upCMon∣? : UpCommutativeMonoid-with∣? α α=)
(let upCMon = proj₁ upCMon∣?)
(open UpCommutativeMonoid upCMon using (setoid; _∙_))
(cancel : LCancellation setoid _∙_)
where ...
(1) So, the type checker checks that *upCMon-NZ : UpCommutativeMonoid.
This has been checked earlier, in the module OverIntegralRing-0.agda,
so its Serialization cost here is zero
Right?
(2) The type checker processes all the signatures in the module
OfCCMonoid-1, with substituting there the parameter values.
The main parameter value is *upCMon-NZ,
(the parameters _∣'?_ and *cancel-nz give a smaller cost, also they
are considered similarly).
*upCMon-NZ is imported from OverIntegralRing-0 as ready.
This (nested record structure) is the instance of a generic structure of
CommutativeMonoid for the case of the monoid of nonzero elements in the
given IR : IntegralRing, for multiplication in IR. It may be a large
term.
And *upCMon-NZ is substituted into all signatures in OfCCMonoid-1.
C = Carrier is extracted from upCMon∣?; the latter is replaced
according to *upCMon-NZ, so that C is being replaced in with
Nonzero in OfCCMonoid-1.
For example, in
pairValue : C × ℕ → C
pairValue = uncurry _^_
the signature is replaced with
pairValue : Nonzero × ℕ → Nonzero,
and the second line is not touched.
Right?
The signature
pairToAsd : C × ℕ → Associated × ℕ
is replaced with
pairToAsd : Nonzero × ℕ → Associated-NZ × ℕ
Here a notion of Associated is imported by
open OfAssociated upCMon∣? using (Associated; ...),
upCMon∣? is replaced there (in a previous .agda module) with *upCMon-NZ,
and Associated is replaced there with a certain Associated-NZ,
and this has been done when type checking previous .agda modules.
In the definition
record Factorization (a : C) : Set (α ⊔ α=)
where
constructor factorization′
field multiset : Multiset
allPrime : All IsPrime $ msKeys multiset
productEq : a ~ msValue multiset
Multiset is replaced with certain Multiset-NZ,
IsPrime with IsPrime-NZ,
Both these values are checked and ready in previous .agda modules.
OfCCMonoid-1 is so that (according to your explanation) only the above
trivial substitutions are done for the values prepared and checked in
previous .agda files.
Only the size of terms being substituted may be large.
For example, *upCMon-NZ is a structure which contains a proof for that
nonzero elements Nonzero in upIR : IntegralRing form a commutative
monoid for multiplication, and also implementation for several
operations for Monomial.
These construct and proof are simple conceptually, but take
about 150 nonempty lines of the source in the module devoted to
substructures (sub-semigroup, sub-monoid, etc).
I do not know: how the term for this *upCMon-NZ instance is stored in
the memory.
In the .agdai file, it may take much size (agdai files are large).
But this refers to Deserialization, right?
And symbolic substitutions with this term refer to Serialization.
Right?
I do not know how to find:
which substitution is the most expensive here.
Intuitively, noone can (probably) be expensive for serialization.
Because a thing there is unfolded once in the memory, and then many
references are copied (?).
But I do not know how Agda performs these substitutions.
May be,
profile.definitions:10
will help?
The declaration
module NZPack1 = OfCCMonoid-1 (*upCMon-NZ , _∣'?_) *cancel-nz
leads to substituting of the parts of *upCMon-NZ into the definitions in
the module OfCCMonoid-1 which is type-checked in previous .agda files.
How to measure the cost of any of these substitutions?
Thanks,
------
Sergei
More information about the Agda
mailing list