[Agda] IO, semi-closed handles, and problems with extensional equality

Fri Sep 24 05:07:57 CEST 2010

Here's my attempt to say the same thing again with less braindumpiness...

   GOAL
   Allow stream transforms to be written in pure (ie non-IO) Agda,
   which use space proportional to their output.

As a example, here's a program which prints the number of bytes on its 
input, using the Bytestring.Lazy library:

   hGetContents >>= \lambda bs ->
     putStrLn (show (length bs))

which in Haskell runs in constant space due to the wonder of gc and lazy 
IO.  Now, we could get the same space usage using strict IO, but we'd 
need to rewrite the length function to have type:

   Handle -> (IO Int)

and moreover its implementation would be pretty painful.

I've been writing up a little library which allows this, which is based 
on commit points: the idea is that you can perform IO which has no 
visible side-effect (e.g. read) anytime, and you can schedule any IO to 
take place at a commit point, using an easy-to-implement (er, ignoring 
threading issues for the moment) API:

   onCommit : (IO Unit) -> (IO Unit)
   commit : IO Unit

Then we can implement safe versions of hGetContents and hPutContents as:

   hGetContents hdl =
     hGetContentsUNSAFE hdl >>= \lambda bs ->
       onCommit (force bs) >>
         return bs

   hPutContents hdl bs =
     force bs >> commit >> hPutStrUNSAFE hdl bs

assuming we can implement a function force : Bytestring -> IO Unit which 
forces its argument to be fully evaluated.  The function is almost just:

   force bs = return (deepSeq bs unit)

Unfortunately, this doesn't meet the goal, as this function keeps bs 
alive, and stops it from being garbage collected, which rather defeats 
the purpose of lazy IO!  I'm hoping there's a way round this, preferably 
not involving weak pointers and finalizers.

If I can get this up and running, then adding a little bit of dependency 
to ensure that IO is being used safely should be quite routine.  Hooray, 
Agda IO which is comparable in space usage to Haskell IO, but has the 
same semantics as strict IO.

A.

On 09/22/2010 05:44 PM, Jeffrey, Alan S A (Alan) wrote:
> I've been trying to work out if there's a way for us to have our strict
> semantics, but still eat our lazy IO cake.  Horrible gorey IO hacking
> follows...
>
> How about if the primitive for input in Agda IO (AIO) is:
>
>     aGetMatch : {Q : Set} ->  (DFA Q) ->  AHandle ->  (AIO String)
>
> where DFA Q is the set of DFAs with state set Q:
>
>     data DFA (Q : Set) : Set where
>       dfa : (Q ->  Bool) ->  (Q ->  Char ->  Q) ->  (DFA Q)
>
> from which we define the language of a DFA as per usual, and the
> semantics of aGetMatch is that it strictly returns the smallest input
> string in the language of its DFA.  From this, aGetLn and aGetContents
> are easy to define.  Moreover, there's a simple strict implementation of
> aGetMatch in terms of hGetChar.
>
> Now the cleverness: we implement a semi-closed AHandle as an HHandle
> together with a !(Ref (Maybe String)).  When a semi-closed AHandle is
> created, we initialise the ref with hGetContents hdl, that is it
> contains a lazy string.  When aGetMatch is called, we dereference the
> ref, lazily split the string, return the front, and set the ref to be
> the tail.  [Er, with a bit of thought to concurrency and making the
> read/write atomic.]
>
> Aaand... in the AIO monad we track all the outstanding semi-closed
> AHandles -- any time we hit a visible side-effecting computation (e.g.
> anything which flushes output) we go through the list and touch each ref
> -- this forces all the outstanding lazy strings to buffer themselves,
> and in particular block if necessary before the output is flushed.
>
> This delays strictifying the strings until a visible output event
> occurs, and minimizes input buffering (er, possibly at the cost of
> output buffering).
>
> This all seems kind of obvious, I suspect it's been done somewhere before...
>
> A.
>
> On 09/22/2010 09:19 AM, Jeffrey, Alan S A (Alan) wrote:
>> OMG, it's the early 1990s all over again :-)
>>
>> On 09/22/2010 03:10 AM, Lennart Augustsson wrote:
>>> I think Agda should abandon lazy IO (and thus semi-closed handles).
>>> They are a constant source of nuisance in Haskell and programs using
>>> them can only be explained by assuming that the IO monad gives totally
>>> random results.  You can explain them in the same way in Agda, I guess
>>> but it's not pretty.
>>
>> Unsurprisingly, I agree.  The semantics of Agda IO should be strict --
>> there may be some cleverness we can do under the hood with buffering
>> output to allow an implementation in terms of lazy IO, but that
>> shouldn't be exposed to the Agda programmer.
>>
>>> (Here's the argument why program₁ and program₂ can behave differently
>>> even if they are equivalent:
>>> Anything happening in the IO monad is random, so running program₁
>>> twice can result if different behaviour.
>>> So even if program₁ ≡ program₂ you can't expect running program₁
>>> necessarily giving the same result as program₂.)
>>
>> You can prove a lot of equivalences in the one point domain :-)
>>
>>>      -- Lennart
>>
>> A.
>>
>>>
>>> On Wed, Sep 22, 2010 at 4:30 AM, Alan Jeffrey<ajeffrey at bell-labs.com>    wrote:
>>>> Hi everyone,
>>>>
>>>> Starting a new thread, since we've stopped talking about FFIs a while ago.
>>>>     Old thread is here:
>>>>
>>>>     http://article.gmane.org/gmane.comp.lang.agda/1936
>>>>
>>>> I kicked the hornets nest and asked why IO is based on Costring rather than
>>>> String.  It seems to me that the culprit is semi-closed handles in Haskell,
>>>> as described here:
>>>>
>>>>     http://www.haskell.org/onlinelibrary/io.html#sect21.2.2
>>>>
>>>> "[After a call to hGetContents hdl], hdl is effectively closed, but items
>>>> are read from hdl on demand and accumulated in a special list returned by
>>>> hGetContents hdl."
>>>>
>>>> AFAICT, this is the underlying source of laziness in the IO library, the
>>>> other functions return results strictly, for example there's no problem with
>>>> hGetLine having the Agda type (Handle ->    IO String) as it waits until a NL
>>>> before returning a fully evaluated string.
>>>>
>>>> The discussion around IO and lazy strings made me realize there's a question
>>>> as to what the goals for the IO library should be.  I will posit an
>>>> innocent-looking goal, and then give a counterexample using the current IO
>>>> bindings :-)
>>>>
>>>>     GOAL:
>>>>     Extensionally equivalent programs have the same runtime behaviour.
>>>>
>>>> Obviously this is an informal goal, as the Haskell runtime system isn't
>>>> formalized, but at least we can do experiments.  For example, here are a
>>>> couple of programs which do not have the same runtime behaviour:
>>>>
>>>> -- This program blocks until it receives input
>>>>
>>>> hello₁ : Costring → IO Unit
>>>> hello₁ []       = putStrLn (fromString "hello")
>>>> hello₁ (x ∷ xs) = putStrLn (fromString "hello")
>>>>
>>>> program₁ : IO Unit
>>>> program₁ = getContents>>= hello₁
>>>>
>>>> -- This program prints immediately
>>>>
>>>> hello₂ : Costring → IO Unit
>>>> hello₂ x = putStrLn (fromString "hello")
>>>>
>>>> program₂ : IO Unit
>>>> program₂ = getContents>>= hello₂
>>>>
>>>> For extensional equivalence, I should probably define a proper equivalence
>>>> on IO, which includes the monad laws, but for right now I'll do a hack and
>>>> extend propositional equivalence with a postulate expressing that>>= is a
>>>> congruence (note, wrt extensional equivalence on the right):
>>>>
>>>>     postulate _⟨>>=⟩_ :
>>>>       {A B : Set} → {io₁ io₂ : IO A} → {f₁ f₂ : A → IO B} →
>>>>         (io₁ ≡ io₂) → (∀ a → f₁ a ≡ f₂ a) →
>>>>           (io₁>>= f₁) ≡ (io₂>>= f₂)
>>>>
>>>> Unfortunately, it is easy to show that (program₁ ≡ program₂), thus giving a
>>>> counterexample to the goal.  (The proof uses the snapshot 2.2.7 compiler,
>>>> but can probably be munged to go through in 2.2.6).
>>>>
>>>> Possible fixes:
>>>>
>>>> a) Abandon the goal (er, so what's the point of mechanized proof then?)
>>>>
>>>> b) Abandon semi-closed handles (ouch, buffering inefficencies!)
>>>>
>>>> c) Er, something really clever I've not thought of.
>>>>
>>>> Thoughts?
>>>>
>>>> A.
>>>>
>>>> _______________________________________________
>>>> Agda mailing list
>>>> Agda at lists.chalmers.se
>>>> https://lists.chalmers.se/mailman/listinfo/agda
>>>>
>>>>
>>
>