[Agda] Allow certain whitespace characters in identifiers

Andreas Abel abela at chalmers.se
Sat Apr 2 12:33:00 CEST 2016


:D  Good one!

Thanks for saving us from April 1st boredom!

On 01.04.2016 14:38, David Raymond Christiansen wrote:
> Dear Agda community,
>
> We are writing to announce the availability of a new extension to Agda:
> full support for reprogrammable Unicode whitespace. One of the best
> features of Agda is that we are able to flexibly adapt its syntax to the
> domains that we work in - for instance, instead of:
>
>      data Term : Context -> Type -> Set where
>        ...
>
> one can write:
>
>      data _⊢_ : Γ → τ → Set where
>        ...
>
> This makes Agda code far more accessible!
>
> Unfortunately, it was previously impossible to change the meaning of
> whitespace characters, or to use whitespace in the names of
> constructors. In our extension to Agda, we enable the following Unicode
> characters to be used like any other identifier or operator character:
>
> * The non-breaking space character ' '
> * The ideographic space character ' '
> * The medium mathematical space character ' '
> * The zero width non-breaking space character ''
> * The zero width joiner character '‍'
>
> Also, we relax a limitation in Agda's lexical syntax that disallowed the
> zero-width space character from being used as whitespace, which is
> essential when using custom notation (e.g. in our embedding of Edwin
> Brady's whitespace language, we can write " ​​ ​​ ​​ ​​ ​​ ​​ ​​
>   ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​‍" in our Agda buffer for
> a part of the Hello World program).
>
> As evidence for the utility of this extension, we present two
> demonstrations: a representation of untyped lambda terms using de Bruijn
> indices, and an embedding of the syntax of Whitespace.
>
> We represent our lambda terms as follows:
>
>      module LambdaTerms where
>        data Nat : Set where
>          z : Nat
>          s : Nat -> Nat
>
>        infixl 99999 _ _
>        infix 99 
>
>        data Tm : Set where
>          _ _ : Tm -> Tm -> Tm
>          ƛ‍_  : Tm -> Tm
>           : Nat -> Tm
>
>        self apply : Tm
>        self apply = ƛ‍​​z​ ​​z​
>
> The constructor for application is the infix use of the medium
> mathematical space, to reflect the mathematical nature of function
> application. Lambda terms are, as one would expect from the name, joined
> to their bodies via a zero-width joiner character, and their scopes
> space out with a non-breaking space. Because the de Bruijn indices
> should naturally appear to a part of the term, we inject them with a
> constructor whose name is the zero-width non-breaking space. Agda's need
> for spacing between operators can be satisfied with '​', preserving the
> readability of the embedding.
>
> The whitespace language is embedded as a list of commands:
>
>      module   where
>        data WS : Set where
>                : WS
>
>        infixr 100 __
>
>        data Program : Set where
>          ‍ : Program
>          __ : WS -> Program -> Program
>
>
>        helloworld : Program
>        helloworld =  ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ 
> ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​
>   ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​
> ​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​
> ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ 
> ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​
> ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​
> ​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​
> ​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ 
> ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​
>   ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​
> ​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​ ​​‍
>
> We have had to make some compromises for practicality - overridable
> newlines would be absolutely unreadable, after all. So we have chosen to
> use the non-breaking space, the ideographic space, and the mathematical
> medium space to represent space, tab, and newline, respectively. The
> commands are built in a list-like structure, using zero-width
> non-breaking spaces as cons and the zero-width joiner as nil.
>
> Recall that, in Emacs, one can enter arbitrary Unicode using C-x 8 RET
> CHARACTER NAME RET, where the CHARACTER NAME is the official all-caps
> Unicode designator for the character in question. Future work includes
> adapting Agda-mode to make it even easier to work with these encodings.
>
>
> Happily, this change did not even require major changes to Agda. We
> attach the patch to this email in hopes that others will find this
> extension as useful as we have when adopting syntax from the domains
> that we work in.
>
> Best,
>
> David Christiansen & Cameron Swords
>
>
> _______________________________________________
> Agda mailing list
> Agda at lists.chalmers.se
> https://lists.chalmers.se/mailman/listinfo/agda
>


-- 
Andreas Abel  <><      Du bist der geliebte Mensch.

Department of Computer Science and Engineering
Chalmers and Gothenburg University, Sweden

andreas.abel at gu.se
http://www2.tcs.ifi.lmu.de/~abel/


More information about the Agda mailing list