[r6rs-discuss] [Formal] the CHAR? type
cowan at ccil.org
Fri Nov 17 14:40:35 EST 2006
Thomas Lord scripsit:
> The restriction in section 9.14, prohibiting the domain of
> INTEGER->CHAR from including surrogates, should be relaxed.
> Implementations should be permitted, not required, to adopt
> that restriction.
I'm against it.
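
For readers who want the restriction under discussion spelled out: a Unicode scalar value is any code point from 0 to 0x10FFFF except the surrogate range 0xD800 to 0xDFFF. The sketch below is Python, not Scheme, and `is_scalar_value` and `integer_to_char` are hypothetical names chosen for illustration; it only approximates what a strict INTEGER->CHAR would accept and is not taken from the draft report.

```python
SURROGATE_FIRST = 0xD800
SURROGATE_LAST = 0xDFFF
MAX_CODE_POINT = 0x10FFFF


def is_scalar_value(n: int) -> bool:
    """True if n is a Unicode scalar value (a code point that is not a surrogate)."""
    in_range = 0 <= n <= MAX_CODE_POINT
    is_surrogate = SURROGATE_FIRST <= n <= SURROGATE_LAST
    return in_range and not is_surrogate


def integer_to_char(n: int) -> str:
    """Hypothetical strict analogue of INTEGER->CHAR.

    Rejects surrogates and out-of-range integers instead of
    producing a character object for them.
    """
    if not is_scalar_value(n):
        raise ValueError(f"not a Unicode scalar value: {n:#x}")
    return chr(n)
```

Under the relaxed rule proposed above, the `is_surrogate` test would simply be dropped, and 0xD800 would become a legal character.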
> 1. In general, the less restricted model is simpler and more
> powerful. In an implementation without the restriction,
> the CHAR? type can simply be isomorphic with a set of
> exact integers in some (possibly improper) superset of
If you want u32 vectors, you know where to find them.
> That enables things like "bucky bits"
> (a fine lisp tradition).
Such a fine old tradition, in fact, that they were made optional in CLtL1
and removed altogether from CLtL2/ANSI CL. They were also accompanied
in CLtL1 by a type called "string-char", which implementations could
define as a subset of "char" that excluded some or all of the bucky bits.
Allowing arbitrary u32 values without creating a string-char type means
that at least one means of representing strings must be as a u32 vector.
Using the Unicode definition makes it possible to use UTF-8 or UTF-16
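
A small illustration of the representation point, again in Python and only as a sketch: scalar values survive a trip through UTF-8 or UTF-16, while a lone surrogate has no well-formed encoding in either, so a string type that admits surrogates cannot be stored that way. The helper name `round_trips` is an assumption made for this example, and the behaviour shown is that of Python's standard codecs, not of any Scheme implementation.

```python
def round_trips(ch: str, encoding: str) -> bool:
    """Return True if ch can be encoded and decoded back unchanged.

    Python's strict UTF-8 and UTF-16 codecs refuse lone surrogates,
    which is reported here as a failed round trip.
    """
    try:
        return ch.encode(encoding).decode(encoding) == ch
    except UnicodeError:
        return False


if __name__ == "__main__":
    samples = {
        "LATIN SMALL LETTER A": "a",
        "GREEK SMALL LETTER LAMDA": "\u03bb",
        "MUSICAL SYMBOL G CLEF": "\U0001d11e",  # outside the BMP
        "lone high surrogate": chr(0xD800),
    }
    for name, ch in samples.items():
        for encoding in ("utf-8", "utf-16-le"):
            print(f"{name:28} {encoding:10} {round_trips(ch, encoding)}")
```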
> It is certainly easy to teach and learn. It seems to be simpler to
> implement, too.
So is weak typing a la C.
> 2. The I/O issues can be solved in a clever way -- by
> reinterpreting ill-formed UTF-8 and UTF-16 as spellings
> of sequences of certain private-use codepoints.
> Round-trips with processes that don't understand these
> private use characters are perfectly robust to the
> extent that those processes are conforming.
Those who try to reinvent Unicode, etc. There are several ways to resolve
ill-formed byte sequences: replace with U+FFFD, throw an exception,
ignore junk. This is just what is already provided.
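
The three policies can be seen side by side in a short sketch. It uses the error handlers of Python's standard UTF-8 codec (`replace`, `strict`, `ignore`) as stand-ins; the function name `decode_with_policy` and the sample bytes are assumptions for illustration, and nothing here is meant to describe the R6RS I/O API.

```python
# 0xC3 opens a two-byte UTF-8 sequence, but the next byte (a space)
# is not a continuation byte, so the input is ill-formed.
ILL_FORMED = b"caf\xc3 au lait"


def decode_with_policy(data: bytes, policy: str) -> str:
    """Decode UTF-8 using one of Python's standard error handlers.

    'replace' substitutes U+FFFD for the bad bytes,
    'strict'  raises UnicodeDecodeError,
    'ignore'  silently drops the bad bytes.
    """
    return data.decode("utf-8", errors=policy)


if __name__ == "__main__":
    print(repr(decode_with_policy(ILL_FORMED, "replace")))
    print(repr(decode_with_policy(ILL_FORMED, "ignore")))
    try:
        decode_with_policy(ILL_FORMED, "strict")
    except UnicodeDecodeError as err:
        print("strict:", err.reason)
```

None of the three maps the offending bytes to private-use code points; that would be a fourth policy layered on top of these.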
But you, Wormtongue, you have done what you could for your true master. Some
reward you have earned at least. Yet Saruman is apt to overlook his bargains.
I should advise you to go quickly and remind him, lest he forget your faithful
service. --Gandalf John Cowan <cowan at ccil.org>