[r6rs-discuss] string->utfX and utfX-string questions

John Cowan cowan at ccil.org
Fri May 25 20:16:21 EDT 2007


Brian C. Barnes scripsit:

> 1.       String->utfX
> 
> a.       Should the resulting byte-vector contain byte order marks for utf16
> and utf32?

Definitely not.
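
For comparison only, here is how the same convention looks in Python rather than Scheme. This is an analogy, not the R6RS procedures: Python's explicit-endianness codecs write the code units only and no byte order mark.

```python
# Python analogy, not R6RS: codecs that name an endianness emit no BOM.
text = "Hi"

be = text.encode("utf-16-be")   # bytes 00 48 00 69
le = text.encode("utf-16-le")   # bytes 48 00 69 00

# Neither result starts with a serialized U+FEFF.
has_bom_be = be.startswith(b"\xfe\xff")
has_bom_le = le.startswith(b"\xff\xfe")
```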

> 2.       utfX->string
> 
> a.       What is the expected result if a user specifies (for example)
> little endian, but the bytevector itself contains byte marks for big endian?

The string will begin with the non-character #\xFFFE (the BOM bytes
FE FF read in the wrong byte order), and the rest of it will be garbled.
(Don't do that, in other words.)
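
A small sketch of that failure, again using Python as a stand-in for the Scheme procedures (the variable names are illustrative only): big-endian data with a BOM is decoded as little-endian, so every 16-bit unit comes out byte-swapped.

```python
# Python analogy, not R6RS: decode big-endian bytes as little-endian.
# b"\xfe\xff" is U+FEFF (the BOM) serialized big-endian.
data = b"\xfe\xff" + "Hi".encode("utf-16-be")   # FE FF 00 48 00 69

# The explicit "-le" codec does not look for a BOM; it decodes every
# pair of bytes as a little-endian code unit.
garbled = data.decode("utf-16-le")

first = garbled[0]      # the byte-swapped BOM, U+FFFE
rest = garbled[1:]      # "H" and "i" byte-swapped to U+4800 and U+6900
```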

IMHO it would be better if the utf{16,32}->string functions were able
to take an additional argument specifying whether the endianness is
mandatory (BOM is treated as a character) or optional (if BOM is present,
believe it, otherwise use the endianness as a default).
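
One way the suggested extra argument might behave, sketched in Python. The function name `utf16_to_string` and its parameters `endianness` and `mandatory` are hypothetical, chosen for this illustration; they are not taken from any R6RS draft.

```python
import codecs

# Hypothetical helper illustrating the proposal; not an R6RS procedure.
def utf16_to_string(data: bytes, endianness: str = "big",
                    mandatory: bool = False) -> str:
    codec = {"big": "utf-16-be", "little": "utf-16-le"}
    if not mandatory:
        # Optional mode: if a BOM is present, believe it and strip it.
        if data.startswith(codecs.BOM_UTF16_BE):
            return data[2:].decode("utf-16-be")
        if data.startswith(codecs.BOM_UTF16_LE):
            return data[2:].decode("utf-16-le")
    # Mandatory mode, or no BOM found: use the given endianness and
    # treat any leading BOM bytes as an ordinary character.
    return data.decode(codec[endianness])
```

With `mandatory=False`, big-endian input carrying a BOM decodes correctly even when the caller passes "little"; with `mandatory=True`, the same input yields the garbled string described above.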

-- 
In politics, obedience and support      John Cowan <cowan at ccil.org>
are the same thing.  --Hannah Arendt    http://www.ccil.org/~cowan


