[r6rs-discuss] [Formal] U+FFFD not intended for encoding errors
John Cowan
cowan at ccil.org
Fri Sep 22 18:27:34 EDT 2006
Marcin 'Qrczak' Kowalczyk scripsit:
> "For example, in UTF-8 every code unit of the form 110xxxxx must be
> followed by a code unit of the form 10xxxxxx. A sequence such as
> 110xxxxx 0xxxxxxx is ill-formed and must never be generated. When
> faced with this ill-formed code unit sequence while transforming or
> interpreting text, a conformant process must treat the first code unit
> 110xxxxx as an illegally terminated code unit sequence, for example,
> by signaling an error, filtering the code unit out, or representing
> the code unit with a marker such as U+FFFD replacement character."
Good catch. I withdraw my comment.
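
For illustration, the three options named in the quoted passage (signal an
error, filter the code unit out, or substitute U+FFFD) can be sketched with
Python's built-in UTF-8 decoder and its "strict", "ignore", and "replace"
error handlers. This is only a sketch of the behaviours the passage permits,
not anything proposed in the thread; the helper name decode_three_ways and
the sample bytes are assumptions chosen for the example.

```python
# 0xC3 is a lead byte of the form 110xxxxx; 0x41 ("A") has the form 0xxxxxxx,
# so this two-byte sequence is ill-formed UTF-8.
ill_formed = bytes([0b11000011, 0b01000001])


def decode_three_ways(data):
    """Hypothetical helper: return (signaled_error, filtered, replaced)."""
    # Option 1: signal an error on the illegally terminated sequence.
    try:
        data.decode("utf-8", errors="strict")
        signaled_error = False
    except UnicodeDecodeError:
        signaled_error = True
    # Option 2: filter the offending code unit out.
    filtered = data.decode("utf-8", errors="ignore")
    # Option 3: represent the offending code unit with U+FFFD.
    replaced = data.decode("utf-8", errors="replace")
    return signaled_error, filtered, replaced


result = decode_three_ways(ill_formed)
```

In each case only the lead byte is treated as the illegally terminated
sequence; the following 0xxxxxxx byte is still decoded as "A".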
--
Real FORTRAN programmers can program FORTRAN    John Cowan
in any language.  --Allen Brown                 cowan at ccil.org