[r6rs-discuss] Stateful codecs and inefficient transcoding
per at bothner.com
Mon Oct 30 15:30:41 EST 2006
William D Clinger wrote:
> In other words, you can expect get-char to deliver
> about half the performance of C's getc when reading
> characters one at a time.
In many cases, sure. But consider a call to get-char. It cannot
"transcode-ahead", since the next call could be a get-u8. Well,
it probably could, as long as it can back up if need be, but
that could make for a complicated implementation. (One issue is
having to ignore errors encountered while transcoding ahead.)
Converting "simple" decodings can be done directly in the
Scheme implementation. By "simple" I mean UTF-8, Latin-1,
UTF-16BE, and UTF-16LE.
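
As a rough illustration of why the simple encodings are easy to handle
directly, here is a minimal C sketch of a UTF-8 decoder that consumes
exactly the bytes of one character, so a byte read can follow a character
read with no backing up. The names byte_port, bp_get_u8 and bp_get_char
are hypothetical stand-ins for a port, get-u8 and get-char, and returning
U+FFFD for malformed input is an assumed error policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical port over an in-memory byte buffer. */
typedef struct {
    const unsigned char *buf;
    size_t len;
    size_t pos;   /* index of the next unread byte */
} byte_port;

/* Analogue of get-u8: returns the next byte, or -1 at end of input. */
int bp_get_u8(byte_port *p)
{
    assert(p != NULL);
    return p->pos < p->len ? p->buf[p->pos++] : -1;
}

/* Analogue of get-char for UTF-8: returns one code point, or -1 at end
 * of input. Malformed or truncated input consumes a single byte and
 * yields U+FFFD (an assumed policy). No byte beyond the decoded
 * character is ever consumed. */
long bp_get_char(byte_port *p)
{
    assert(p != NULL);
    if (p->pos >= p->len)
        return -1;

    unsigned char b0 = p->buf[p->pos];
    size_t need;      /* number of continuation bytes expected */
    uint32_t cp, min; /* code point so far, smallest non-overlong value */

    if (b0 < 0x80)                { need = 0; cp = b0;        min = 0; }
    else if ((b0 & 0xE0) == 0xC0) { need = 1; cp = b0 & 0x1F; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { need = 2; cp = b0 & 0x0F; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { need = 3; cp = b0 & 0x07; min = 0x10000; }
    else { p->pos++; return 0xFFFD; }   /* invalid lead byte */

    if (p->len - p->pos - 1 < need) {   /* truncated sequence */
        p->pos++;
        return 0xFFFD;
    }
    for (size_t i = 1; i <= need; i++) {
        unsigned char c = p->buf[p->pos + i];
        if ((c & 0xC0) != 0x80) {       /* not a continuation byte */
            p->pos++;
            return 0xFFFD;
        }
        cp = (cp << 6) | (c & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values past U+10FFFF. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        p->pos++;
        return 0xFFFD;
    }
    p->pos += need + 1;
    return (long)cp;
}
```

Because the decoder only looks at bytes it is about to consume, calls to
bp_get_char and bp_get_u8 can be interleaved freely on the same port.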
More complex table-driven decoding would be ridiculous
to do in Scheme. Not a priori, but because it makes much more
sense to use existing libraries, such as iconv.
So you really have to explain how you would implement
character decoding using iconv while still being only
twice as slow as C, and allowing a get-char to be followed
by a get-u8.
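
To make the concern concrete: one naive way to get a single character
out of iconv without reading ahead is to hand it a growing window of
input bytes until it emits exactly one character. The C sketch below
does that. It is only an illustration, not a proposed implementation:
iconv_port, ip_open, ip_get_u8 and ip_get_char are hypothetical names,
it assumes an iconv that accepts "UTF-32LE" as a target (glibc and GNU
libiconv do), and it assumes each input character maps to a single
output code point. It makes at least one iconv call per character, which
is exactly the overhead in question.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical port: an in-memory byte buffer plus an iconv converter. */
typedef struct {
    const unsigned char *buf;
    size_t len;
    size_t pos;   /* index of the next unread byte */
    iconv_t cd;   /* converts the source encoding to UTF-32LE */
} iconv_port;

/* Returns 0 on success, -1 if iconv cannot open the conversion. */
int ip_open(iconv_port *p, const unsigned char *buf, size_t len,
            const char *encoding)
{
    assert(p != NULL);
    p->buf = buf;
    p->len = len;
    p->pos = 0;
    p->cd = iconv_open("UTF-32LE", encoding);
    return p->cd == (iconv_t)-1 ? -1 : 0;
}

/* Analogue of get-u8: returns the next raw byte, or -1 at end of input. */
int ip_get_u8(iconv_port *p)
{
    assert(p != NULL);
    return p->pos < p->len ? p->buf[p->pos++] : -1;
}

/* Analogue of get-char: returns one code point, or -1 at end of input
 * or on a conversion error. Never consumes bytes past the character. */
long ip_get_char(iconv_port *p)
{
    assert(p != NULL);
    size_t n = 1;                       /* current window size in bytes */
    while (p->pos + n <= p->len) {
        char *in = (char *)(p->buf + p->pos);  /* iconv wants char ** */
        size_t inleft = n;
        unsigned char cell[4];          /* room for one UTF-32 unit */
        char *out = (char *)cell;
        size_t outleft = sizeof cell;

        size_t r = iconv(p->cd, &in, &inleft, &out, &outleft);
        size_t consumed = n - inleft;

        if (outleft == 0) {             /* one full character produced */
            p->pos += consumed;
            return (long)((uint32_t)cell[0] | (uint32_t)cell[1] << 8 |
                          (uint32_t)cell[2] << 16 | (uint32_t)cell[3] << 24);
        }
        if (r == (size_t)-1 && errno != EINVAL)
            return -1;                  /* e.g. EILSEQ: invalid input */

        /* Either the window ends mid-character (EINVAL, nothing
         * consumed) or iconv consumed bytes without output, such as a
         * shift sequence in a stateful encoding. */
        p->pos += consumed;
        n = consumed > 0 ? 1 : n + 1;
    }
    return -1;
}
```

Note that a raw byte read between two character reads leaves the
converter's shift state untouched, which may or may not be what a
stateful encoding needs; the caller should iconv_close the descriptor
when done with the port.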
Recommendation: allow get-char/read/... after get-u8/get-bytes-n/...
but do not require an implementation to support the converse
(reading bytes after reading characters). Or only require support
for reading bytes after reading characters for a few simple
standard encodings, primarily UTF-8.
per at bothner.com http://per.bothner.com/