es-discuss Digest, Vol 81, Issue 82

Bjoern Hoehrmann derhoermi at gmx.net
Mon Nov 18 09:50:18 PST 2013


* mnita at google.com wrote:
>The first four bytes are:
>
>           00 00 00 22  UTF-32BE
>           00 22 65 E5  UTF-16BE
>           22 00 00 00  UTF-32LE
>           22 00 E5 65  UTF-16LE
>           22 E6 97 A5  UTF-8
>
>The UTF-16 bytes don't match the patterns in the RFC, so UTF-16 streams
>would (wrongly) be detected as UTF-8 if one strictly follows the RFC.

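RFC 4627 does not allow string literals at the top level. A JSON text
must be an object or an array, so its first two characters are always
ASCII and the patterns in the RFC hold.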
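
For illustration, the null-octet heuristic from RFC 4627, section 3 can be
sketched in Python roughly as follows. The function name
detect_json_encoding is hypothetical, and treating every pattern the RFC
does not list as UTF-8 is an assumption of this sketch.

```python
def detect_json_encoding(first4: bytes) -> str:
    """Guess the encoding from the null-octet pattern of the first four octets.

    Patterns (xx = non-zero octet), as listed in RFC 4627, section 3:
        00 00 00 xx  UTF-32BE
        00 xx 00 xx  UTF-16BE
        xx 00 00 00  UTF-32LE
        xx 00 xx 00  UTF-16LE
        xx xx xx xx  UTF-8
    """
    if len(first4) < 4:
        raise ValueError("need at least four octets")
    # Reduce each octet to "is it zero?" and compare against the patterns.
    zeros = tuple(octet == 0 for octet in first4[:4])
    if zeros == (True, True, True, False):
        return "UTF-32BE"
    if zeros == (True, False, True, False):
        return "UTF-16BE"
    if zeros == (False, True, True, True):
        return "UTF-32LE"
    if zeros == (False, True, False, True):
        return "UTF-16LE"
    # Assumption: anything else is treated as UTF-8.
    return "UTF-8"


# A top-level array starts with two ASCII characters, so the pattern matches.
array_text = '["\u65e5"]'
print(detect_json_encoding(array_text.encode("utf-16-be")))

# A bare string literal followed by a non-ASCII character does not match,
# which is the misdetection described in the quoted message.
string_text = '"\u65e5"'
print(detect_json_encoding(string_text.encode("utf-16-be")))
```

With a top-level array or object the first two characters are ASCII in
every encoding, so the heuristic classifies UTF-16 and UTF-32 input
correctly. The misdetection only appears for input such as a bare string
literal, which RFC 4627 does not accept as a JSON text.
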
-- 
Björn Höhrmann · mailto:bjoern at hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 

