Should Decode accept U+FFFE or U+FFFF (and other Unicode non-characters)?

Jeff Walden jwalden+es at MIT.EDU
Thu Jul 14 22:38:01 PDT 2011


Reraising this issue...

To briefly repeat: Decode, called by decodeURI{,Component}, says to reject %ab%cd%ef sequences whose octets "[do] not contain a valid UTF-8 encoding of a Unicode code point".  It appears browsers interpret this requirement as: reject overlong UTF-8 sequences, and otherwise reject only unpaired or mispaired surrogate "code points".  Is this exactly what ES5 requires?  And if it is, should it be?  Firefox has also treated otherwise-valid-looking encodings of U+FFFE and U+FFFF as specifying that the replacement character U+FFFD be used.  And the rationale for rejecting U+FFF{E,F} also seems to apply to the non-character range [U+FDD0, U+FDEF] and U+xyFF{E,F}.  Table 21 seems to say only malformed encodings and bad surrogates should be rejected, but "valid encoding of a code point" is arguably unclear.
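To make the cases concrete, here is a minimal sketch that probes how a given engine's decodeURIComponent treats each category mentioned above. The helper name `probe` and the particular byte sequences are illustrative choices, not something taken from the spec or from test262, and the code assumes an engine with String.prototype.codePointAt (which postdates ES5).

```javascript
// Hypothetical helper: report what decodeURIComponent does with one
// percent-encoded UTF-8 sequence. Returns "U+XXXX" for the decoded code
// point, or "URIError" if the engine rejects the sequence.
function probe(encoded) {
  try {
    const s = decodeURIComponent(encoded);
    const hex = s.codePointAt(0).toString(16).toUpperCase();
    return "U+" + hex.padStart(4, "0");
  } catch (e) {
    if (e instanceof URIError) return "URIError";
    throw e; // anything else is unexpected
  }
}

// Illustrative inputs, one per category discussed in the message.
const cases = [
  ["%EF%BF%BE", "U+FFFE (non-character)"],
  ["%EF%BF%BF", "U+FFFF (non-character)"],
  ["%EF%B7%90", "U+FDD0 (non-character range U+FDD0..U+FDEF)"],
  ["%F0%9F%BF%BE", "U+1FFFE (non-character of the U+xyFFFE form)"],
  ["%C0%80", "overlong encoding of U+0000"],
  ["%ED%A0%80", "unpaired surrogate U+D800"],
];

for (const [encoded, label] of cases) {
  console.log(encoded.padEnd(14), label, "->", probe(encoded));
}
```

An engine that follows the "reject only malformed encodings and bad surrogates" reading would be expected to throw for the last two inputs and decode the first four; an engine that substitutes the replacement character would report U+FFFD for some of them instead.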

At least one person interested in Firefox's decoding implementation argues that not rejecting or replacing U+FFF{E,F} is a "potential security vulnerability" because those code points (particularly U+FFFE) might confuse code into interpreting a sequence of code points with the wrong endianness.  I find the argument unpersuasive and the potential harm too speculative (particularly as no other browser replaces or rejects U+FFF{E,F}).  But the point's been raised, and it's at least somewhat plausible, so I'd like to see it conclusively addressed.
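For context on the endianness concern: U+FFFE is what the byte order mark U+FEFF looks like when its two UTF-16 bytes are read in the wrong order, which is presumably why it is singled out. The sketch below only demonstrates that byte swap; the function name `readByteSwappedBOM` is hypothetical, and nothing here shows that decodeURI output actually reaches code that sniffs byte order.

```javascript
// Write the byte order mark U+FEFF as big-endian UTF-16 (bytes FE FF),
// then read the same two bytes back as little-endian.
function readByteSwappedBOM() {
  const view = new DataView(new ArrayBuffer(2));
  view.setUint16(0, 0xFEFF, false); // false = big-endian write
  return view.getUint16(0, true);   // true = little-endian read
}

const misread = readByteSwappedBOM();
console.log("U+" + misread.toString(16).toUpperCase());
```

A consumer that treats a leading U+FFFE as "BOM seen with swapped bytes" would flip its byte order at that point, which is the confusion the argument has in mind.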

A last note: two test262 tests directly exercise the Decode algorithm and expect that these two characters decode to U+FFF{E,F}.  (I think at a glance they might also allow throwing, though it's not clear to me that's intentional.)

http://hg.ecmascript.org/tests/test262/file/b4690e1408ee/test/suite/sputnik_converted/15_Native/15.1_The_Global_Object/15.1.3_URI_Handling_Function_Properties/15.1.3.1_decodeURI/S15.1.3.1_A2.4_T1.js
http://hg.ecmascript.org/tests/test262/file/b4690e1408ee/test/suite/sputnik_converted/15_Native/15.1_The_Global_Object/15.1.3_URI_Handling_Function_Properties/15.1.3.2_decodeURIComponent/S15.1.3.2_A2.4_T1.js

Jeff

