Am I paranoid enough?

David-Sarah Hopwood david.hopwood at industrial-designers.co.uk
Mon Feb 16 07:16:58 PST 2009


Suppose that S is a Unicode string in which each character matches
ValidChar below, that does not contain the substrings "<!", "</" or
"]]>", and that does not contain "&" followed by a character not
matching AmpFollower. S encodes a syntactically correct ES3 or ES3.1
source text chosen by an attacker.

  ValidChar :: one of
    '\u0009' '\u000A' '\u000D' // TAB, LF, CR
    [\u0020-\u007E]
    [\u00A0-\u00AC]
    [\u00AE-\u05FF]
    [\u0604-\u06DC]
    [\u06DE-\u070E]
    [\u0710-\u17B3]
    [\u17B6-\u200A]
    [\u2010-\u2027]
    [\u202F-\u205F]
    [\u2070-\uD7FF]
    [\uE000-\uFDCF]
    [\uFDF0-\uFEFE]
    [\uFF00-\uFFEF]

  AmpFollower :: one of
    '=' '(' '+' '-' '!' '~' '"' '/' [0-9]
    '\u0027' '\u005C' '\u0020' '\u0009' '\u000A' '\u000D'
    // single quote, backslash, space, TAB, LF, CR

(ValidChar excludes format control characters, and some other
characters known to be mishandled by browsers. AmpFollower is
intended to exclude characters that can start an entity reference.)
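
For concreteness, the restrictions on S could be checked mechanically
along the following lines. This is only a sketch: the names
VALID_RANGES, AMP_FOLLOWERS, isValidChar and satisfiesRestrictions are
hypothetical, the ranges are compared against UTF-16 code units (so
anything outside the BMP is rejected, which is what the ranges above
imply), and a trailing "&" is accepted because the wording only
forbids "&" followed by a non-matching character.

```javascript
// Inclusive [low, high] code-unit ranges, transcribed from ValidChar.
var VALID_RANGES = [
  [0x0009, 0x000A], [0x000D, 0x000D],   // TAB, LF, CR
  [0x0020, 0x007E],
  [0x00A0, 0x00AC],
  [0x00AE, 0x05FF],
  [0x0604, 0x06DC],
  [0x06DE, 0x070E],
  [0x0710, 0x17B3],
  [0x17B6, 0x200A],
  [0x2010, 0x2027],
  [0x202F, 0x205F],
  [0x2070, 0xD7FF],
  [0xE000, 0xFDCF],
  [0xFDF0, 0xFEFE],
  [0xFF00, 0xFFEF]
];

// Characters allowed immediately after "&", transcribed from AmpFollower:
// = ( + - ! ~ " / 0-9 ' \ space TAB LF CR
var AMP_FOLLOWERS = "=(+-!~\"/0123456789'\\ \t\n\r";

// True if the UTF-16 code unit falls in one of the ValidChar ranges.
function isValidChar(code) {
  for (var i = 0; i < VALID_RANGES.length; i++) {
    if (code >= VALID_RANGES[i][0] && code <= VALID_RANGES[i][1]) {
      return true;
    }
  }
  return false;
}

// True if s meets the character-level restrictions stated for S.
// It does NOT check that s is syntactically correct ES3/ES3.1.
function satisfiesRestrictions(s) {
  if (s.indexOf("<!") !== -1 || s.indexOf("</") !== -1 ||
      s.indexOf("]]>") !== -1) {
    return false;
  }
  for (var i = 0; i < s.length; i++) {
    if (!isValidChar(s.charCodeAt(i))) {
      return false;
    }
    // Assumption: a "&" that is the last character of s is accepted.
    if (s.charAt(i) === "&" && i + 1 < s.length &&
        AMP_FOLLOWERS.indexOf(s.charAt(i + 1)) === -1) {
      return false;
    }
  }
  return true;
}
```

Under this reading, "x = a & b;" is accepted and "x = '</b>';" is
rejected. Note that "&" is not itself listed in AmpFollower, so the
sketch as transcribed also rejects "a && b".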

S is inserted between "<script>" and "</script>" in a place where a
<script> tag is allowed in an otherwise valid HTML document, or
between "<script><![CDATA[" and "]]></script>" in a place where a
<script> tag is allowed in an otherwise valid XHTML document.
The HTML or XHTML document starts with a correct <!DOCTYPE or
<?xml declaration respectively, and is encoded as well-formed
UTF-8.
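
To make the two embedding contexts explicit, here is a small sketch of
the wrapping step. The function names embedInHtml and embedInXhtml are
hypothetical, and the surrounding document, its <!DOCTYPE or <?xml
declaration, and its UTF-8 encoding are assumed to be handled
elsewhere.

```javascript
// HTML case: S goes directly between the script tags.
function embedInHtml(s) {
  return "<script>" + s + "</script>";
}

// XHTML case: S goes inside a CDATA section within the script element.
function embedInXhtml(s) {
  return "<script><![CDATA[" + s + "]]></script>";
}
```

Neither function validates its argument; they assume S has already
been checked against the restrictions above.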


Are these restrictions sufficient to ensure that the embedded
script is interpreted as it would have been if referenced from
an external file, foiling any attempt by a browser to collude
with the attacker in misparsing it?

Are some of the restrictions unnecessary?

-- 
David-Sarah Hopwood ⚥


