quasi-literal strawman

Mike Samuel mikesamuel at gmail.com
Sat Dec 19 10:12:25 PST 2009


Oh, and on the integration with document.write, innerHTML, and the
like: I was not going to ask the W3C to recognize a particular library
or object type.  The proposal was going to look like this:

Currently document.write coerces its arguments to strings.  Change
that so that if an argument is non-primitive, its toString is invoked
with the name of the current insertion mode, as defined in
http://www.w3.org/TR/html5/syntax.html#insertion-mode, and do the
same for innerHTML.

So
   <table><tr><script>
   document.write(foo);
   </script></tr></table>
would end up calling foo.toString("in row") and coercing the result to
a string, and
   myTextArea.innerHTML = bar
would end up calling bar.toString("in RAWTEXT/RCDATA") and coercing
the result to a string to get the actual innerHTML.
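
A rough sketch of how a library object might take part in that
coercion.  Everything here is hypothetical: Cell, escapeHtml, and
simulateWrite are illustrative names, simulateWrite only stands in for
the proposed document.write behavior, and the escaping policy is
deliberately simplistic.

```javascript
// Minimal HTML escaper used by the sketch (illustrative, not exhaustive).
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical library type that wraps untrusted plain text.
function Cell(text) {
  this.text = text;
}

// Under the proposal, the host would pass the insertion mode name here.
Cell.prototype.toString = function (insertionMode) {
  var escaped = escapeHtml(this.text);
  // Inside a table row, wrap the text in a cell; elsewhere emit text only.
  return insertionMode === "in row" ? "<td>" + escaped + "</td>" : escaped;
};

// Stand-in for the proposed coercion: primitives are stringified as today,
// anything else gets toString(insertionMode) and the result is coerced.
function simulateWrite(value, insertionMode) {
  var isPrimitive = value === null ||
      (typeof value !== "object" && typeof value !== "function");
  return isPrimitive ? String(value) : String(value.toString(insertionMode));
}
```

With these definitions, simulateWrite(new Cell("a<b"), "in row") yields
"<td>a&lt;b</td>", while the same object in "in RAWTEXT/RCDATA" mode
yields "a&lt;b".  The host never needs to know about Cell; it only
passes a mode name to toString.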

So I think the separation between library logic and standards is
pretty well-defined for both proposals.

I still have to write that up though.





2009/12/19 Mike Samuel <mikesamuel at gmail.com>:
> 2009/12/18 Andy Chu <andy at chubot.org>:
>>>> So then my question is why it needs to specify a desugaring.  Why is a
>>>> quasi-literal not a string?
>>>
>>> I still don't understand the question.
>>>
>>> Why `foo$bar` and not "foo$bar"?  Well, the latter doesn't do anything
>>> useful with the expression (bar).
>>
>> OK, so to back up a bit, the point of quasi-literals is to shift the
>> burden of escaping from the application developer to the library
>> author (e.g. authors of the DOM API).  Is that an accurate and
>> reasonably complete summary?  (If it is I might suggest making "safe
>> string interpolation" the feature name)
>
> "Safe String Interpolation" is the name of a quasi scheme that I would
> also like to propose.  And its goal is as you describe.
>
>
>> My question is why the entire quasi-literal scheme can't be in a
>> library.  In your doc you have something like:
>
> It can be done in a library but some syntactic sugar will make it much
> more usable.
>
>
>> new StringInterpolation(["SELECT * FROM TABLE WHERE name='", name,
>>     "' AND modified > ", new Date(d)])
>>
>> Does this variation express all of what quasi-literals can do?
>
> Since "new StringInterpolation" could be defined to do anything that a
> single user call can do, yes.
>
>
>> var sqlStatement = new StringInterpolation(
>>     "SELECT * FROM TABLE WHERE name=$name AND modified > $date",
>>     {name: name, date: new Date(d)})
>>
>> sqlLibrary.execute(sqlStatement)  // does "autoescaping" of name and date
>>
>> The syntax of the first string argument is the quasi-literal syntax in
>> your doc, with $, {}, etc.
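
For illustration, the library-only variant described above could be
sketched roughly as follows.  This is an assumption-laden toy: the
thread does not specify how StringInterpolation would be implemented,
quoteSql and renderSql are made-up helpers, only the bare $name form is
handled, and real code should rely on the database driver's parameter
binding rather than hand-rolled quoting.

```javascript
// Hypothetical library type: keeps developer-written text and
// substitution values apart so a consumer can escape only the values.
function StringInterpolation(template, values) {
  this.literals = [];  // trusted text from the template
  this.values = [];    // untrusted substitution values
  var re = /\$([A-Za-z_][A-Za-z0-9_]*)/g;
  var last = 0;
  var m;
  while ((m = re.exec(template)) !== null) {
    this.literals.push(template.slice(last, m.index));
    this.values.push(values[m[1]]);
    last = re.lastIndex;
  }
  this.literals.push(template.slice(last));
}

// Toy SQL quoting, for illustration only.
function quoteSql(value) {
  if (typeof value === "number") return String(value);
  if (value instanceof Date) return "'" + value.toISOString() + "'";
  return "'" + String(value).replace(/'/g, "''") + "'";
}

// What a consumer such as sqlLibrary.execute might do internally:
// interleave the literals with escaped values.
function renderSql(interp) {
  var out = interp.literals[0];
  for (var i = 0; i < interp.values.length; i++) {
    out += quoteSql(interp.values[i]) + interp.literals[i + 1];
  }
  return out;
}
```

The point of the split is that escaping happens in the consumer, which
knows the target language, rather than at the call site.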
>>
>> Reasons why I ask:
>> - People could use this scheme *now* if it were a library, and start
>> changing their APIs to autoescape.
>
> That's one reason to use desugaring.  People could use it now, and use
> a tool that does the desugaring, as long as the library is written in
> the intersection of ES5 strict and ES3.
>
>> - Quasi-literals have a bit of a meta-problem.  You're lamenting the
>> complex escaping rules of HTML (rightly), but then this proposal adds
>> a third escaping mode to JS, which is probably the most
>> programmatically-generated language on the planet (since it must go
>> over the network).  Certainly I'm already confused by the discussions
>> of escaping quasi-literals on this thread.
>
> This shouldn't affect existing correct code generators since, if they
> generate code with backquoted strings, they can't be correct.
>
>
>>> PHP and JSP were the gold standard when I built it, and Django and
>>> others have addressed that to some degree.
>>> Do you know of any statistics on how much PHP code is running versus
>>> Django code?
>>
>> I don't know, but certainly tons more PHP code.  BTW Smarty also does
>> this and I think this is the most common PHP templating language now:
>> http://www.smarty.net/manual/en/language.modifier.escape.php
>> (I don't understand why a templating language needs a templating
>> language, but I'll never understand PHP I suppose)
>
> I think it's partly because people want to piece-wise migrate away
> from PHP's easy-to-write-almost-correct-code,
> very-hard-to-write-correct-code string interpolation scheme.
>
>> I think the quasi-literals are a cool idea -- but they're also a
>> pretty large innovation.  This scheme is not used by any "production"
>> library or language that I know of.  The point is that innovation
>> in standards has the problems that Douglas Crockford has spoken about.
>
> I absolutely agree and I'm not proposing we innovate in the standard.
> That's why I hoped to standardize on a simple desugaring that is
> syntactically familiar to users of existing languages, and that will
> allow experimentation by library authors who, as you point out, have
> the best track record of successful innovation.
>
>> This would shift the boundaries between JavaScript and the DOM, and
>> make them an anomaly among literally hundreds of other
>> libraries/languages.  Template languages are a well known commodity by
>> now.
>
> I don't understand what you mean by shifting the boundaries.  Strings
> are not the de facto standard for moving data across module
> boundaries.  There are many languages that pass around structured
> content.
>
>> I would also say that the biggest boundary is actually getting people
>> to write auto-escaping.
>
> I'm not sure I understand.  I never said I wanted a bunch of people
> writing auto-escaping code.  It needs to be done by at least one
> library.
> As for clients, many people write templates using a variety of
> syntaxes.  The "auto" in "auto-escaping" means they just have to do in
> JS what they now do in other languages.
>
>> It's not the lack of syntax in the JS language.
>
> I disagree.  I find it much easier to knock out code in Perl or PHP
> that is almost right.
>
>
>> I didn't study the end of your doc, but it certainly isn't
>> a simple problem for someone to write HTML autoescaping.  Considering
>> how much HTML is malformed out there, I don't see how there won't be
>> holes 0.1% or 0.01% of the time due to heuristics.
>
> I believe I can demonstrate that it is that simple, but that is out of
> scope for this thread.
> If we can reduce XSS, the single largest source of vulnerabilities in
> web applications, to .1% or .01% of its current size then I think that
> will have been worth this change alone.
>
>> I would rather use my templating language and the equivalent of
>> auto-escaping via static analysis to get it 100% right (since a static
>> analysis tool can issue warnings where there is ambiguity, and a human
>> applies their judgement to fix their code before it has to run).
>
> I am skeptical that there is an easy migration path for existing JS
> code to something that is amenable to the kind of static analysis you
> describe, but I would love to be proven wrong.  I don't see anything
> on the JSON templates page about static analysis though.  Am I missing
> something?  Or is that more for the other languages in which it can be
> embedded than JS?
>
> By adding a bit of generally useful syntactic sugar, your templating
> scheme, secure string interpolation, and others can compete as
> libraries.
>
>
>> JSON Template is trivially tokenizable for this reason: \{.*?\} (or
>> e.g. \[.*?\] depending on the delimiter) splits it into literals and
>> substitutions.  Then you can put the work into autoescaping rather
>> than parsing the template language (which always sucks because of
>> escaping!).
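
A small sketch of the tokenizing step described above, using the
\{.*?\} pattern from the message.  The tokenize function and the token
shape are made up for illustration and are not taken from JSON
Template itself; note that "." does not match newlines here, so a
substitution cannot span lines.

```javascript
// Split a template into literal and substitution tokens using the
// non-greedy pattern \{.*?\} (with a capture group for the contents).
function tokenize(template) {
  var re = /\{(.*?)\}/g;
  var tokens = [];
  var last = 0;
  var m;
  while ((m = re.exec(template)) !== null) {
    if (m.index > last) {
      tokens.push({ type: "literal", text: template.slice(last, m.index) });
    }
    tokens.push({ type: "substitution", text: m[1] });
    last = re.lastIndex;
  }
  if (last < template.length) {
    tokens.push({ type: "literal", text: template.slice(last) });
  }
  return tokens;
}
```

For example, tokenize("Hello {name}, you have {count} items") produces
five tokens: the literals "Hello ", ", you have ", and " items", with
the substitutions "name" and "count" between them.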
>>
>>> What is locals()?  Why does expandTemplate need access to all locals
>>> to do its job instead of just the specified ones?
>>> Does this suffer from the "formatting string from untrusted source"
>>> problem that Python suffers from, and the "substitution value from
>>> untrusted source" problem?
>>
>> That was just a shorthand to get around the verbosity of passing in
>> {name: name, date: date}, and also specifying name and date in the
>> quasi-literal string.  It's not strictly necessary.
>>
>> Not saying that quasi-literals won't work, but things to consider.
>> It's an ambitious solution and it would be cool if we can forget about
>> escaping as application developers, but I see more than a few
>> obstacles.
>>
>> I think an existing library would help firm it up... I was thinking of
>> writing a mini-template.js that implements almost exactly what Python
>> 3k string formatting does.  This is a hole in the feature set of
>> JavaScript, irrespective of security and escaping.  I think a default
>> .format() method on strings would go a long way, rather than the
>> current "foo " + var + " bar" idiom.  Perhaps there could be a hook
>> for autoescaping, but it may be tricky.
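
To make the idea concrete, here is a minimal sketch of a format helper
with an optional escaping hook.  It is written as a free function
named format rather than a String.prototype method, the escape
parameter is an assumed design, and it covers only a small subset of
what Python 3 string formatting does (no format specs, no "{{"
escapes).

```javascript
// Replace {} (auto-numbered) and {key} fields with entries from values.
// If an escape function is supplied, every substituted value passes
// through it; unknown fields are left in place unchanged.
function format(template, values, escape) {
  var autoIndex = 0;
  return template.replace(/\{(\w*)\}/g, function (whole, key) {
    var k = key === "" ? autoIndex++ : key;
    if (!Object.prototype.hasOwnProperty.call(values, k)) {
      return whole;
    }
    var v = String(values[k]);
    return escape ? escape(v) : v;
  });
}
```

So format("foo {} bar", ["x"]) gives "foo x bar", and passing an HTML
escaper as the third argument applies it to each substituted value but
not to the template text.  A single hook like this cannot vary the
escaping by context, which may be the tricky part mentioned above.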
>
> If you would like to write up a String.prototype.format strawman and
> post it at http://wiki.ecmascript.org/doku.php?id=strawman:strawman
> then it can be considered alongside competing strawmen like this
> proposal.
>
>> Andy
>>
>

