Making the identifier identification strawman less restrictive
mathias at qiwi.be
Sun Oct 6 06:01:45 PDT 2013
This is about the identifier identification strawman: http://wiki.ecmascript.org/doku.php?id=strawman:identifier_identification
For tooling, it’s better to have a false positive than to have a false negative. In the case of identifier identification, it’s more useful to accept an identifier that the latest Unicode version permits than to reject it, even if some engines that use data tables based on older Unicode versions don’t support it.
In general, tools try to be lenient rather than restrictive in the input they accept. The strawman’s list of ECMAScript 5 parsers that handle non-ASCII symbols in identifiers backs this up: instead of Unicode 3.0.0 data, they use more recent Unicode versions, in an attempt to handle as many technically valid identifiers as possible.
* Esprima and Acorn parse identifiers as per Unicode 6.3.0.
* UglifyJS v1 and v2 use Unicode 6.1.0, which, as far as ECMAScript 5.1 identifiers go, is identical to Unicode 6.3.0.
For these reasons, I’d suggest changing the identifier identification proposal as follows. Step 8 currently says:
> If `edition` is `3` or `5`, let `unicode` be `3.0`.
Change that into step `8a`:
> If `edition` is `3`, let `unicode` be `3.0`.
Then, add a new step `8b`:
> If `edition` is `5`, let `unicode` be `6.3`.
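To make the proposed change concrete, here is a minimal sketch of how steps `8a` and `8b` might look in code. The function name `unicodeVersionForEdition` is hypothetical and is not taken from the strawman or from the prollyfill; only editions 3 and 5 are covered, since those are the only ones this proposal touches.

```javascript
// Hypothetical helper illustrating the proposed steps 8a and 8b.
// It maps an ECMAScript edition to the Unicode version whose data
// tables would be used to validate identifiers.
function unicodeVersionForEdition(edition) {
  if (edition === 3) {
    return "3.0"; // step 8a: unchanged for ECMAScript 3
  }
  if (edition === 5) {
    return "6.3"; // step 8b: proposed, instead of "3.0"
  }
  // Other editions are handled by other steps of the strawman,
  // which this sketch does not attempt to reproduce.
  return undefined;
}
```

Under the current step 8, the edition 5 branch would return `"3.0"` as well; the proposal changes only that branch.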
P.S. I’ve created an identifier identification prollyfill (https://github.com/mathiasbynens/identifier-identification) based on the current strawman. I’ll happily modify it if the strawman gets updated in any way.