Making the identifier identification strawman less restrictive

Mathias Bynens mathias at
Wed Oct 9 00:27:39 PDT 2013

CC’ing the creators of the tools we’ve been talking about to get their input. Hi guys! Please start reading here:

On 9 Oct 2013, at 07:48, Norbert Lindenberg <ecmascript at> wrote:

> - For a code transformation tool, such as CoffeeScript, I agree that you probably don't want to introduce any artificial restrictions, so you want to use the latest Unicode version possible. Step 10 of the proposed algorithm ("let unicode be the Unicode version supported by the implementation in ECMAScript identifiers") is intended to cover that case.

But that makes it an implementation-dependent, impure function, which is unacceptable for code transformation tools like CoffeeScript and parsers like Esprima, Acorn, or UglifyJS. They’d support certain identifiers in engine A but not in engine B, without any control over it. If this is how `String.isIdentifier{Start,Part}` works, I think these tools will stick to their custom identifier identification methods, which would defeat the purpose of the entire strawman. (Ariya, Marijn, Mihai: any thoughts?)
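To make the purity concern concrete, here is a minimal sketch (not the strawman’s text) of a check whose result depends only on its arguments, never on the engine it runs in. The name `isIdentifierStartPure` and the `TOY_LETTERS` table are hypothetical, and the table holds only a handful of ranges for illustration; a real tool would bundle complete, generated data for every Unicode version it supports.

```javascript
// Toy data, for illustration only (assumption): a few letter ranges per
// Unicode version. Real tools would generate full tables from the
// Unicode Character Database.
const TOY_LETTERS = new Map([
  ['3.0', [[0x41, 0x5A], [0x61, 0x7A]]],
  // Unicode 3.1 defined the first supplementary characters,
  // e.g. U+1D400..U+1D419 (mathematical bold capital A-Z).
  ['3.1', [[0x41, 0x5A], [0x61, 0x7A], [0x1D400, 0x1D419]]],
]);

// Hypothetical pure variant: the caller names the Unicode version, and the
// function refuses to guess when it has no bundled data for that version.
function isIdentifierStartPure(cp, unicodeVersion) {
  const ranges = TOY_LETTERS.get(unicodeVersion);
  if (ranges === undefined) {
    throw new RangeError('No bundled data for Unicode ' + unicodeVersion);
  }
  if (cp === 0x24 || cp === 0x5F) return true; // `$` and `_`
  return ranges.some(([lo, hi]) => cp >= lo && cp <= hi);
}
```

With this shape, a parser gets the same answer for the same `(cp, unicodeVersion)` pair in every engine, and an unsupported version fails loudly instead of silently falling back to whatever the host happens to implement.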

>> For these reasons, I’d suggest changing the identifier identification proposal as follows. […]
> That would create several problems:
> - The Unicode version for ES 5 would be above that for ES 6 (step 9).

I would love to see that changed too, but that’s an issue with the main ES spec.

> - Tools like JSHint, if they want to ensure compatibility with all ES 5 implementations, would have to lie and specify ES 3.

They don’t at the moment. @Anton, any thoughts?

> - Step 11 would allow all Unicode code points that are matched by the IdentifierStart production, including supplementary code points, which ES 5 does not permit in identifiers. (Note that Unicode 3.0, the version referenced by the ES 3 and ES 5 specs, was the last one that did not define any supplementary characters, so the spec as proposed doesn't have that problem).

Step 11 says “If cp is matched by the IdentifierStart production in edition `edition` of the ECMAScript Language Specification using Unicode version `unicode`, then return `true`” so this is not a problem either way. ES5 `IdentifierStart` doesn’t include supplementary code points, like you said, because of the way ES5 defines “character”.
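A rough sketch of how step 11 could behave under this reading: the edition decides what counts as a “character” at all, and only then is the Unicode version consulted. The names `matchesIdentifierStart` and `isLetterInVersion` are hypothetical, the letter predicate is a stub standing in for real Unicode data, and the sketch assumes that ES 6 matches full code points while earlier editions work on 16-bit code units.

```javascript
// Stub for per-version Unicode data (assumption): it knows only ASCII
// letters and U+1D400..U+1D419, which first appeared in Unicode 3.1.
// The version check is deliberately crude; it just separates 3.0 from later.
function isLetterInVersion(cp, unicodeVersion) {
  const ascii = (cp >= 0x41 && cp <= 0x5A) || (cp >= 0x61 && cp <= 0x7A);
  const mathBold = cp >= 0x1D400 && cp <= 0x1D419 && unicodeVersion !== '3.0';
  return ascii || mathBold;
}

// Hypothetical rendering of step 11 for IdentifierStart.
function matchesIdentifierStart(cp, edition, unicodeVersion) {
  // Before ES 6, a "character" is a 16-bit code unit, so a supplementary
  // code point can never match, whatever Unicode version is requested.
  if (edition < 6 && cp > 0xFFFF) return false;
  if (cp === 0x24 || cp === 0x5F) return true; // `$` and `_`
  return isLetterInVersion(cp, unicodeVersion);
}
```

Under these assumptions, `matchesIdentifierStart(0x1D400, 5, '6.3')` is rejected by the edition check alone, while the same code point with edition 6 falls through to the Unicode data.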

> - Implementations that don't support Unicode 6.3 yet, e.g., because they rely on Unicode information provided by the operating system, would not be able to comply with the spec.

Which implementations do that? The ones I’ve seen all use custom-generated Unicode data files. Is this really an issue?
