Identifying ECMAScript identifiers

Yusuke SUZUKI utatane.tea at gmail.com
Fri Mar 8 02:35:03 PST 2013


>
> These tools have to be able to recognize ECMAScript identifiers, taking
> the identifier specification and the underlying Unicode specification into
> consideration - not quite easy given the ever-growing Unicode character set.


Yeah. We, the Esprima developers, parse UnicodeData.txt to generate the
identifier identification functions.
I wrote a simple UnicodeData.txt parser and generated a RegExp from it [1].
These functions are also used in Acorn.
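
To illustrate the idea, here is a minimal sketch of such a generator. It is
not the actual Esprima script: the names collectRanges and toRegExp are
hypothetical, the sample lines are abbreviated to the first three
UnicodeData.txt fields (code point; name; general category), the input is
assumed to be sorted by code point as UnicodeData.txt is, and only BMP
characters are emitted. The category set is the ES5 UnicodeLetter set
(Lu, Ll, Lt, Lm, Lo, Nl).

```javascript
// General categories that ES5 allows as UnicodeLetter (identifier start).
var ID_START_CATEGORIES = { Lu: true, Ll: true, Lt: true, Lm: true, Lo: true, Nl: true };

// Turn UnicodeData.txt-style lines into merged [first, last] code point ranges.
// Assumes the lines are sorted by code point.
function collectRanges(lines, categories) {
  var ranges = [];
  var rangeStart = -1; // set while inside a "<..., First>" / "<..., Last>" pair
  lines.forEach(function (line) {
    var fields = line.split(';');
    if (fields.length < 3) { return; }
    var cp = parseInt(fields[0], 16);
    var name = fields[1];
    if (!categories.hasOwnProperty(fields[2])) { rangeStart = -1; return; }
    if (/, First>$/.test(name)) { rangeStart = cp; return; }
    var first = (/, Last>$/.test(name) && rangeStart >= 0) ? rangeStart : cp;
    rangeStart = -1;
    var prev = ranges[ranges.length - 1];
    if (prev && prev[1] + 1 === first) {
      prev[1] = cp;              // extend the previous range
    } else {
      ranges.push([first, cp]);  // start a new range
    }
  });
  return ranges;
}

// Format a BMP code point as a \uXXXX escape for use in RegExp source.
function hex4(cp) {
  return '\\u' + ('0000' + cp.toString(16).toUpperCase()).slice(-4);
}

// Build a character-class RegExp from the ranges (BMP ranges only).
function toRegExp(ranges) {
  var body = ranges.filter(function (r) {
    return r[1] <= 0xFFFF;
  }).map(function (r) {
    return r[0] === r[1] ? hex4(r[0]) : hex4(r[0]) + '-' + hex4(r[1]);
  }).join('');
  return new RegExp('[' + body + ']');
}

// Abbreviated sample input in the UnicodeData.txt field layout.
var sampleLines = [
  '0030;DIGIT ZERO;Nd',
  '0041;LATIN CAPITAL LETTER A;Lu',
  '0042;LATIN CAPITAL LETTER B;Lu',
  '005F;LOW LINE;Pc',
  '4E00;<CJK Ideograph, First>;Lo',
  '9FCC;<CJK Ideograph, Last>;Lo'
];

var identifierStartRegExp = toRegExp(collectRanges(sampleLines, ID_START_CATEGORIES));
```

With the full UnicodeData.txt as input, the same approach would produce one
large character class; "$" and "_" still have to be handled separately because
they are not in the letter categories.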

In Esprima and Acorn, for performance reasons, the identifier
identification functions take a code point as a number, not a string [2][3].
So I suggest accepting a code point number as the argument.
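
As a rough sketch of why a numeric argument suits a tokenizer: the scanner can
feed the result of charCodeAt straight into the predicate without creating a
one-character string per position. The predicates below are hypothetical
ASCII-only stand-ins for the real Unicode tables, and scanIdentifier is an
illustrative name, not a function from Esprima or Acorn. Note that charCodeAt
returns UTF-16 code units, so supplementary characters would need their
surrogate pairs combined first.

```javascript
// ASCII-only stand-in for a full identifier-start table (assumption).
function isIdentifierStart(cp) {
  return (cp >= 0x41 && cp <= 0x5A) ||  // A-Z
         (cp >= 0x61 && cp <= 0x7A) ||  // a-z
         cp === 0x24 || cp === 0x5F;    // $ and _
}

// ASCII-only stand-in for a full identifier-part table (assumption).
function isIdentifierPart(cp) {
  return isIdentifierStart(cp) || (cp >= 0x30 && cp <= 0x39); // 0-9
}

// Scan an identifier starting at index; return it, or null if none starts there.
function scanIdentifier(source, index) {
  if (index >= source.length || !isIdentifierStart(source.charCodeAt(index))) {
    return null;
  }
  var end = index + 1;
  while (end < source.length && isIdentifierPart(source.charCodeAt(end))) {
    end++;
  }
  return source.slice(index, end);
}
```

For example, scanIdentifier('foo1 = 2', 0) yields 'foo1', while
scanIdentifier('1abc', 0) yields null because a digit cannot start an
identifier.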

[1] https://code.google.com/p/esprima/issues/detail?id=110
[2] https://github.com/ariya/esprima/blob/master/esprima.js#L229
[3] https://github.com/marijnh/acorn/blob/master/acorn.js#L421


On Fri, Mar 8, 2013 at 6:42 PM, gaz Heyes <gazheyes at gmail.com> wrote:

> You forgot to include MentalJS. I can parse 120k identifiers in 5ms on
> Firefox on my crappy machine. My method is much faster than any of the
> parsers you listed and I handle Unicode escapes too.
> http://businessinfo.co.uk/labs/MentalJS/MentalJS.html
>
>
> On 8 March 2013 07:35, Norbert Lindenberg <
> ecmascript at lindenbergsoftware.com> wrote:
>
>> ECMAScript is used to implement a variety of tools that check code for
>> conformance with the ECMAScript specification, minimize it, perform other
>> transformations, or generate ECMAScript code. These tools have to be able
>> to recognize ECMAScript identifiers, taking the identifier specification
>> and the underlying Unicode specification into consideration - not quite
>> easy given the ever-growing Unicode character set.
>>
>> While looking at support for Unicode character properties in general, I
>> realized that this use case is shaped differently from others, fundamental
>> to ECMAScript, and amenable to a fairly simple solution, and so there's now
>> a strawman:
>> http://wiki.ecmascript.org/doku.php?id=strawman:identifier_identification
>>
>> I'd like to discuss this at next week's TC 39 meeting, but also invite
>> earlier comments.
>>
>> Thanks,
>> Norbert
>>
>> _______________________________________________
>> es-discuss mailing list
>> es-discuss at mozilla.org
>> https://mail.mozilla.org/listinfo/es-discuss
>>


-- 
Regards,
Yusuke Suzuki