ASC parsing bug?

Michael Daumling mdaeumli at
Wed Jun 18 15:32:13 PDT 2008

Regardless of how asc will work, how can we make sure that the test
suite files are compiled correctly? The affected e4x test just happens
to run correctly, but it does not actually test what it is supposed to.
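
One way to approach this, sketched below in Python, is to flag test sources whose bytes would be read differently depending on the default encoding: files that contain non-ASCII bytes but carry no BOM. The function names and the ".as" suffix are assumptions made for illustration; nothing here is part of asc or the existing test harness.

```python
import codecs
from pathlib import Path

# BOMs that make the encoding unambiguous (assumption: UTF-8/UTF-16 only).
_BOMS = (codecs.BOM_UTF8, codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def is_encoding_sensitive(data):
    """True if the bytes have no BOM and contain at least one non-ASCII byte."""
    if data.startswith(_BOMS):
        return False
    return any(byte >= 0x80 for byte in data)


def find_sensitive_sources(root, suffix=".as"):
    """List files under root whose meaning depends on the assumed encoding."""
    return sorted(
        path
        for path in Path(root).rglob("*" + suffix)
        if is_encoding_sensitive(path.read_bytes())
    )
```

Files reported by such a scan are the ones worth compiling with an explicit encoding, or rewriting with escapes, so the test result does not depend on the machine's locale.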


-----Original Message-----
From: tamarin-devel-bounces at
[mailto:tamarin-devel-bounces at] On Behalf Of Thomas Reilly
Sent: Tuesday, June 17, 2008 6:17 AM
To: Mike Shaver; Steven Johnson
Cc: tamarin-devel at
Subject: RE: ASC parsing bug?

I don't know, I kinda like the Java behavior of using the system
encoding.  The OSes typically know their locale; it seems silly not to use
it.   That's what will be the most convenient for folks and that's
probably how most other tools work, no?

Instead of making our own rules I think we should look at what our peers do:

javac (system encoding, overridable with -encoding)
python (assumes ASCII, overridable with # -*- coding: utf-8 -*- at top of script)
perl (assumes ASCII/system I think, with use utf8 pragma)

Okay, so if these 3 don't assume UTF-8 it's probably a bad idea for ASC to.

What we lack is a way to express in the code what the encoding is.
Does ES4 have any encoding pragmas?   Perl's "use utf8;" seems nice.
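
As a rough illustration of what an in-source declaration could look like, here is a Python sketch that looks for a Python-style "coding:" marker in the first two lines of a file. The pattern is adapted from the one Python uses (PEP 263); accepting "//" comments and the function name are assumptions made here, not anything ES4 or ASC defines.

```python
import re

# Adapted from the PEP 263 pattern. Accepting "//" as a comment opener is an
# assumption made for ActionScript-style sources.
_CODING_RE = re.compile(
    rb"^[ \t\f]*(?:#|//).*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)"
)


def declared_encoding(source_bytes):
    """Return the encoding named in the first two lines, or None if absent."""
    for line in source_bytes.splitlines()[:2]:
        match = _CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return None
```

The marker itself is plain ASCII, so it can be found before the rest of the file has been decoded.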

-----Original Message-----
From: tamarin-devel-bounces at on behalf of Mike Shaver
Sent: Mon 6/16/2008 3:40 PM
To: Steven Johnson
Cc: tamarin-devel at
Subject: Re: ASC parsing bug?
On Mon, Jun 16, 2008 at 6:32 PM, Steven Johnson <stejohns at> wrote:
> Having a tool like ASC try to guess the proper encoding sounds like a 
> recipe for long-term pain to me. (Hey, browser guys, how much fun is 
> it to guess the encoding of poorly-marked HTML? :-)

I'm going to be nice and pretend you didn't ask.

> IMHO, if the encoding isn't either (1) explicitly specified, or (2) 
> absolutely clear from a BOM, ASC should fail.

I think that is too harsh on the most common case: ASCII without BOM or
other adornments.  A default of UTF-8 seems pretty reasonable, and I
don't believe that UTF-8 requires a BOM since bytes are considered

If you want anything other than UTF-8, you should say so with an
explicit argument.
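
The precedence described here (explicit argument first, then a BOM, then a UTF-8 default) can be sketched in Python as follows. This is only an illustration of the policy, not how asc works; the function name and the `explicit_encoding` parameter are hypothetical, and only UTF-8 and UTF-16 BOMs are handled.

```python
import codecs

# BOMs recognised by this sketch (assumption: UTF-8 and UTF-16 only).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_source(data, explicit_encoding=None):
    """Decode source bytes: explicit argument, then BOM, then UTF-8."""
    if explicit_encoding is not None:
        # The caller said what the encoding is, so trust it.
        return data.decode(explicit_encoding)
    for bom, name in _BOMS:
        if data.startswith(bom):
            # Strip the BOM and decode with the encoding it announces.
            return data[len(bom):].decode(name)
    # No BOM: plain ASCII decodes unchanged; invalid UTF-8 raises an error.
    return data.decode("utf-8")
```

Because the final decode is strict, a Latin-1 file passed without an explicit encoding fails with `UnicodeDecodeError` instead of being silently misread, while unadorned ASCII still works.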

Tamarin-devel mailing list
Tamarin-devel at
