Close review of Language Overview whitepaper

Graydon Hoare graydon at mozilla.com
Wed Nov 14 16:22:47 PST 2007


Maciej Stachowiak wrote:
> Hello ES4 fans,
> 
> I have now read the recently posted whitepaper. I marked up my printed  
> copy with many comments in the margins, and I am sharing them with the  
> list now.

Excellent comments, thanks!

Some points in response, both places where I agree with you and places I 
might offer some explanation.

> Classes: If any of the new type system is worthwhile, surely this is.  
> The impedance mismatch between the class model used by most OO  
> languages and by specifications like the DOM, and ES3's prototype  
> model, is needlessly confusing to authors. So I approve of adding  
> classes in a reasonable and tasteful way.
> 
> Dynamic properties: the fact that the "dynamic" behavior is not  
> inherited makes class inheritance violate the Liskov Substitution  
> Principle. I think this is a problem. Subclassing should be subtyping  
> in the LSP sense. I am not sure offhand how to fix this.

LSP is not preserved by most forms of OO inheritance, alas; as soon as 
you have overriding of methods, it's gone.

> Virtual Properties: I wish the keyword for catchall getters and  
> setters was something other than "meta", which is a vague word that  
> doesn't mean much. Why not "catchall" or "fallback" or something along  
> similarly concrete lines? (I realize now upon re-reading my margin  
> comments that this is supposed to match meta invoke, but there too I  
> am not sure the relationship is worth the vagueness.)

Plausible. We reserved the namespace for "stuff that you would use a 
fully-formed MOP to fiddle with", but other names would work.

> Literals:
> - I am surprised to see a decimal type (a type that is not directly  
> supported in current mainstream hardware) even though generally  
> popular types like single-precision IEEE floating point and 64 bit  
> integers are not present.

This is based on bug-report / user-request frequency. ES is used by a 
lot of people who do not do numerical computation for a living, notably 
in casual yet meaningful financial circumstances. Most financial math in 
ES3 is presently wrong.

Whether or not it shows up in the hardware (IBM and Intel have both made 
varying degrees of commitment) is mostly irrelevant: the key is actually 
arriving at correct answers in contexts like tax law and shopping carts, 
where it really matters.
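
To make that concrete, the classic case -- exactly the sort of thing 
that gets reported as a bug against ES3 engines -- versus the proposed 
decimal literals (the 'm' suffix spelling is per the decimal proposal 
and may still change):

   // ES3 binary doubles: the "obvious" sum is not what users expect
   0.1 + 0.2             // 0.30000000000000004
   0.1 + 0.2 == 0.3      // false

   // with decimal values, money math comes out the way people expect
   0.1m + 0.2m == 0.3m   // true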

> - Since ints/uints overflow to doubles, then either all int math must  
> be performed in double space (requiring constant conversions when  
> working with int variables), or every operation must check for  
> overflow and possibly fall back to double space. Even when the final  
> result cannot overflow, certainly in many expressions the difference  
> between int and double intermediates can be observed. It seems likely,  
> then, that math on variables declared int will be slower than math on  
> variables declared double, which will surely be confusing to  
> developers. This seems pretty bogus. Is there any case where int math  
> using the normal operators can actually be efficient? Would it be  
> plausible to make ints *not* overflow to double unless there is an  
> actual double operand involved (in which case int constants would  
> always need a special suffix, or perhaps can somehow be determined  
> contextually).

Possible. It would require unadorned literals to default to double (with 
optimized-but-overflowing int as an invisible subcase, as in ES3), and 
users to request specific instances of int and uint types with 'i' and 
'u' suffixes. Fits with the treatment of decimal. Might be preferable to 
the uint32ops namespace.
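
Sketching it, nothing settled (the 'i' and 'u' suffixes here are the 
hypothetical ones under discussion, by analogy with decimal's 'm'):

   var n = 1;       // unadorned literal: double, with int as an
                    // invisible optimized subcase, as in ES3
   var i = 1i;      // explicitly int
   var u = 1u;      // explicitly uint

   1i + 2i          // stays in int space; no silent promotion
   1i + 2.5         // a double operand is present, so: double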

> Record and array types: Structural types are confusingly similar to  
> yet different from classes. Mostly they offer a subset of class  
> functionality (though reading ahead I did see a few features limited  
> to them). Also, already having prototype-based objects and class-based  
> objects it seems excessive to add yet a third way. I recommend  
> removing them and adding any features that are sorely missed as a  
> result to classes.

I concur that this step would greatly simplify the language; I have no 
clear feeling about whether the structural types are more or less 
crucial for interoperability.

The arrival at "structural types" as a general term followed from the 
consensus that we wanted to capture anonymous function and union types. 
Arrays and records *without* class-hierarchy membership seemed to polish 
off the needs of modeling existing ES3-style latent types, but perhaps 
merging them both back into classes and using a sufficiently weak "like" 
operator will do. We've been around this block many times.
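
For reference, the kinds of things the structural types were meant to 
capture look roughly like this in the current draft (syntax from 
memory, approximate):

   type Point  = { x: double, y: double };          // record type
   type Points = [Point];                           // array type
   type Metric = function (Point, Point): double;   // function type
   type MaybeP = (Point, null);                     // union type

   // "like" is the weaker, structural check mentioned above
   function render(p: like Point): void { print(p.x, p.y) }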

> "Any": The spec explains vaguely that the "any" type is not identical  
> to the union (null, undefined, Object). How is it different? Is the  
> difference observable to ES4 programs or is it purely a matter  
> internal to the spec (in which case the difference is not relevant)?

I do not think that it is observably different. I may be incorrect; best 
for the type theorists to respond.

> Nullability: Are non-nullable types really worth it? I am not sure.  
> Does any other explicit type system for a dynamic OO language have  
> such a concept? The whitepaper says that "the ability to store null is  
> occasionally the source of run-time errors" but will not dynamic- 
> checking result in runtime errors anyway when assigning null to a non- 
> nullable variable (except in strict mode)?

Well, I argued for this initially, so I guess I'll defend it :)

It is precisely this ability to "push the null pointer error earlier" 
that we seek. If you push it as far back as a function signature or 
interface barrier, it is easier to detect, comprehend and recover from. 
If you push it back to strict-check time, it is easier still. This is 
all well-demonstrated in other OO languages with static types: C# 
("structs"), nice, haxe, felix, scala ... it's not terribly new.

For dynamic languages that bother providing useful and fine-grained type 
predicates, well ... that's not a huge number of languages! But consider 
common lisp; it does what we do, as far as I know: there's an empty type 
that holds no values (our void type, their nil type) and there's a value 
that holds a unique sentinel (their nil value, our null value) which 
lives in its own unique type (their null type, our null type). As in our 
design, you union this with a class type to get a nullable class type.

(As far as I can tell -- not being a dylan hacker -- dylan doesn't even 
go as far as having a global sentinel type like nil.)
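
In ES4 spellings, the analogous cases are roughly the following; 
whether a bare class type admits null by default, and the exact 
annotation syntax, are precisely the details still in flux:

   class Account { var balance: double = 0.0 }

   var a: Account          = null;   // ok if the class type is nullable
   var b: Account!         = null;   // error: non-nullable annotation
   var c: (Account, null)  = null;   // explicit union with null,
                                     // i.e. lisp's (or null account)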

> package: Now that I have learned more about them, I think that  
> exposing packages and namespaces as separate user-level concepts is  
> confusing. Let's get this down to a single concept that developers  
> have to learn. Namespaces can just have a paired internal namespace  
> implicitly, I do not think it is helpful to give the public/internal  
> pair a special different name.

I too am a bit uneasy about the proliferation of concepts here (package, 
unit, and namespace). We've tried to fuse some of them in the past, but 
with little luck. Your suggestion is good. Let's flesh it out: make a 
bug. It has to support the notion of a public and private "half", one of 
which can be named from outside the syntactic form that defines the 
namespace, one of which can only be used from *inside* the syntactic 
form that defines it.
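
Purely as a strawman for that bug -- none of this is proposed syntax, 
and the "internal" spelling is invented here just to show the shape of 
the idea:

   namespace acme;             // the nameable, "public" half

   acme var published = 1;     // reachable as acme::published from
                               // outside the defining form
   internal var hidden = 2;    // the implicit paired half: only
                               // nameable inside the defining form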

> Program units:
> - Is there any need for the concept of "unit" to be exposed in the  
> syntax? Why not just allow "use unit" at top level, and implicitly  
> make each file (or in the browser context each inline script) a unit?

We've heard input that a server-side / ahead-of-time transformation of 
inlining nested units is desirable, in the sense that it makes for a 
single chunk of text you can send in a single HTTP response. The 
syntactic form was designed to permit this possibility: you can replace 
a use with its definition textually, without altering the meaning.
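
That is, roughly (unit syntax approximate):

   // before: the page references the unit by name and location
   use unit Shapes "http://example.org/shapes.es";

   // after server-side inlining: the definition replaces the use,
   // textually, without changing the meaning of the program
   unit Shapes {
       class Circle { var radius: double = 1.0 }
   }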

> - I think the difference between using units and importing packages is  
> going to be confusing to authors. Seriously, can anyone explain in one  
> sentence of 12 words or less how Joe Random Developers will decide  
> whether to use a namespace, import a package, or use a unit? Can we  
> get this down to only one kind of thing that needs to be mentioned in  
> the syntax? This would be a big win in reducing conceptual footprint.

True, though I suspect at least 2 of them need to remain. Namespaces are 
quite disjoint from loading and definition order. My eye is on packages.

> Type annotations and type checking: This section implies that type  
> annotations are not at all being added for performance reasons and may  
> indeed be harmful to performance. Wow! Seriously? I think runtime  
> assertions are interesting when debugging, but I would not want them  
> happening for every assignment statement in a release build of my C++  
> code. 

You may not be the sole voice of opinion on that matter :)

Types are the meeting point between two different -- and equally 
important! -- pressures felt throughout programming: correctness and 
speed. It is fair to discuss both; you're right that it is unfair to 
deny the motivation of one. Typed ES4 programs *may* run faster due to a 
variety of optimizations enabled by types. Whether those are performed 
is up to implementations. Specifically:

   - Dense allocation of the fixed portion of an object
   - Early-binding of fixture references to fixed properties
   - Inlining after early binding
   - Specialized selection of primitive operators
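
For instance, given a class whose properties are fixtures of known 
type, an implementation *may* (not must) lay instances out densely, 
early-bind the member references, and pick the double-specific 
operators directly:

   class Point {
       var x: double;
       var y: double;
       function magnitude(): double {
           // x and y are fixed, typed slots: no name lookup, no
           // generic-arithmetic dispatch required here
           return Math.sqrt(x * x + y * y);
       }
   }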

However, note that an implementation of ES4 clever enough to perform a 
large set of these optimizations may not be terribly different from an 
implementation of *ES3* clever enough to synthesize types optimistically 
on the fly and check its guesses against runtime facts, correcting any 
that rest on violated assumptions. These sorts of clever runtimes may 
actually notice no speed difference whatsoever.

What a clever runtime *cannot* do is synthesize the programmer's 
intentions wrt. type-based correctness conditions of their code. The 
programmer needs to say "this variable being an int is part of what it 
means to be correct". That is why there is a focus on the correctness 
motivation of types here, not the speed motivation.

> Pragmas: The "use decimal" pragma highlights how much complexity there  
> is to the decimal type. Seriously, is it worth it? Are the problems it  
> solves really that common?

Yes. Users have been reporting their displeasure with binary floating 
point arithmetic on the web since way back in the 20th century.

See also: http://www2.hursley.ibm.com/decimal/decifaq1.html#dbstats

> Generators: Do ordinary programmers really understand coroutine  
> control flow? Is this really a significantly better paradigm than  
> passing a visitor function? Not really convinced by this one yet.

The key purpose is to make for loops look like for loops. Agreed that it 
might be a heavy way of achieving it, but it divides the labor 
asymmetrically: the library author does the hard work, the users don't. 
With visitor functions, the users all have to get cozy with higher order 
functions.
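
Concretely, the division of labor looks something like this (JS1.7-ish 
syntax, approximate):

   // the library author does the hard part, once
   function fib() {
       let a = 0, b = 1;
       while (true) {
           yield a;
           let t = a + b; a = b; b = t;
       }
   }

   // the users just write a for loop
   for (let n in fib()) {
       if (n > 1000)
           break;
       print(n);
   }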

> "switch type" statement: I guess this beats switching on typeof, but  
> is it really significantly better than a series of "if" statements  
> using the "is" operator?

It is better from a static-type perspective, in an implementation that 
wants to optimize union type representations. Maybe this is not 
sufficiently compelling. I suggested it, and I concur that it's trimmable.
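
For comparison (switch type syntax approximate; handleInt and friends 
are just stand-ins):

   // with a chain of "is" tests
   if (v is int)         handleInt(v);
   else if (v is string) handleString(v);
   else                  handleOther(v);

   // with switch type: each arm gets a typed binding, and an
   // implementation that unpacks union representations can dispatch
   // on the tag directly
   switch type (v) {
       case (i: int)    { handleInt(i) }
       case (s: string) { handleString(s) }
       case (x: *)      { handleOther(x) }
   }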

> Expression closures: I actually find the examples hard to follow given  
> my expectation of ES3-like syntax. I think this may actually be  
> syntactic salt.

No opinion. I know how to write either way.
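
For what it's worth, the two spellings side by side:

   // ES3-style function expression
   var square = function (x) { return x * x; };

   // expression closure: the body is a single expression, the return
   // is implicit
   var square = function (x) x * x;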

> "type": Are runtime meta-objects representing types ruly necessary?  
> What are they good for?

Tooling, runtime code synthesis and analysis, runtime adaptation to code 
added at a later date, etc. Completeness of the dynamic nature of the 
language ... ask a smalltalk or lisp person :)

> Slicing: This one I mildly object to. Array/String slicing is not, to  
> my knowledge, particularly common in ECMAScript code of today. I am  
> dubious that it merits its own operator syntax.

No opinion. Similar to expression closures: willing to trim.
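
For the record, my recollection of the proposal is the Python-ish 
spelling, roughly sugar for slice (approximate):

   var a = [10, 20, 30, 40, 50];
   a[1:3]          // [20, 30], roughly a.slice(1, 3)
   "hello"[1:4]    // "ell"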

> Early binding, static type checking, and predictable behavior with  
> "intrinsic": Perhaps it should be highlighted more that this is a  
> potential significant performance improvement.

Agreed! If it took until this late in the document to understand that 
angle, it ought to move up.

> Reflection: This feature seems like it could be complex to implement  
> and potentially unnecessary for small implementations. I note that  
> J2ME omits reflection, which we can perhaps take as a sign that it is  
> not suitable for small implementations.

Not clear. The language has to keep quite a lot of machinery around for 
its dynamic aspects anyway; the idea is to give an interface through 
which the runtime can be asked to lazily manufacture reflective bits 
that cover only the things it already has to carry. If you see things 
that you think could be optimized away while still executing the rest of 
the language, by all means suggest cutting them.

> ControlInspector: I think an interface that's meant for debuggers and  
> similar tools, and not implementable in all interesting contexts, does  
> not need to be standardized. Better that than having an optional feature.

Not clear. Debuggers are one perspective, sure; but general dynamic 
scope is a useful (if sharp and dangerous) feature for advanced library 
authors (security contexts, continuations, dynamically-scoped 
resources), and they can't get it otherwise without interposing wrapper 
functions everywhere, which defeats tail calls.

> uint-specific operations: This is syntactically ugly. Why can't  
> integer math just always work this way? Also, why only uint versions?  
> Surely it is desirable to do efficient math on signed integers as  
> well. Also, bitops already happen in integer math space, thus type- 
> specific versions should not be necessary since no floating point  
> conversion will need to occur if both operands of ^ or & are  
> statically typed as int or uint.

As I said above, your ideas about how better to reform int/uint/double 
arithmetic and promotion are compelling. Can we move it to a bug and 
hash it out there? I don't want to lose it.

Again, thanks *so much* for giving it the fine-tooth comb treatment. 
This is valuable feedback.

-Graydon


