Close review of Language Overview whitepaper
Graydon Hoare
graydon at mozilla.com
Wed Nov 14 16:22:47 PST 2007
Maciej Stachowiak wrote:
> Hello ES4 fans,
>
> I have now read the recently posted whitepaper. I marked up my printed
> copy with many comments in the margins, and I am sharing them with the
> list now.
Excellent comments, thanks!
Some points in response, both places where I agree with you and places I
might offer some explanation.
> Classes: If any of the new type system is worthwhile, surely this is.
> The impedance mismatch between the class model used by most OO
> languages and by specifications like the DOM, and ES3's prototype
> model, is needlessly confusing to authors. So I approve of adding
> classes in a reasonable and tasteful way.
>
> Dynamic properties: the fact that the "dynamic" behavior is not
> inherited makes class inheritance violate the Liskov Substitution
> Principle. I think this is a problem. Subclassing should be subtyping
> in the LSP sense. I am not sure offhand how to fix this.
LSP is not preserved by most forms of OO inheritance, alas; as soon as
you have overriding of methods, it's gone.
> Virtual Properties: I wish the keyword for catchall getters and
> setters was something other than "meta", which is a vague word that
> doesn't mean much. Why not "catchall" or "fallback" or something along
> similarly concrete lines? (I realize now upon re-reading my margin
> comments that this is supposed to match meta invoke, but there too I
> am not sure the relationship is worth the vagueness.)
Plausible. We reserved the namespace for "stuff that you would use a
fully-formed MOP to fiddle with", but other names would work.
> Literals:
> - I am surprised to see a decimal type (a type that is not directly
> supported in current mainstream hardware) even though generally
> popular types like single-precision IEEE floating point and 64 bit
> integers are not present.
This is based on bug-report / user-request frequency. ES is used by a
lot of people who do not do numerical computation for a living, notably
in casual yet meaningful financial circumstances. Most financial math in
ES3 is presently wrong.
Whether or not it shows up in the hardware (IBM and Intel have both made
varying degrees of commitment) is mostly irrelevant: the key is actually
arriving at correct answers in contexts like tax law and shopping carts,
where it really matters.
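To make the shopping-cart point concrete, here is a minimal sketch in TypeScript (not ES4 syntax). It shows the drift that binary doubles introduce and one common workaround, integer cents. The names `priceEach`, `centsEach` and `formatCents` are hypothetical and chosen only for illustration; this is not how the proposed decimal type works.

```typescript
// Binary doubles cannot represent 0.1 exactly, so repeated addition drifts.
const priceEach = 0.1;
const naiveTotal = priceEach + priceEach + priceEach;
console.log(naiveTotal);         // 0.30000000000000004
console.log(naiveTotal === 0.3); // false

// A workaround available without a decimal type: keep money in integer cents.
const centsEach = 10;
const totalCents = centsEach * 3; // exact, because small integers are exact in doubles

// Hypothetical helper: render a non-negative cent count as "units.cc".
function formatCents(cents: number): string {
  const whole = Math.trunc(cents / 100);
  const frac = cents % 100;
  return whole + "." + (frac < 10 ? "0" : "") + frac;
}

console.log(formatCents(totalCents)); // "0.30"
```

The workaround only covers addition and whole-cent multiplication; rates, division and rounding rules still have to be handled by hand, which is where a real decimal type would help.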
> - Since ints/uints overflow to doubles, then either all int math must
> be performed in double space (requiring constant conversions when
> working with int variables), or every operation must check for
> overflow and possibly fall back to double space. Even when the final
> result cannot overflow, certainly in many expressions the difference
> between int and double intermediates can be observed. It seems likely,
> then, that math on variables declared int will be slower than math on
> variables declared double, which will surely be confusing to
> developers. This seems pretty bogus. Is there any case where int math
> using the normal operators can actually be efficient? Would it be
> plausible to make ints *not* overflow to double unless there is an
> actual double operand involved (in which case int constants would
> always need a special suffix, or perhaps can somehow be determined
> contextually).
Possible. It would require unadorned literals to default to double (with
optimized-but-overflowing int as an invisible subcase, as in ES3), and
users to request specific instances of int and uint types with 'i' and
'u' suffixes. Fits with the treatment of decimal. Might be preferable to
the uint32ops namespace.
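The two behaviors under discussion can be sketched in TypeScript (again, not ES4 syntax). `addPromoting` and `addWrapping` are hypothetical names: the first mimics "overflow to double", the second mimics a fixed-width int that wraps, using the existing `| 0` idiom to force a 32-bit signed result.

```typescript
const INT32_MAX = 2147483647;

// "Overflow to double": the result simply leaves the int32 range.
function addPromoting(a: number, b: number): number {
  return a + b;
}

// Wrapping int32 arithmetic: `| 0` applies ToInt32 to the result,
// so values past the range wrap around modulo 2^32.
function addWrapping(a: number, b: number): number {
  return (a + b) | 0;
}

console.log(addPromoting(INT32_MAX, 1)); // 2147483648
console.log(addWrapping(INT32_MAX, 1));  // -2147483648
console.log(addWrapping(2, 3));          // 5
```

The difference is observable in intermediates as well as final results, which is the source of the per-operation overflow check the quoted comment worries about.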
> Record and array types: Structural types are confusingly similar to
> yet different from classes. Mostly they offer a subset of class
> functionality (though reading ahead I did see a few features limited
> to them). Also, already having prototype-based objects and class-based
> objects it seems excessive to add yet a third way. I recommend
> removing them and adding any features that are sorely missed as a
> result to classes.
I concur that this step would greatly simplify the language; I have no
clear feeling about whether the structural types are more or less
crucial for interoperability.
The arrival at "structural types" as a general term followed from the
consensus that we wanted to capture anonymous function and union types.
Arrays and records *without* class-hierarchy membership seemed to polish
off the needs of modeling existing ES3 style latent types, but perhaps
merging them both back into classes and using a sufficiently weak "like"
operator will do. We've been around this block many times.
> "Any": The spec explains vaguely that the "any" type is not identical
> to the union (null, undefined, Object). How is it different? Is the
> difference observable to ES4 programs or is it purely a matter
> internal to the spec (in which case the difference is not relevant)?
I do not think that it is observably different. I may be incorrect; best
for the type theorists to respond.
> Nullability: Are non-nullable types really worth it? I am not sure.
> Does any other explicit type system for a dynamic OO language have
> such a concept? The whitepaper says that "the ability to store null is
> occasionally the source of run-time errors" but will not dynamic-
> checking result in runtime errors anyway when assigning null to a non-
> nullable variable (except in strict mode)?
Well, I argued for this initially, so I guess I'll defend it :)
It is precisely this ability to "push the null pointer error earlier"
that we seek. If you push it as far back as a function signature or
interface barrier, it is easier to detect, comprehend and recover from.
If you push it back to strict-check time, it is easier still. This is
all well-demonstrated in other OO languages with static types: C#
("structs"), nice, haxe, felix, scala ... it's not terribly new.
For dynamic languages that bother providing useful and fine-grained type
predicates, well ... that's not a huge number of languages! But consider
common lisp, it does what we do, as far as I know: there's an empty type
that holds no values (our void type, their nil type) and there's a value
that holds a unique sentinel (their nil value, our null value) which
lives in its own unique type (their null type, our null type). Like what
we're doing, you need to union this with a class type to get a nullable
class type.
(As far as I can tell -- not being a dylan hacker -- dylan doesn't even
go as far as having a global sentinel type like nil)
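As an illustration of "pushing the null error earlier", here is a small TypeScript sketch. TypeScript is used only as a stand-in: a nullable class type is written as a union with `null`, much as described above. `Account`, `requireNonNull`, `greet` and `greetChecked` are hypothetical names, and the sketch makes no claim about how ES4 implements the check.

```typescript
class Account {
  owner: string;
  constructor(owner: string) {
    this.owner = owner;
  }
}

// Boundary check: fails immediately, naming the offending argument.
function requireNonNull<T>(value: T | null, what: string): T {
  if (value === null) {
    throw new TypeError(what + " must not be null");
  }
  return value;
}

// Non-nullable parameter: the body never has to consider null.
function greet(account: Account): string {
  return "Hello, " + account.owner;
}

// Nullable parameter: null is rejected at the function boundary,
// not at some later property access deep inside the call.
function greetChecked(account: Account | null): string {
  return greet(requireNonNull(account, "account"));
}

console.log(greetChecked(new Account("ada"))); // Hello, ada
```

With a static checker in strict mode, passing a possibly-null value to `greet` directly is rejected before the program runs, which is the "strict-check time" case.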
> package: Now that I have learned more about them, I think that
> exposing packages and namespaces as separate user-level concepts is
> confusing. Let's get this down to a single concept that developers
> have to learn. Namespaces can just have a paired internal namespace
> implicitly, I do not think it is helpful to give the public/internal
> pair a special different name.
I too am a bit uneasy about the proliferation of concepts here (package,
unit, and namespace). We've tried to fuse some of them in the past, but
with little luck. Your suggestion is good. Let's flesh it out: make a
bug. It has to support the notion of a public and private "half", one of
which can be named from outside the syntactic form that defines the
namespace, one of which can only be used from *inside* the syntactic
form that defines it.
> Program units:
> - Is there any need for the concept of "unit" to be exposed in the
> syntax? Why not just allow "use unit" at top level, and implicitly
> make each file (or in the browser context each inline script) a unit?
We've heard input that a server-side / ahead-of-time transformation of
inlining nested units is desirable, in the sense that it makes for a
single chunk of text you can send in a single HTTP response. The
syntactic form was designed to permit this possibility: you can replace
a use with its definition textually, without altering the meaning.
> - I think the difference between using units and importing packages is
> going to be confusing to authors. Seriously, can anyone explain in one
> sentence of 12 words or less how Joe Random Developers will decide
> whether to use a namespace, import a package, or use a unit? Can we
> get this down to only one kind of thing that needs to be mentioned in
> the syntax? This would be a big win in reducing conceptual footprint.
True, though I suspect at least 2 of them need to remain. Namespaces are
quite disjoint from loading and definition order. My eye is on packages.
> Type annotations and type checking: This section implies that type
> annotations are not at all being added for performance reasons and may
> indeed be harmful to performance. Wow! Seriously? I think runtime
> assertions are interesting when debugging but I would not want them
> happening for every assignment statement in a release build of my C++
> code.
You may not be the sole voice of opinion on that matter :)
Types are the meeting point between two different -- and equally
important! -- pressures felt throughout programming: correctness and
speed. It is fair to discuss both; you're right that it is unfair to
deny the motivation of one. Typed ES4 programs *may* run faster due to a
variety of optimizations enabled by types. Whether those are performed
is up to implementations. Specifically:
- Dense allocation of the fixed portion of an object
- Early-binding of fixture references to fixed properties
- Inlining after early binding
- Specialized selection of primitive operators
However, note that an implementation of ES4 clever enough to perform a
large set of these optimizations may not be terribly different from an
implementation of *ES3* clever enough to synthesize types optimistically
on the fly and check its guesses against runtime facts, correcting any
that rest on violated assumptions. These sorts of clever runtimes may
actually notice no speed difference whatsoever.
What a clever runtime *cannot* do is synthesize the programmer's
intentions wrt. type-based correctness conditions of their code. The
programmer needs to say "this variable being an int is part of what it
means to be correct". That is why there is a focus on the correctness
motivation of types here, not the speed motivation.
> Pragmas: The "use decimal" pragma highlights how much complexity there
> is to the decimal type. Seriously, is it worth it? Are the problems it
> solves really that common?
Yes. Users have been reporting their displeasure with binary floating
point arithmetic on the web since way back in the 20th century.
See also: http://www2.hursley.ibm.com/decimal/decifaq1.html#dbstats
> Generators: Do ordinary programmers really understand coroutine
> control flow? Is this really a significantly better paradigm than
> passing a visitor function? Not really convinced in this one yet.
The key purpose is to make for loops look like for loops. Agreed that it
might be a heavy way of achieving it, but it divides the labor
asymmetrically: the library author does the hard work, the users don't.
With visitor functions, the users all have to get cozy with higher order
functions.
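The asymmetry can be sketched in TypeScript, whose generator syntax is close to what is being proposed. `TreeNode`, `visitInOrder` and `inOrder` are hypothetical names for illustration. The library author writes the traversal once in either style; only the generator version lets the caller use an ordinary loop.

```typescript
interface TreeNode {
  value: number;
  left: TreeNode | null;
  right: TreeNode | null;
}

// Visitor style: the caller must supply a function and cannot `break` out.
function visitInOrder(node: TreeNode | null, visit: (v: number) => void): void {
  if (node === null) return;
  visitInOrder(node.left, visit);
  visit(node.value);
  visitInOrder(node.right, visit);
}

// Generator style: the library author handles the suspended control flow.
function* inOrder(node: TreeNode | null): Generator<number> {
  if (node === null) return;
  yield* inOrder(node.left);
  yield node.value;
  yield* inOrder(node.right);
}

const tree: TreeNode = {
  value: 2,
  left: { value: 1, left: null, right: null },
  right: { value: 3, left: null, right: null },
};

const viaVisitor: number[] = [];
visitInOrder(tree, (v) => { viaVisitor.push(v); });

const viaLoop: number[] = [];
for (const v of inOrder(tree)) {
  viaLoop.push(v); // a plain for loop; `break` and `continue` work as usual
}

console.log(viaVisitor.join(","), viaLoop.join(",")); // 1,2,3 1,2,3
```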
> "switch type" statement: I guess this beats switching on typeof, but
> is it really significantly better than a series of "if" statements
> using the "is" operator?
It is better from a static-type perspective, in an implementation that
wants to optimize union type representations. Maybe this is not
sufficiently compelling. I suggested it, and I concur that it's trimmable.
> Expression closures: I actually find the examples hard to follow given
> my expectation of ES3-like syntax. I think this may actually be
> syntactic salt.
No opinion. I know how to write either way.
> "type": Are runtime meta-objects representing types truly necessary?
> What are they good for?
Tooling, runtime code synthesis and analysis, runtime adaptation to code
added at a later date, etc. Completeness of the dynamic nature of the
language ... ask a smalltalk or lisp person :)
> Slicing: This one I mildly object to. Array/String slicing is not, to
> my knowledge, particularly common in ECMAScript code of today. I am
> dubious that it merits its own operator syntax.
No opinion. Similar to expression closures: willing to trim.
> Early binding, static type checking, and predictable behavior with
> "intrinsic": Perhaps it should be highlighted more that this is a
> potential significant performance improvement.
Agreed! If it took until this late in the document to understand that
angle, it ought to move up.
> Reflection: This feature seems like it could be complex to implement
> and potentially unnecessary for small implementations. I note that
> J2ME omits reflection, which we can perhaps take as a sign that it is
> not suitable for small implementations.
Not clear. The language has to keep quite a lot of machinery around for
its dynamic aspects anyway; the idea is to provide an interface through
which the runtime can be asked to lazily manufacture reflective bits
that cover only the things it already has to carry. If you see things
that you think could be optimized away while still executing the rest of
the language, by all means suggest cutting.
> ControlInspector: I think an interface that's meant for debuggers and
> similar tools, and not implementable in all interesting contexts, does
> not need to be standardized. Better than having an optional feature.
Not clear. Debuggers are one perspective, sure; but general dynamic
scope is a useful (if sharp and dangerous) feature for advanced library
authors (security contexts, continuations, dynamically-scoped
resources), and they can't get it otherwise without interposing wrapper
functions everywhere, and that defeats tail calls.
> uint-specific operations: This is syntactically ugly. Why can't
> integer math just always work this way? Also, why only uint versions?
> Surely it is desirable to do efficient math on signed integers as
> well. Also, bitops already happen in integer math space, thus type-
> specific versions should not be necessary since no floating point
> conversion will need to occur if both operands of ^ or & are
> statically typed as int or uint.
As said above, your ideas about how better to reform int/uint/double
arithmetic and promotion are compelling. Can we move them to a bug and
hash them out there? I don't want to lose them.
Again, thanks *so much* for giving it the fine-tooth comb treatment.
This is valuable feedback.
-Graydon