Surprising semantics
Ingvar von Schoultz
ingvar-v-s at comhem.se
Mon Jul 21 01:44:47 PDT 2008
Waldemar Horwat wrote:
> The problem is that you then get a plethora of ways
> to define things:
> [...]
> Furthermore, some of them don't make sense (such as
> "function" without "let") because they can conditionally
> capture variables that may not even exist.
Despite a great deal of searching in the discussion archives I
can't find a single description that convinces me that there is in
fact a problem. My analyses find simple solutions for all the
situations I can think of, including nonexistent let variables.
But maybe I just haven't understood the problem. Could you give
me a link to a description?
In my analyses, all you need to do is specify what |function|
without |let| is supposed to mean, in a way that is well suited
to how ECMAScript declarations behave.
You get useful, helpful semantics, and proper throwing of errors
on incorrect access, with a simple implementation, if you think
about how you would manually make the function name accessible in
the outer scope, and then let the compiler make that very
arrangement. The result is useful and intuitive and avoids peculiar
irregularities.
Of course it works only if it can be handled with rules that are
simple enough that the compiler can deal with all cases. I'll take
this in steps to show that the rules become simple enough.
Consider the semantics of a standard combined function
declaration and definition:
    Fn();
    function Fn() {return 1}
As with any declaration in ECMAScript, the declaration of the name
Fn takes effect before you enter the scope.
This name gets the special treatment that is afforded to functions:
It is assigned its value before you enter the scope. This value is
the function object. So the call to Fn is successful.
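A minimal sketch of this behavior, runnable as ordinary ECMAScript. The name Early is only an illustration and is not part of the example above.

```javascript
// The call comes first in source order, but the declaration of Fn
// has already been processed when the scope is entered.
var Early = Fn();

function Fn() { return 1; }

console.log(Early);
```

The call succeeds because Fn already holds the function object when the first statement runs.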
Consider the difference when it's conditional:
    Fn();
    if (Unknown)
        function Fn() {return 1}
    else
        function Fn() {return 2}
Again, as with any declaration, the declaration of the name takes
effect before you enter the scope.
However, in this case it can't be assigned any value before you
enter the scope, since there isn't any known value. Fn exists but
is unassigned; it has the special value |undefined|. The call to
Fn() throws an error as an attempt to call undefined().
The above is equivalent to the following, which is how you would
do the same thing manually if you wanted the same result including
the bug:
    var Fn; // Automatically assigned the value |undefined|.
    Fn();
    if (Unknown)
        Fn = function Fn() {return 1}
    else
        Fn = function Fn() {return 2}
So where the programmer wrote function declarations the compiler
arranges assignment in cases like these.
Even though this changes the behavior of function(), in that the
assignment comes later than usual, this is not a case of hidden
surprising semantics. The programmer did specify that the function
Fn depends on if(Unknown). This obeys what the programmer said.
It would be wrong to decide upon one of the two functions and assign
that before scope entry, as it would violate the requirement that
Fn depend on if(Unknown). What's more, even if the condition is
known at compilation time, the programmer is using a construct
that is intrinsically sequential. So regardless of what is known,
the most exact interpretation is still to maintain the sequential
nature. This way you get simple, consistent semantics.
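The manual arrangement can be tried out as a small sketch. The wrapper Demo, its parameter, and the names EarlyError and Late are assumptions added for illustration. The early call is wrapped in try/catch only so that the sketch can continue past the error.

```javascript
function Demo(Unknown) {
  var Fn;                      // Hoisted name, value |undefined|.
  var EarlyError = null;
  try {
    Fn();                      // Attempt to call undefined().
  } catch (Err) {
    EarlyError = Err;
  }
  // The assignment happens in sequence, where the programmer wrote it.
  if (Unknown)
    Fn = function Fn() { return 1; };
  else
    Fn = function Fn() { return 2; };
  return { EarlyError: EarlyError, Late: Fn() };
}

var Result = Demo(true);
console.log(Result.EarlyError.name, Result.Late);
```

The early call fails, and the later call returns whichever function the condition selected.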
Let's move the function call into a block and have Fn hoist out
of that:
    print (Fn);
    if (Unknown)
    {   Fn();
        function Fn() {return 1} // Hoisted to global scope.
    }
The compiler should do this:
    var Fn; // Hoisted name, assigned the value |undefined|.
    print (Fn);
    if (Unknown)
    {   Fn = HiddenName; // Early assignment at beginning of block.
        Fn();
        function HiddenName() {return 1}
    }
(Except the function knows itself as Fn rather than HiddenName.)
Here the name Fn cannot have a value at the beginning of the global
scope, but it can have a value at the beginning of the block where
it's declared-and-defined.
The special treatment of functions, where the compiler moves the
assignment to the beginning of the block, should happen when the
compiler can determine that this is correct, using simple rules,
rules that are simple not only for the compiler but also for the
programmer. In any situation where the compiler can't easily
determine this, it assigns |undefined| at block entry, and then
assigns the function object at the spot where the function is defined.
There may be two block-entry points to consider, as above.
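A sketch of the compiler's rewrite, written so that it can be run. It assumes an engine where a function declared inside a block is initialized at entry to that block, as ES2015 and later specify. Demo, its parameter, and the names Outside, Inside and After are hypothetical additions. One difference from the proposal: here the function really is named HiddenName, whereas in the proposal it would know itself as Fn.

```javascript
function Demo(Unknown) {
  var Fn;                        // Hoisted name, value |undefined|.
  var Outside = typeof Fn;       // What print (Fn) would observe.
  var Inside = null;
  if (Unknown) {
    Fn = HiddenName;             // Early assignment at beginning of block.
    Inside = Fn();               // Succeeds: Fn already has its value here.
    function HiddenName() { return 1; }
  }
  return { Outside: Outside, Inside: Inside, After: typeof Fn };
}

var Result = Demo(true);
console.log(Result.Outside, Result.Inside, Result.After);
```

When Unknown is false the block is never entered, so Fn stays |undefined| throughout.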
Let's hoist with a nonexistent let variable:
    Fn();
    if (Unknown)
    {   let LetVal = 3; // Nonexistent when Fn() is called.
        if (Maybe)
        {   function Fn() // Hoisting to Outer.
            {   return LetVal; // Return the let variable.
            }
        }
    }
This is just like the other cases. At the beginning of the global
scope the name Fn exists and has the value |undefined|. Any call
throws an error. Only deep down among the nested blocks is Fn
assigned the function object. This happens at the spot where
the programmer wrote the declaration-and-definition (or, more
precisely, shortly before entry into that block). Like this:
    var Fn; // Automatically assigned the value |undefined|.
    Fn();
    if (Unknown)
    {   let LetVal = 3;
        if (Maybe)
        {   Fn = function Fn()
            {   return LetVal;
            }
        }
    }
As far as I can see, any attempt to call Fn at any point in time
when LetVal doesn't exist will fail as a call to undefined(). And
at any point in time where the value of Fn is a function object,
Fn will have access to an existing LetVal.
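This claim can be checked with a small sketch of the manual form. Demo, its two parameters, and the names EarlyError and Late are assumptions added for illustration, and the try/catch is there only to record the early failure.

```javascript
function Demo(Unknown, Maybe) {
  var Fn;                        // Hoisted name, value |undefined|.
  var EarlyError = null;
  try {
    Fn();                        // LetVal does not exist yet.
  } catch (Err) {
    EarlyError = Err;
  }
  if (Unknown) {
    let LetVal = 3;
    if (Maybe) {
      Fn = function Fn() { return LetVal; };
    }
  }
  // LetVal is out of scope here, but a Fn that was assigned
  // still reaches it through its closure.
  return {
    EarlyError: EarlyError,
    Late: typeof Fn === "function" ? Fn() : undefined
  };
}

var Result = Demo(true, true);
console.log(Result.EarlyError.name, Result.Late);
```

In every combination of the two conditions, Fn is either |undefined| or a function whose LetVal exists.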
I don't know if I've missed something. Maybe the problem is in a
different program structure. I'd be very interested to know.
> for (i = 0; i < foo.length; i++) {
>     const v = foo[i];
> }
>
> You'll catch everyone off-guard if you make folks do a
> let const instead of a const here.
What does this mean? (English isn't my first language.) Are you
saying that everyone will forget to type let?
On the contrary, that can only happen to those who are not used
to JavaScript. And of course also to those who feel so uncomfortable
with the scoping rules of JavaScript that they have to put all
their var declarations out in the function block. This of course
perpetuates their difficulties.
But for us real javascripters, who fully embrace and enjoy its
scoping, and put our var declarations anywhere and everywhere,
the above is very, very obviously a repeated assignment to
one and the same v. That const before it leaps out at you as
a contradiction. Really, there's no way at all that you can
miss it.
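A small sketch of why, with var, the loop body reads as repeated assignment to a single v. The array contents and the names Readers and Seen are made up for illustration.

```javascript
var foo = ["a", "b", "c"];
var Readers = [];

for (var i = 0; i < foo.length; i++) {
  var v = foo[i];                          // One v for the whole enclosing scope.
  Readers.push(function () { return v; });
}

// Every closure reads the same v, which now holds the last element.
var Seen = Readers.map(function (Reader) { return Reader(); });
console.log(Seen.join(","));
```

All three closures report the last element, because the braces of the loop body create no new v.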
Your solution will only trip up us real javascripters. It will
cater only to those who can't or won't embrace the language for
real. But their problem is temporary. Catering to them by creating
complicated oddities makes the language permanently complicated.
For them it's just a simple detail to learn, the meaning of the
JavaScript brace. A single really simple detail, versus a complicated
set of oddities and hidden semantics.
There may be other reasons to infest the language with complications,
but the short-lived minor effort for some people to learn one detail
is not one of them.
Ingvar
Waldemar Horwat wrote:
> We've been down this road before, and the arguments you present have been hashed out over years. This approach doesn't work. Read the archives of the ES4 group.
>
> The problem is that you then get a plethora of ways to define things:
>
> var
> const
> function
> type
> namespace
> let
> let const
> let function
> let type
> let namespace
>
> Furthermore, some of them don't make sense (such as "function" without "let") because they can conditionally capture variables that may not even exist.
>
> The example you give of conditional definitions:
>
> if (foo) {
>     const c = 37;
> } else {
>     const c = "abc";
> }
> ... do something with c ...
>
> is particularly disruptive. You must then support conditional holes:
>
> // outer scope
> function c() ...;
>
> // inner scope
> {
>     if (foo) {
>         const c = 37;
>     }
>     ... c can be either 37 or the outer scope function here ...
> }
>
>
> It gets worse:
>
> // outer scope
> function c() ...;
>
> // inner scope
> {
>     function f() {
>         return c;
>     }
>     a = f();
>     if (foo) {
>         const c = 37;
>     }
>     b = f();
>     ... just what do a and b hold here? Was f's captured variable rebound by the if statement? ...
> }
>
>
> Also consider:
>
> for (i = 0; i < foo.length; i++) {
>     const v = foo[i];
> }
>
> You'll catch everyone off-guard if you make folks do a let const instead of a const here.
>
>
> In ES4 it gets worse still because c can have a type:
>
> type c = ...
> {
>     if (foo) {
>         const c:Number = 37;
>     } else if (bar) {
>         var c:String = "abc";
>     }
> }
> ... do something with c, which is either a type, a constant, or a variable, and can be statically typed as either a Number or a String ...
>
> const d:c = ... // Conditional definition requires variable types to be evaluated at run-time, which is not somewhere we want to go in the first version
>
> I don't know of anyone here who wants to support something like that.
>
> Waldemar
>
>
> Ingvar von Schoultz wrote:
>> These are some impressions looking at what I expect from the
>> language, and how some things in the specification can cause
>> confusion.
>>
>> I would have contributed here during the discussions, but I
>> discovered the mailing lists just a couple of days ago.
>>
>> I expect the compiler's interpretation of program-code text
>> to be close to my intuitive understanding of what the text
>> says. It's very unfortunate if keywords have unexpected
>> meanings that cause mysterious side effects.
>>
>> If I learn that ECMAScript will let me change my var(iables)
>> into const(ants) I expect this to turn them into constants, in
>> the sense that trying to change their value will be considered
>> an error. It's very disappointing that by default they are
>> instead defined to have the baffling and mysterious behavior
>> of silently ignoring an attempt to change them, acting as if
>> no error had occurred.
>>
>> You'll have to keep this oddity in mind at all times, and even
>> then errors related to this will sometimes cause symptoms to
>> appear far from where the error is, costing quite some time
>> to explore. Why doesn't my program change its behavior even
>> though I'm provoking changes? Where in this big program's
>> complicated sequence of events is the change silently, secretly
>> lost?
>>
>> If instead you use var, at least the problems that can come
>> from this will tend to give symptoms closely connected to the
>> incorrect change in the value.
>>
>> So this is a disappointing red flag: Don't use const, it is
>> likely to cause baffling problems and unlikely to help.
>>
>> Unfortunately there's another problem with const that is much
>> more important. I often use constants for conditional settings:
>>
>> if (Debugging)
>> {   var DatabaseName = "TestDatabase";
>>     var DisplayCount = 5;
>> }
>> else
>> {   var DatabaseName = "RealDatabase";
>>     var DisplayCount = 15;
>> }
>>
>> The redundant "var"s are a defensive habit, omitting them would
>> be a warning about accesses outside the current scope.
>>
>> If I haven't been warned, and hear that ECMAScript understands
>> "const", I expect that replacing "var" with "const" will change
>> the above from variables into constants. The keyword in no way
>> suggests that it will hide them from view. If they disappear
>> I'll inevitably consider such a completely unrelated side effect
>> a compiler bug.
>>
>> Because of this I'm unhappy about the conclusions of ES3.1 that
>> the visibility scope of "const" should be the enclosing brace-
>> delimited block. Such intricate semantics hidden in words that
>> express something completely unrelated will make the language
>> seem difficult and fraught with hidden surprises.
>>
>> I much prefer what ES4 says in various places on the website:
>> that you express this localness with "let const" and "let function".
>> One block-scope keyword for all block-scope visibility. Consistency
>> and clarity.
>>
>> However this brings me to the unfortunate word "let". Although
>> this word has a precise and clear technical meaning for the
>> initiate, for us in the unwashed masses I can't see that the
>> English word "let" gives even the remotest suggestion of local
>> containment. In fact it suggests very clearly that it's related
>> to the "=" that so often follows:
>>
>> if (x == 5)
>> {   let y = 3;
>> }
>>
>> "If x is 5, then let y equal 3." There's an almost inescapably
>> strong suggestion that "let" is a phrasing of the assignment
>> expression, and therefore can't have anything to do with the
>> braces.
>>
>> I think ECMAScript should be easily accessible to us in the
>> unwashed masses. It becomes much more intuitively accessible
>> if it uses a word that strongly implies localness:
>>
>> {   if (x == 5)
>>     {   local y = 3;
>>     }
>>     local const Debugging = false;
>>     for (local Key in List)
>>         ++List [Key];
>> }
>>
>> You get plain English sentences that express quite accurately
>> what they're supposed to mean. The programmer won't be the
>> least surprised if a value gets hidden by "local".
>>
>> When people want to write let expressions, if they have to
>> write "local" instead of "let" I don't think this will cause
>> problems. I'm sure the initiate are sophisticated enough that
>> they can adapt to this.
>>
>> Apart from this, I think the scoping arrangements would
>> become significantly simpler and clearer if the language
>> made a very clear, really visible, intuitively accessible
>> distinction between two different types of block, and allowed
>> you to choose either type of block wherever this made sense.
>>
>> My suggestion is to introduce a clearly distinct new and
>> better block. This block should be delimited by {{ and }}
>> if it's at all possible, and I think it is. No keyword,
>> just {{ and }}. This better block would bind vars, consts
>> and functions, just like function scopes do. In fact function
>> scopes and {{ }} would be the same thing, as seen by the
>> programmer.
>>
>> An important advantage with {{ }} is that you can keep
>> everything contained without tedious and error-prone
>> repetition of local (or let) everywhere. And the scoping
>> is prominently visible and clearly structured.
>>
>> It may seem odd that I say that adding yet another scoping
>> construct would make it simpler and easier to learn, but
>> if it's built this way it becomes conceptually clear, free
>> of hidden intricacies, easy to explain. The delimiters
>> {{ and }} suggest that you are walling things in with thicker
>> walls, so that hoisting can't get past. Nicely intuitive.
>> Especially for programmers who put braces at left it becomes
>> very clear indeed. And I'm sure syntax-coloring editors will
>> help making the scoping clear at a glance.
>>
>> The terminology might distinguish between the two types of
>> block by talking about strong and weak blocks, where strong
>> means thick walls that you can't hoist out of, and weak means
>> that the block can only capture things that are marked local
>> (or let), and everything else gets hoisted out.
>>
>> In fact I think a terminology with strong versus weak blocks
>> is clearer than the current terminology, where one type of
>> block is called block and the other is called variable object.
>>
>> I'm sorry if this comes across as a series of complaints. All
>> in all I'm delighted with the many enticing improvements! But
>> listing all the nice things here wouldn't make for interesting
>> reading. And so it may sound much more negative than my overall
>> delighted and enthusiastic feelings.
>>
>
> _______________________________________________
> Es3.x-discuss mailing list
> Es3.x-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es3.x-discuss
>
--
Ingvar von Schoultz
------- (My quirky use of capitals in code comes from my opinion that
reserved and predefined words should all start with lowercase, and
user-defined should all start with uppercase, because this will easily
and elegantly prevent a host of name-collision problems when things
like programming languages are upgraded with new labels.)