[rust-dev] First thoughts on Rust

Masklinn masklinn at masklinn.net
Sun Jan 22 03:11:41 PST 2012


I guess this would be the least-bad mailing list for it
considering the other one is -commits.

With Rust 0.1, I finally decided to actually test Rust (kudos for
a perfect compilation under OS X, by the way, no issue whatsoever
to report), and have thus run into questions, dislikes, and
issues, which I am reporting here. Warning: I only spent between
half an hour and an hour with Rust.

I'm aware some of them might have been discussed to death
beforehand and may have rationales; a simple link to these
discussions/rationales would of course be sufficient.


Questions:
----------

* Are there plans for a CLI? Maybe not a full-blown REPL, but a
  basic execution console which would let users try out snippets
  without having to create and compile files


* In the use-crate specification, are attributes arbitrary and
  always exactly matched? My concerns are mostly with the `vers`
  attribute of the examples, as strict equality is
  sometimes (often) not what is wanted, with richer/more flexible
  specifications being useful (as well as more than one spec)
  e.g. vers >= 1.5, vers < 2
  - Linked to this, if some attributes have special status does
    Rust enforce a specific style for version numbers
    e.g. semver?
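
  To make the kind of spec I have in mind concrete, here is a minimal
  sketch of matching `vers >= 1.5, vers < 2`. It is written in current
  Rust syntax (an assumption, not the 0.1 syntax), and `parse_version`
  and `in_range` are hypothetical names for illustration, not anything
  rustc's crate resolution provides.

```rust
/// Parse a dotted version such as "1.5" into numeric components.
/// Returns None if any component is not a non-negative integer.
fn parse_version(s: &str) -> Option<Vec<u32>> {
    s.split('.').map(|part| part.parse::<u32>().ok()).collect()
}

/// Check `lower <= vers < upper`, comparing components left to right.
/// Returns None if any of the three strings fails to parse.
fn in_range(vers: &str, lower: &str, upper: &str) -> Option<bool> {
    let v = parse_version(vers)?;
    let lo = parse_version(lower)?;
    let hi = parse_version(upper)?;
    // Vec comparison is lexicographic, so [1, 10] sorts after [1, 5].
    Some(lo <= v && v < hi)
}
```

  Comparing numeric components rather than strings is what keeps 1.10
  above 1.5, which plain string equality or ordering would not.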


* Are Python-style chained comparison operators (e.g. `3 < a < 5`
  being true iff a is in ]3.0, 5.0[) planned? They don't seem to
  be possible right now: the example above fails to typecheck.
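
  For reference, the explicit form that the chained comparison would
  abbreviate can be sketched as follows. This uses current Rust syntax
  (an assumption, it may differ from 0.1), and `strictly_between` is a
  hypothetical helper name.

```rust
/// True iff `a` lies in the open interval ]low, high[.
/// This spells out what a chained `low < a < high` would mean.
fn strictly_between(low: f64, a: f64, high: f64) -> bool {
    low < a && a < high
}
```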


* Do the rustc et al. libraries provide services to third parties
  which would allow for easily building e.g. analysis or
  transformation tools (such as refactoring utilities)? Would
  they give AST-level access to third-party tools such as the
  Rust emacs mode or an IDE? I did not see any pointer to
  documentation of these libraries and their services.


Issues:
-------

* The first one is the apparent (community) usage of "blocks" for
  Rust's boxed closures[0]. My issue with this is that languages where
  blocks are first-class objects (Smalltalk, Ruby) default to
  non-local returns from these blocks. As far as I can tell, Rust does
  not have non-local returns, let alone use them.

  Using "block" for boxed closures does everybody a disservice as it
  makes the transition much harder and *will* disappoint people used
  to actual Smalltalk-type blocks. The tutorial does not have this
  issue, which is good, but the community should be careful. Talking
  about lambdas or sugared lambdas would probably be a good idea
  (unless Rust is modified to handle and default to non-local returns
  from stack closures).
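
  To show what a local (as opposed to non-local) return looks like,
  here is a small sketch in current Rust syntax (an assumption, not
  the 0.1 closure syntax). `first_even` is a hypothetical function;
  the `return` inside the closure leaves only the closure, so the
  iteration carries on over the remaining elements.

```rust
/// Find the first even number, and count how many elements the
/// closure was called on.
fn first_even(xs: &[i32]) -> (Option<i32>, usize) {
    let mut found = None;
    let mut visited = 0;
    xs.iter().for_each(|&x| {
        visited += 1;
        if found.is_some() || x % 2 != 0 {
            // Returns from the closure only, not from `first_even`:
            // `for_each` goes on to call the closure for the next element.
            return;
        }
        found = Some(x);
    });
    (found, visited)
}
```

  With a Smalltalk-style block, a return in that position would exit
  `first_even` itself and the remaining elements would never be
  visited; here every element is visited regardless.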


* The second issue is both trivial and extremely serious: after
  having written a few trivial pieces of code (can't call them
  programs), it feels like Rust's handling of semicolons combines
  the simplicity of Erlang's with the determinism of
  JavaScript's.

  I think the core issue is that Rust uses semicolons as expression
  separators where most if not all other braceful languages use them
  as expression terminators. I know the rules are simple to express,
  but Erlang demonstrated it would *never* feel right to people in the
  current state of the language. Erlang also has the issue of three
  different separators, but Rust has the additional issue that a
  semicolon becomes the difference between returning a value and
  returning unit. That's not right.

  I see three possible ways to fix this:

  - Don't change Rust's semantics, but change Rust's "blessed" style,
    by prefixing expressions with semicolons (instead of
    post-fixing). This is a common style for e.g. RelaxNG-Compact
    schemas and it looks "correct" for separators

  - Make semicolons into expression terminators, as in the majority of
    C-style languages

  - Add Haskell-like layout rules making semicolons redundant in 99.9%
    of cases, and necessary only when putting multiple expressions on
    a line or when generating Rust code mechanically. This would fix
    the issue by making semicolons generally unnecessary and thus not
    a source of error
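
  To make the semicolon problem concrete, here is a minimal pair,
  written in current Rust syntax (an assumption, it may differ in
  detail from 0.1). The two hypothetical functions differ only by one
  trailing semicolon, yet one yields a value and the other yields
  unit.

```rust
/// No trailing semicolon: the block evaluates to the last expression.
fn with_value() -> i32 {
    1 + 1
}

/// Trailing semicolon: the result is discarded and the block is `()`.
fn with_unit() {
    1 + 1;
}
```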


* Strings. I believe Rust's current state of conflating byte sequences
  and human-text strings to be as big a mistake as it was in Python.

  If Rust wants to be mainly bytes-oriented the `str` type should be
  renamed `bytes` and most string-manipulation functions should be
  removed.

  Otherwise, I believe it should be split in two clearly separate
  types, one manipulating sequences or arrays of bytes and the other
  one manipulating streams of unicode codepoints (for which encoding
  would be an implementation detail).

  I think the current setting will hurt both Rust and its users in the
  long term. One of the first problems is that a byte sequence
  advertising itself as UTF-8 and actual UTF-8 text have different
  properties: overlong UTF-8 (which can be found in byte sequences)
  is an actual security issue[1]. Bytes to string *should* require
  transcoding, even if that transcoding does nothing more than an O(n)
  shortest-form assertion because the string type's underlying
  representation is UTF-8 anyway.

  Furthermore:

  - A string type's primary interface should be about graphemes, not
    codepoints and especially not bytes. And on this I fully agree
    with Graydon Hoare[2] that O(1) random access is neither necessary
    nor helpful.

    While it is not its primary interface, NSString is the only string
    type I know of which comes close to being good there. In fact
    overall NSString is the string type closest to providing a good
    interface for text, as far as I'm concerned. This also makes
    e.g. normalization issues easier to handle.

    I am aware it makes string types more complex to have them be
    "Unicode-correct", as many Unicode operations rely on locale. But
    I also believe this to be necessary: I don't believe "works for
    English speakers" to be sufficient anymore.

  - A bytes type should have a literal form; I would think this a more
    useful use of single quotes than the current `char` literals.
    Erlang-style bytes pattern matching/destructuring would be a good
    use of bytes having their own literals when it comes to
    protocol parsing.
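
  As an illustration of the bytes/text split argued for above, here is
  a sketch in current Rust syntax (an assumption, this is not how the
  0.1 `str` type behaves). `bytes_to_text` and
  `byte_and_codepoint_counts` are hypothetical helpers; the only
  library calls used are `std::str::from_utf8`, `str::len` and
  `str::chars`.

```rust
/// Convert bytes to text only if they are well-formed UTF-8.
/// The check is O(n) and rejects overlong encodings.
fn bytes_to_text(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Return (number of bytes, number of codepoints) for a piece of text.
fn byte_and_codepoint_counts(text: &str) -> (usize, usize) {
    (text.len(), text.chars().count())
}
```

  The byte pair 0xC0 0xAF is an overlong encoding of "/" and is
  rejected, while the single byte "/" is accepted. The text "e"
  followed by U+0301 (combining acute accent) is one grapheme but two
  codepoints and three bytes, which is why neither bytes nor
  codepoints make a good primary interface.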


* I find it strange that Rust has kept the ternary operator even though
  its conditionals are expressions of their own.
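
  What I mean by conditionals being expressions, as a sketch in
  current Rust syntax (an assumption; `describe` is a hypothetical
  function): the `if`/`else` below is itself the value of the
  function body, which is exactly the job a ternary operator would do.

```rust
/// The whole `if`/`else` is an expression producing a value.
fn describe(n: i32) -> &'static str {
    if n < 0 { "negative" } else { "non-negative" }
}
```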


* Finally, I could not find any good information on the result of loop
  expressions. From my tests it seems to always be `()`; is that
  correct? If so, why `()` rather than the last-evaluated result of
  the iteration? Is it because of the case where no iteration at all
  is performed?
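
  For what it's worth, the zero-iteration case can be handled
  explicitly by the caller. The sketch below uses current Rust syntax
  (an assumption, it may differ from 0.1), with `last_doubled` as a
  hypothetical function: the loop expression itself is `()`, and the
  last-evaluated result is tracked by hand in an optional value.

```rust
/// Double each element and return the last doubled value, or None
/// when the slice is empty and the loop body never runs.
fn last_doubled(xs: &[i32]) -> Option<i32> {
    let mut last = None;
    // The `for` expression has type `()`; the binding makes that visible.
    let _unit: () = for &x in xs {
        last = Some(x * 2);
    };
    last
}
```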

Thank you for reading until this point, looking forward to any reply
and have a nice day.

[0] https://mail.mozilla.org/pipermail/rust-dev/2012-January/001168.html
[1] http://www.unicode.org/reports/tr36/#UTF-8_Exploit
[2] https://mail.mozilla.org/pipermail/rust-dev/2012-January/001196.html

