Suggestions to triple quoted strings proposal

Stepan Koltsov stepan.koltsov at gmail.com
Thu Dec 14 22:15:53 PST 2006


Hi, again.

I've looked through the sources of Python itself (checked out from
http://svn.python.org/projects/python/trunk). Presumably nobody writes
Python "better" than the Python developers.

I've written a script that counts the uses of multiline strings in the Python sources.

(The script is actually a Java program. I code in Java 10 hours a day, so I
do it really fast :)
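
For illustration only, here is a minimal Python sketch of this kind of count. It is not the Java program mentioned above: the function name and the category names are hypothetical, and it does not try to separate docstrings from data. It assumes "\n" line endings and relies on the standard tokenize module.

```python
import io
import tokenize


def classify_multiline_strings(source):
    """Count triple-quoted literals that span lines, by what follows the opening quotes."""
    counts = {"backslash_newline": 0, "newline": 0, "other": 0}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue
        if tok.start[0] == tok.end[0]:
            continue  # literal fits on one line
        text = tok.string
        # Skip an optional prefix such as r, b or u to reach the quotes.
        i = 0
        while text[i] not in "'\"":
            i += 1
        body = text[i:]
        if not (body.startswith('"""') or body.startswith("'''")):
            continue  # not a triple-quoted literal
        rest = body[3:]
        if rest.startswith("\\\n"):
            counts["backslash_newline"] += 1
        elif rest.startswith("\n"):
            counts["newline"] += 1
        else:
            counts["other"] += 1
    return counts
```

Calling classify_multiline_strings(open(path).read()) on each source file and summing the dictionaries would give totals of the same kind as the numbers below, though not necessarily the same values.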

Of course, most triple-quoted strings (TQS) in Python are used as docstrings
(and doctests). There are 8214 multiline strings in the Python sources.

First, I thought about the newline right after the opening triple quotes.

There are 907 uses of multiline strings that are not docstrings in the
Python sources. Only about 1/9 of the multiline strings store data.

In 368 of these 907 cases, the opening triple quotes are followed by a
backslash and a newline.

=== real example from Doc/lib/minidom-example.py
document = """\
<slideshow>
...
"""
===

That is more than 1/3.

In 342 of the 907 cases, the opening triple quotes are followed directly by a newline.
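
The difference between the two forms can be seen in a small sketch. The string contents here are placeholders, not taken from the Python sources.

```python
# Opening quotes followed by backslash-newline: the line break is swallowed,
# so the value starts with the first line of data.
with_backslash = """\
first line
second line
"""

# Opening quotes followed directly by a newline: the value starts with "\n".
with_newline = """
first line
second line
"""

print(repr(with_backslash))  # 'first line\nsecond line\n'
print(repr(with_newline))    # '\nfirst line\nsecond line\n'
```

In the second form the program has to remove the leading newline itself if it is not wanted.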


I have no numbers showing that leading spaces should be stripped by the
lexer. I don't know what to measure. I can show an extract from the
sources:

http://mx1.ru/~yozh/js2/nds-indent.txt

This file contains real-world examples of data stored inside multiline
strings where the statements are declared inside some block. I repeat:
the code looks dirty.
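
A minimal sketch of the problem, using a hypothetical class and method (not one of the fragments in the file above): the literal is indented to match the surrounding code, so the indentation and the leading newline become part of the value and have to be removed by hand, here with textwrap.dedent from the standard library.

```python
import textwrap


class Thingy:
    def usage_raw(self):
        # The literal is indented to line up with the method body,
        # so every line of the value carries 12 leading spaces.
        return """
            Usage: thingy [OPTIONS]
              -h  Display this usage message
            """

    def usage_clean(self):
        # Manual cleanup: drop the common indentation, then the
        # leading and trailing newlines.
        return textwrap.dedent(self.usage_raw()).strip("\n")


t = Thingy()
print(repr(t.usage_raw()))
print(repr(t.usage_clean()))
```

The alternative is to write the data flush against the left margin inside the method, which avoids the cleanup but breaks the visual indentation of the block.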

I also have the file

http://mx1.ru/~yozh/js2/nds.txt

It contains all fragments with TQS that are not docstrings.

The script sources can be found at

http://mx1.ru/~yozh/js2/dig-tqs.zip

(you can look inside if you think that my script produced wrong files)

On 12/14/06, Brendan Eich <brendan at mozilla.org> wrote:
> On Dec 13, 2006, at 10:59 AM, Stepan Koltsov wrote:
>
> > Brendan, or anybody else who wants multiline strings should to behave
> > like in Python,
> >
> > Could you please write complex-enough example of code with TQS? In
> > that example string constant should be declared inside method inside
> > class. There is no good example at
> > http://developer.mozilla.org/es4/proposals/triple_quotes.html .
>
> You're right there's no good example, but the Python docs have
> examples,

BTW, the Python docs have no good examples of multiline strings.

The language reference has no example. The Python tutorial has something...
ehh... not nice:

print """
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
"""

This prints the text surrounded by empty lines (the first because of the
leading newline, the last because the print statement adds its own newline).
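
The effect can be checked with a short sketch. It uses the Python 3 print() function rather than the Python 2 print statement of the tutorial example, and captures the output only so that the extra newlines become visible.

```python
import io
from contextlib import redirect_stdout

text = """
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
"""

buf = io.StringIO()
with redirect_stdout(buf):
    print(text)  # print appends its own "\n" after the string

output = buf.getvalue()
print(repr(output[:1]))   # '\n'   : the newline after the opening quotes
print(repr(output[-2:]))  # '\n\n' : the literal's last newline plus print's
```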

> and real code has even more compelling examples. Two
> arguments here:
>
> 1.  "Be like Python, reuse brainprint from JS hackers who know Python
> and Python hackers learning JS".  This is non-trivial.  It's not just
> "marketing".  It makes the world better to avoid defining """
> differently in ES4/JS2 from Python.
>
> 2.  "Be like Python, stand on its shoulders and reuse the experience
> that informed its design decisions and defaults".  This is certainly
> a gamble, since JS is not Python, and Python ain't perfect (JS is far
> from perfect).  But with some care (e.g., eliminating GeneratorExit
> in the JS Pythonic generators available now in Firefox 2, and going
> into ES4), it can pay off.  There's probably value here, unless
> Python has failed to heed negative feedback on non-stripping """.

BTW, there were no design decisions when Guido developed the first version
of Python 15 years ago as a "hobby" programming project (quote from
Wikipedia).

A long time ago I asked the Python developers about their interpretation of
multiline strings. They answered that the behaviour is proper, and that
even if it were not, it is too late to change it.

> 3.  Quote means verbatim contents modulo escapes and special case for
> embedded newlines, i.e. literal.  Trimming or stripping does not fit
> under the notion of "literal".  Bob and I have made this point, it's
> about intuition more than optimizing for the common case.
>
> > I used to write in Python, I hated its """ behaviour. I asked people
> > who use Python, and they generally agreed with me.
>
> Were they writing doc strings or data? We have
> http://developer.mozilla.org/es4/proposals/documentation.html for
> documentation, that is, Java doc-comments with simpler embedded
> "markup" syntax.

I asked about data. I think the documentation format is not very important.

Personally, I prefer javadoc/doxygen style over docstrings.

> > I'm afraid, that if you keep TQS "simple", they won't be very usable:
> > in 99% cases users will be forced to manually strip spaces and leading
> > newline. In 1% cases string constant will be defined outside block, or
> > amount of spaces will not matter.
> >
> > I have no other arguments :)
>
> This is the crux of the matter.  My counter-argument is number 2,
> above.  If your Python experience were more common, something would
> have been done.  But I could be wrong.
>
> Can you say more about what these """ strings contained in Python
> (doc vs. data, etc.).  More context, real examples?

Any questions?

--
Stepan


