[rust-dev] read file line by line

Grahame Bowland grahame at angrygoats.net
Sun Apr 8 21:11:01 PDT 2012


Hi

I've written a CSV reader implementation which handles things like
escaping, quotes, etc - so it's better than a naive character split.
https://github.com/grahame/rust-csv
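
To illustrate why quote handling matters, here is a minimal sketch in modern Rust (not the 2012 dialect used elsewhere in this thread, and not the code from rust-csv). The function name `split_record` is hypothetical, and the sketch assumes the common convention that a doubled quote inside a quoted field stands for a literal quote. It ignores quoted fields that span several lines.

```rust
/// Split one CSV record into fields. Hypothetical helper for illustration:
/// commas inside double-quoted fields are kept, and "" inside a quoted
/// field is read as one literal quote character.
fn split_record(line: &str) -> Vec<String> {
    let mut fields = Vec::new();
    let mut field = String::new();
    let mut in_quotes = false;
    let mut chars = line.chars().peekable();

    while let Some(c) = chars.next() {
        match c {
            '"' if in_quotes => {
                if chars.peek() == Some(&'"') {
                    // Doubled quote inside a quoted field: literal quote.
                    field.push('"');
                    chars.next();
                } else {
                    // Closing quote.
                    in_quotes = false;
                }
            }
            // Opening quote at the start of a field.
            '"' if field.is_empty() => in_quotes = true,
            // Field separator, only when outside quotes.
            ',' if !in_quotes => fields.push(std::mem::take(&mut field)),
            _ => field.push(c),
        }
    }
    fields.push(field);
    fields
}

fn main() {
    let line = r#"1,"Bowland, Grahame","say ""hi"" now""#;
    // A naive split cuts the quoted name in two.
    println!("naive: {} fields", line.split(',').count());
    println!("quote-aware: {:?}", split_record(line));
}
```

On the sample line the naive split yields four pieces, while the quote-aware version yields three fields.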

It's slower than Python's implementation (on the order of 3x). At first I
found most of the time was in str::from_chars, but now that I've spent some
time speeding that up, you're correct: most of the time is spent in
allocations.
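
As a rough sketch of one way to cut per-line allocations, the following modern Rust example reuses a single `String` buffer across calls to `BufRead::read_line` and works on borrowed slices for the fields. The function name `count_nonempty_fields` and the sample input are assumptions made for illustration; this is not how rust-csv or the benchmark code is written, and it uses a naive comma split.

```rust
use std::io::{self, BufRead};

/// Count the non-empty, trimmed, comma-separated fields in `reader`.
/// Hypothetical helper: one line buffer is reused for the whole input.
fn count_nonempty_fields<R: BufRead>(mut reader: R) -> io::Result<usize> {
    let mut line = String::new(); // allocated once, grows to the longest line
    let mut total = 0;
    loop {
        line.clear(); // drops the contents but keeps the capacity
        if reader.read_line(&mut line)? == 0 {
            break; // end of input
        }
        // Strip the trailing line terminator left in place by read_line.
        let record = line.trim_end_matches(|c: char| c == '\n' || c == '\r');
        // split and trim return borrowed slices, so no string is
        // allocated per field.
        total += record
            .split(',')
            .map(str::trim)
            .filter(|f| !f.is_empty())
            .count();
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // In-memory input; a BufReader wrapping a File would work the same way.
    let data = "a, b ,c\n\n1,2\n".as_bytes();
    println!("{}", count_nonempty_fields(data)?); // prints 5
    Ok(())
}
```

Whether this helps in practice depends on what is done with each field; anything that stores fields beyond the current line still has to copy them.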

The Python CSV module is written in C and has a maximum line length limit
of 128K. It's going to be fairly hard to beat, but it's also not doing
exactly the same thing. Also, you're not really racing Python against Rust;
you're racing C plus a tiny amount of Python against Rust.

Cheers
Grahame

(I haven't written a writer as I don't need one, but it'd be a welcome
addition if someone wants to add one.)

On 9 April 2012 07:52, Mic <mictadlo at gmail.com> wrote:

> Hello,
> Thank you, it is working. I created a writing and a reading benchmark. In
> both cases Python is about 3 times faster than Rust.
>
> Please find the results below and the code attached (create_csv.py/rs has
> to be run first, because it creates the CSV file that is used by
> csv.py/rs).
>
> *BENCHMARK 1*: Writing 4000000 lines to a file
>
> $ time python csv_create.py
>
> real    0m3.620s
> user    0m1.942s
> sys     0m0.339s
> $ ls -ahl csv.csv
> -rw-r--r-- 1 mictadlo mictadlo 226M Apr  9 09:05 csv.csv
>
>
> $ time ./csv_create
>
> real    0m11.299s
> user    0m3.222s
> sys     0m5.973s
> $ ls -ahl csv.csv
> -rw-r--r-- 1 mictadlo mictadlo 226M Apr  9 09:07 csv.csv
>
>
> *BENCHMARK 2*: Reading a CSV file and trimming each field
>
> $ time python csv.py
>
> real    0m22.136s
> user    0m21.728s
> sys     0m0.095s
>
> $ time ./csv
>
> real    1m6.796s
> user    1m6.364s
> sys     0m0.145s
>
> If you are happy with the benchmark, then I could commit it to git. What is
> the git command to commit it in 'src/test/bench'?
>
> Thank you in advance.
>
> On Sun, Apr 8, 2012 at 9:13 AM, Brian Anderson <banderson at mozilla.com> wrote:
>
>> On 04/06/2012 06:48 PM, Mic wrote:
>>
>> Hi
>> I am having trouble compiling the following code:
>>
>> import io::reader_util;
>> import vec::map;
>>
>> fn main(args: [str]) {
>>
>>     let r = io::file_reader(args[1]); // r is result<reader, err_str>
>>     if result::failure(r) {
>>         fail result::get_err(r);
>>     }
>>
>>     let rdr = result::get(r);
>>
>>     while !rdr.eof() {
>>         let line = rdr.read_line();
>>         io::println(line);
>>         if str::len(line) != 0u {
>>             let parts = vec::map(line.split_char(',')) {|s|
>>                 str::trim(s)
>>             };
>>         }
>>
>>     }
>> }
>>
>> and got the errors:
>> $ rustc csv.rs
>> csv.rs:17:33: 17:48 error: attempted access of field split_char on type
>> str, but no public field or method with that name was found
>> csv.rs:17             let parts = vec::map(line.split_char(',')) {|s|
>>                                            ^~~~~~~~~~~~~~~
>> csv.rs:17:33: 17:53 error: the type of this value must be known in this
>> context
>> csv.rs:17             let parts = vec::map(line.split_char(',')) {|s|
>>                                            ^~~~~~~~~~~~~~~~~~~
>>
>> What did I do wrong?
>>
>>
>> Hi Mic.
>>
>> The available extension methods (as in `line.split_char(',')`) have been
>> changing a lot recently, so my guess is that your compiler is just slightly
>> out of date and doesn't have the `split_char` extension on `str`. Try
>> updating to Rust HEAD where you will also notice that `result::failure` is
>> now called `result::is_failure`.
>>
>> -Brian
>>
>>
>