[rust-dev] CSV implementation

Stefan Plantikow stefan.plantikow at googlemail.com
Tue Jan 10 11:08:33 PST 2012


On Tuesday, 10 January 2012 at 17:55, Grahame Bowland wrote:
> I've written a CSV reader in Rust. I thought it might be interesting to some of you to see what a day-3 rust programmer comes up with. The code is here:
> https://github.com/grahame/rust-csv/

Kudos and thanks for writing code in Rust.
> It's quite slow - it takes my machine 0.37 seconds to parse a 200kb CSV file. Oddly if I take the print statements on lines 119:122 out, it takes longer - 0.48 seconds. I'm timing with stdout redirected to /dev/null. 
You probably want to avoid allocating row arrays in the inner loop. Instead, the user could be required to pass in a sufficiently sized array from the outside. Even if you do allocate them in the inner loop, try to allocate them at the right size from the start rather than growing them with "+=", as each append may cause a complete copy of the vector.
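
To make the allocation point concrete, here is a rough sketch of both variants. It is written in present-day Rust syntax rather than the dialect current at the time of this mail, it is not taken from rust-csv, and the names split_row_into and split_row_presized are made up for illustration. It also ignores quoting and escaping entirely.

```rust
/// Splits one line on commas into a buffer supplied by the caller.
/// Hypothetical helper; simplified, with no quoting or escaping.
fn split_row_into<'a>(line: &'a str, fields: &mut Vec<&'a str>) {
    // clear() drops the old contents but keeps the allocation,
    // so the same buffer can be reused for every row.
    fields.clear();
    fields.extend(line.split(','));
}

/// Allocates a fresh row, but reserves room for `expected_fields`
/// entries up front instead of growing one append at a time.
/// Hypothetical helper; simplified, with no quoting or escaping.
fn split_row_presized(line: &str, expected_fields: usize) -> Vec<&str> {
    let mut fields = Vec::with_capacity(expected_fields);
    fields.extend(line.split(','));
    fields
}
```

With the first variant the caller creates one buffer before the loop and passes it in for each line; with the second, a row is still allocated per line, but it should not need to be regrown as long as the field count was guessed correctly.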

The deeper issue is that your interface fixes the representation unnecessarily. A possible approach (perhaps using type classes) is to represent a CSV parser as a row iterator of field iterators, where each field allows either getting its contents as a string or iterating over its chars. That way a consumer of the CSV parser retains full control over how input is processed, and unnecessary allocation can be avoided.
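
A minimal sketch of what such an interface could look like, again in present-day Rust syntax and under heavy simplifying assumptions: the whole input is already in memory as a string, rows are separated by newlines, fields by commas, and quoting is not handled. The types Field, Row and Rows and the function rows are hypothetical names, not an existing API.

```rust
use std::str::{Chars, Lines, Split};

/// One field. It only borrows from the input; nothing is copied
/// unless the consumer decides to copy.
pub struct Field<'a>(&'a str);

impl<'a> Field<'a> {
    /// The field contents as a borrowed string slice.
    pub fn as_str(&self) -> &'a str {
        self.0
    }

    /// Iterate over the field's characters instead.
    pub fn chars(&self) -> Chars<'a> {
        self.0.chars()
    }
}

/// Iterator over the fields of one row.
pub struct Row<'a>(Split<'a, char>);

impl<'a> Iterator for Row<'a> {
    type Item = Field<'a>;

    fn next(&mut self) -> Option<Field<'a>> {
        self.0.next().map(Field)
    }
}

/// Iterator over the rows of the input.
pub struct Rows<'a>(Lines<'a>);

impl<'a> Iterator for Rows<'a> {
    type Item = Row<'a>;

    fn next(&mut self) -> Option<Row<'a>> {
        self.0.next().map(|line| Row(line.split(',')))
    }
}

/// Entry point: treat `input` as newline-separated, comma-delimited text.
pub fn rows(input: &str) -> Rows<'_> {
    Rows(input.lines())
}
```

A consumer that only wants to sum numbers can loop over rows(input), loop over each row, and parse field.as_str() directly without building any row vector, while a consumer that does want owned rows can collect them itself.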

Somewhat related: there is a ticket for overhauling the JSON parser, and there is some odd overlap between lexing and from_str in libcore. Both might benefit from a unified, fast, basic way to parse simple literals.


Stefan Plantikow 
