[Cs254f11] slurp

Lee Spector lspector at hampshire.edu
Tue Oct 25 09:01:45 EDT 2011


[cc-ing the class since this may be of broader interest]

The slowness here seems to come from the clooj output pane, which just doesn't seem to be good at spewing large volumes of data (maybe because the code it uses to format output wasn't written with large volumes in mind).

When I do:

(def testsong (slurp "/Users/leespector/Code/clojure/play/src/LovelyRita.csv"))

(first testsong)

(last testsong)

each of these returns instantly. The "last" call was to ensure that the whole thing had been read (even though that should be the case according to the spec because slurp returns a string, which isn't lazy -- but I wanted to make sure).

The delay comes when you try to print the value of testsong to the output pane, and since I don't think you really want to ever do this anyway I don't think this should be a problem. Just don't do it :-).
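
If you do want to look at what was read without flooding the output pane, something like the following might help. This is only a sketch: the testsong string here is a made-up stand-in so the example runs on its own, and preview is a hypothetical helper name, not anything from clooj or Clojure itself.

```clojure
;; Made-up stand-in for the slurped file contents (not the real LovelyRita.csv).
(def testsong (apply str (repeat 50000 "0, 0, Note_on_c, 1, 60, 90\n")))

(defn preview
  "Returns at most the first n characters of the string s."
  [s n]
  (subs s 0 (min n (count s))))

;; Cheap checks that the whole string is in memory, without printing all of it.
(count testsong)        ; total number of characters read
(preview testsong 80)   ; just the beginning of the data
(last testsong)         ; the final character
```

Each of these forms returns a small value, so the output pane only ever has a few dozen characters to format.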

BTW if I do all of this from a leiningen repl then the spewing does take some time -- there's a fair amount of data there -- but it starts spewing instantly and it doesn't get all gummed up like clooj does.
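
The same concern applies to the parsed song in the quoted message below, since typing its name at the REPL prints every row. One possible way around that, sketched here under some assumptions, is to bind *print-length* while printing. The sample text and the name parse-rows are made up for illustration, and I'm using clojure.string (part of Clojure itself since 1.2) rather than clojure.contrib.string; note that clojure.string/split takes the string first and the regex second.

```clojure
(require '[clojure.string :as string])

;; Made-up sample standing in for the contents of a MIDI-derived .csv file.
(def sample "0, 0, Header, 1, 2, 480\n1, 0, Start_track\n1, 0, End_track\n")

(defn parse-rows
  "Splits CSV text into a vector of rows, each row a vector of field strings."
  [text]
  (vec (map #(string/split % #",") (string/split-lines text))))

(def song (parse-rows sample))

;; Print at most two items of each collection; anything beyond that is
;; shown as "..." instead of being written out in full.
(binding [*print-length* 2]
  (println (pr-str song)))
```

The limit applies to nested collections too, so both the number of rows and the number of fields per row are capped. It does not shorten a plain string, so this helps with song but not with testsong.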

 -Lee



On Oct 24, 2011, at 2:45 PM, Wm. Josiah Erikson wrote:

> So I've attached the file I'm slurping. Tell me if you get different results. If I do something very very simple like:
> 
> (def testsong (slurp "/Users/josiah/Documents/cs254/LovelyRita.csv"))
> 
> and then type "testsong" into the REPL, it takes forever, sometimes as much as 2 minutes, before returning.
> 
> However, if I use the mmapped version:
> 
> (require '[clojure.contrib.mmap :as mmap])
> (def testsong (mmap/slurp "/Users/josiah/Documents/cs254/LovelyRita.csv"))
> 
> and then type "testsong" into the REPL, it takes oh, 3-5 seconds.
> 
> Said file is under 1MB, and the real thing I want to do with it is:
> 
> (require '[clojure.contrib.string :as string])
> 
> (defn parse_song
> "Takes a filename, assumed to be a MIDI file converted to .csv format, and returns a vector of lists.
>  Each list has a number of elements, split by the comma in the .csv"
> [file]
> (vec (map #(string/split #"," %) (string/split-lines (mmap/slurp file)))))
> 
> (def song (parse_song "/Users/josiah/Documents/cs254/LovelyRita.csv"))
> 
> Then if you type "song" into the REPL, you get a long delay as well, before it spits out the answer. Maybe it's just the REPL? I'm going to keep coding anyway.
> 
> Before I found the mmapped version of slurp and was using a definition of parse_song with the normal one, parse_song would sometimes never return at all, or at least not before I ran out of patience and killed clooj :)
> 
> 
> -- 
> -----
> Wm. Josiah Erikson
> Network Engineer
> Hampshire College
> Amherst, MA 01002
> 

--
Lee Spector, Professor of Computer Science
Cognitive Science, Hampshire College
893 West Street, Amherst, MA 01002-3359
lspector at hampshire.edu, http://hampshire.edu/lspector/
Phone: 413-559-5352, Fax: 413-559-5438


