Yesterday I wrote a very long (sorry!) post about my technique for bulk loading data into CouchDB. I didn't want to make that post any longer, so today I'm going to talk about how well the bulk loading performs. All of these numbers come from my machine, which is a Core2 Quad Q9000 @ 2 GHz with 8GB RAM and a single 7200 RPM drive.
As you may recall, the main problem I had was getting too much stuff in memory and having everything blow out with an out of memory exception. The solution I posted yesterday keeps memory under control and seems to keep all four processors busy. Here's a screenshot of task manager while it is running.
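The post didn't show the loader itself, but the memory-bounded approach can be sketched roughly like this: stream documents in fixed-size batches to CouchDB's `_bulk_docs` endpoint so only one batch is ever held in memory. The URL, batch size, and function names here are illustrative assumptions, not the actual loader from yesterday's post.

```python
# Sketch of memory-bounded bulk loading via CouchDB's _bulk_docs endpoint.
# DB_URL and BATCH_SIZE are hypothetical values, not from the original post.
import itertools
import json
from typing import Callable, Iterable, Iterator, List
from urllib import request

DB_URL = "http://localhost:5984/mydb"  # hypothetical database URL
BATCH_SIZE = 1000                      # tune to keep memory usage bounded


def chunks(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield fixed-size batches so only one batch lives in memory at a time."""
    it = iter(docs)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch


def post_batch(batch: List[dict]) -> None:
    """POST one batch to CouchDB's _bulk_docs endpoint."""
    body = json.dumps({"docs": batch}).encode("utf-8")
    req = request.Request(DB_URL + "/_bulk_docs", data=body,
                          headers={"Content-Type": "application/json"})
    request.urlopen(req).close()


def bulk_load(docs: Iterable[dict],
              post: Callable[[List[dict]], None] = post_batch) -> int:
    """Stream docs to CouchDB in batches; return the total count loaded."""
    loaded = 0
    for batch in chunks(docs, BATCH_SIZE):
        post(batch)
        loaded += len(batch)
    return loaded
```

To keep all four cores busy as described, the actual loader would run several of these posting loops in parallel; the sketch above shows only the batching that keeps memory under control.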
Armed with a stopwatch and the CouchDB web interface, I did some crude timings. I may go back at some point and wrap my loader with some timing code so that I can generate some minute-by-minute graphs, but this will do for now.
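The minute-by-minute instrumentation mentioned above could look something like the sketch below: a small timer that records how many documents were loaded in each elapsed minute. The class and method names are hypothetical, not from the post.

```python
# Hedged sketch of per-minute throughput sampling for a bulk loader.
import time


class MinuteTimer:
    """Record (elapsed_minutes, docs_loaded_this_interval) samples."""

    def __init__(self, interval: float = 60.0, clock=time.monotonic):
        self.clock = clock
        self.interval = interval
        self.start = clock()
        self.last_mark = self.start
        self.last_count = 0
        self.samples = []  # list of (elapsed_minutes, docs_in_interval)

    def record(self, total_docs: int) -> None:
        """Call periodically with the running total; samples once per interval."""
        now = self.clock()
        if now - self.last_mark >= self.interval:
            self.samples.append(((now - self.start) / 60.0,
                                 total_docs - self.last_count))
            self.last_mark = now
            self.last_count = total_docs
```

Calling `record(total)` after each batch would be enough to build the minute-by-minute graphs without adding measurable overhead to the load.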
| Elapsed Time (minutes) | Database Size (gigabytes) | Document Count |
| --- | --- | --- |
I was really impressed with the numbers! After 30 minutes, I was averaging 977 documents/second and 5.2 megabytes/second. Keep in mind this is all running locally on my machine, but the numbers sure are encouraging.
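As a quick sanity check on those averages: the totals below are back-calculated from the quoted rates (977 docs/s and 5.2 MB/s over 30 minutes), not figures copied from the table, so treat them as illustrative.

```python
# Back-of-envelope check of the quoted 30-minute averages.
# doc_count and db_mb are assumed totals consistent with the quoted rates.
elapsed_s = 30 * 60            # 30 minutes of loading
doc_count = 1_758_600          # assumed total documents after 30 minutes
db_mb = 9_360                  # assumed database size in megabytes

docs_per_sec = doc_count / elapsed_s
mb_per_sec = db_mb / elapsed_s
print(f"{docs_per_sec:.0f} docs/s, {mb_per_sec:.1f} MB/s")
```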