CouchDB Bulk Load Performance

Yesterday I wrote a very long (sorry!) post about my technique for bulk loading data into CouchDB. I didn't want to make that post any longer, so today I'm going to talk about how well the bulk loading performs. All of these numbers come from my machine: a Core2 Quad Q9000 @ 2 GHz with 8 GB RAM and a single 7200 RPM drive.

As you may recall, the main problem I had was getting too much stuff in memory and having everything blow out with an out of memory exception. The solution I posted yesterday keeps memory under control and seems to keep all four processors busy. Here's a screenshot of task manager while it is running.
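The full loader is in yesterday's post, but the core memory-control idea is simple: stream documents in fixed-size batches to CouchDB's _bulk_docs endpoint instead of accumulating everything up front. Here's a minimal Python sketch of that shape (my loader isn't Python, and the batch size is just an illustrative number):

```python
import itertools
import json

def batches(docs, size=1000):
    """Yield fixed-size lists from an iterable so only one
    batch of documents is ever held in memory at a time."""
    it = iter(docs)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch

def bulk_payloads(docs, size=1000):
    """Build one _bulk_docs request body per batch."""
    for batch in batches(docs, size):
        yield json.dumps({"docs": batch})

# Each payload would then be POSTed to
# http://localhost:5984/<db>/_bulk_docs (POST omitted here).
```

Because the source documents come from a generator, nothing forces the whole data set into memory, which is what was blowing up before.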

Armed with a stopwatch and the CouchDB web interface, I did some crude timings. I may go back at some point and wrap my loader with timing code so that I can generate minute-by-minute graphs, but this will do for now.
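If I do add timing code, the idea would be to record the cumulative document count at each checkpoint and derive a per-interval rate. A rough sketch (class and method names are made up, not from my loader):

```python
class RateLogger:
    """Record cumulative document counts at checkpoints and
    report the load rate achieved during each interval."""

    def __init__(self):
        # (elapsed seconds, documents loaded so far)
        self.samples = [(0.0, 0)]

    def checkpoint(self, elapsed_seconds, total_docs):
        self.samples.append((elapsed_seconds, total_docs))

    def rates(self):
        """Documents/second for each interval between checkpoints."""
        return [
            (n1 - n0) / (t1 - t0)
            for (t0, n0), (t1, n1) in zip(self.samples, self.samples[1:])
        ]
```

Feeding it the first two rows of the table below (19,000 docs at 1 minute, 49,000 at 2 minutes) would give interval rates of about 317 and 500 docs/second.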

Elapsed Time (min)    Database Size (GB)    Document Count
        1                    0.33                  19,000
        2                    0.7                   49,000
        3                    1.0                   89,000
        4                    1.4                  134,000
        5                    1.8                  181,000
       10                    3.6                  503,000
       15                    5.1                  843,000
       20                    6.7                1,201,000
       25                    7.9                1,473,000
       30                    9.1                1,759,000

I was really impressed with the numbers! After 30 minutes, I was averaging 977 documents/second and 5.2 megabytes/second. Keep in mind this is all running locally on my machine, but the numbers sure are encouraging.