CouchDB Bulk Load Performance
Yesterday I wrote a very long (sorry!) post about my technique for bulk loading data into CouchDB. I didn't want to make that post any longer, so today I'm going to talk about how well the bulk loading performs. All of these numbers come from my machine, which is a Core 2 Quad Q9000 @ 2 GHz with 8 GB of RAM and a single 7200 RPM drive.
As you may recall, the main problem I had was getting too much stuff in memory and having everything blow up with an out-of-memory exception. The solution I posted yesterday keeps memory under control and seems to keep all four cores busy. Here's a screenshot of Task Manager while it is running.
Armed with a stopwatch and the CouchDB web interface, I did some crude timings. I may go back at some point and wrap my loader with some timing code so that I can generate some minute-by-minute graphs, but this will do for now.
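For reference, here is a minimal sketch of what such timing code could look like in Python. It is not the loader from yesterday's post. It only polls CouchDB's database info document (a `GET` on the database URL) and records the document count and file size at a fixed interval. The function names and the database URL are made up for illustration. The size field is read from `sizes.file`, falling back to the older `disk_size` field, since which one is present depends on the CouchDB version.

```python
import json
import time
import urllib.request


def fetch_db_info(db_url):
    """GET the database info document, e.g. from http://localhost:5984/mydb (hypothetical name)."""
    with urllib.request.urlopen(db_url) as response:
        return json.load(response)


def parse_db_info(info):
    """Return (doc_count, file_size_bytes) from a database info document."""
    # Newer CouchDB releases report sizes under "sizes"; older ones used "disk_size".
    size = info.get("sizes", {}).get("file", info.get("disk_size", 0))
    return info["doc_count"], size


def poll(fetch, interval_seconds, sample_count, sleep=time.sleep, clock=time.monotonic):
    """Collect one (elapsed_seconds, doc_count, size_bytes) row per interval."""
    start = clock()
    rows = []
    for _ in range(sample_count):
        sleep(interval_seconds)
        doc_count, size = parse_db_info(fetch())
        rows.append((round(clock() - start), doc_count, size))
    return rows
```

Calling `poll(lambda: fetch_db_info("http://localhost:5984/mydb"), 60, 30)` while the loader runs would collect one row per minute for half an hour, assuming the database lives at that URL. Passing the fetch function in as a parameter keeps the polling loop easy to try out without a running server.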
| Elapsed Time (minutes) | Database Size (gigabytes) | Document Count |
|---|---|---|
| 1 | 0.33 | 19,000 |
| 2 | 0.7 | 49,000 |
| 3 | 1.0 | 89,000 |
| 4 | 1.4 | 134,000 |
| 5 | 1.8 | 181,000 |
| 10 | 3.6 | 503,000 |
| 15 | 5.1 | 843,000 |
| 20 | 6.7 | 1,201,000 |
| 25 | 7.9 | 1,473,000 |
| 30 | 9.1 | 1,759,000 |
I was really impressed with the numbers! After 30 minutes, I was averaging 977 documents/second and 5.2 megabytes/second. Keep in mind this is all running locally on my machine, but the numbers sure are encouraging.
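The averages follow directly from the table. Here is a small Python sketch of the arithmetic, with the rows copied from the table above. It assumes 1 gigabyte = 1024 megabytes, which is the convention that reproduces the 5.2 figure. The function names are just for illustration.

```python
# (elapsed minutes, database size in GB, document count), copied from the table
SAMPLES = [
    (1, 0.33, 19_000), (2, 0.7, 49_000), (3, 1.0, 89_000),
    (4, 1.4, 134_000), (5, 1.8, 181_000), (10, 3.6, 503_000),
    (15, 5.1, 843_000), (20, 6.7, 1_201_000), (25, 7.9, 1_473_000),
    (30, 9.1, 1_759_000),
]


def average_rates(minutes, size_gb, doc_count):
    """Average documents/second and megabytes/second since the start of the load."""
    seconds = minutes * 60
    # Assumption: 1 GB = 1024 MB.
    return doc_count / seconds, size_gb * 1024 / seconds


def interval_rates(samples):
    """Documents/second between consecutive rows, as (end_minute, rate) pairs."""
    rates = []
    for (m0, _, d0), (m1, _, d1) in zip(samples, samples[1:]):
        rates.append((m1, (d1 - d0) / ((m1 - m0) * 60)))
    return rates


docs_per_sec, mb_per_sec = average_rates(*SAMPLES[-1])
print(f"{docs_per_sec:.0f} documents/second, {mb_per_sec:.1f} MB/second")
```

For the 30-minute row this works out to about 977 documents/second and 5.2 MB/second, matching the figures above. `interval_rates(SAMPLES)` gives the rate between each pair of rows, which is handy for seeing how throughput changes over the run rather than just the overall average.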