/freedb

Performance tests of C# vc C++, multi-threaded I/O.

Primary LanguageC#

FreeDB Scanner

This is an exploration into multi threaded techniques using C# and C++.

Why FreeDB?

FreeDB is an online database with CD information. You can download large text archives and this is the sole reason I chose FreeDB. For testing I have used an update archive (which is of reasonable size). You can find them here. The one I used was freedb-update-20080201-20080301.tar.bz2.

The Problem Statement

The problem statement is to find the year when the most number of CDs were published. In order to do so you need to scan a large number of files for lines that look like this:

DYEAR=1992

Do this as quickly as possible, sum up the number of CDs per year, and finally output the year that occurred most often.

Sample Run

Here's a sample run from my C# version:

freedb-cs.exe C:\Users\Daniel\Downloads\freedb-update
Enumerating files...
Found 32229 files
Scanning files...
...Done
Year 2007: 5071
Processing took 1.39 s

And here is the same run from the C++ version:

freedbpp.exe C:\Users\Daniel\Downloads\freedb-update
Enumerating files...
0.88 s

Found 32229 files
Scanning files...
Year 2007: 5071
2.49 s

Challenge

The challenge for you is to try and beat these times. This is an opportunity to learn more about I/O and multicore processing. Let me know if you improve the above results!

Good luck!