Copyright(c) 2014 Milo Yip (miloyip@gmail.com)
This benchmark evaluates the performance of converting IEEE-754 double-precision floating-point values (`double`) to ASCII strings. The function prototype is:

`void dtoa(double value, char* buffer);`

The resulting string must convert back to exactly the original value via any correct implementation of `strtod()`, i.e. it must be roundtrip convertible.

Note that `dtoa()` is not a standard function in C or C++.
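
The roundtrip requirement can be checked mechanically. The following is a minimal sketch, not code from the benchmark itself: `dtoa_sprintf` and `roundtrips` are hypothetical names, and the 32-byte buffer is an assumed upper bound that is sufficient for `"%.17g"` output.

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Illustrative conversion with the benchmark's prototype.
// "%.17g" emits enough significant digits for any double to roundtrip.
void dtoa_sprintf(double value, char* buffer) {
    std::sprintf(buffer, "%.17g", value);
}

// Returns true if the text produced by `dtoa` parses back to exactly `value`.
// Hypothetical helper for illustration only.
bool roundtrips(void (*dtoa)(double, char*), double value) {
    char buffer[32];  // assumed large enough for the output and its terminator
    dtoa(value, buffer);
    char* end = nullptr;
    double parsed = std::strtod(buffer, &end);
    // The whole string must be consumed, and the bit patterns must match
    // (a bitwise comparison also distinguishes -0.0 from 0.0).
    return *end == '\0' && std::memcmp(&parsed, &value, sizeof value) == 0;
}
```

Calling `roundtrips(dtoa_sprintf, 0.1)` should return `true`, while a conversion that keeps only six significant digits would fail for a value such as `1.0 / 3.0`.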
First, the program verifies the correctness of each implementation.
Then one benchmark case is carried out:
- `RandomDigit`: Generates 1000 random `double` values, filtering out `+/-inf` and `nan`. These are then converted to limited precision (1 to 17 decimal digits in the significand). Finally, the resulting numbers are converted to ASCII.

Each digit group is run 100 times. The minimum duration over 10 trials is measured.
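
The procedure above can be sketched roughly as follows. This is an illustration under stated assumptions, not the benchmark's actual code: the helper names are hypothetical, random values are drawn as 64-bit patterns from `std::mt19937_64`, precision is limited by printing with `"%.*e"` and parsing back, and the time is normalised per conversion.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <limits>
#include <random>
#include <vector>

// Build `count` finite doubles from random 64-bit patterns, skipping inf/nan.
std::vector<double> random_doubles(std::size_t count, std::uint64_t seed) {
    std::mt19937_64 rng(seed);
    std::vector<double> values;
    while (values.size() < count) {
        std::uint64_t bits = rng();
        double d;
        std::memcpy(&d, &bits, sizeof d);
        if (std::isfinite(d)) values.push_back(d);
    }
    return values;
}

// Round `value` to `digits` significant decimal digits (1 to 17).
// Note: values close to the largest double may round up out of range.
double limit_precision(double value, int digits) {
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "%.*e", digits - 1, value);
    return std::strtod(buffer, nullptr);
}

// Minimum over `trials` of the time needed to convert every value
// `iterations` times, in nanoseconds per conversion (assumed normalisation).
double min_time_ns(void (*dtoa)(double, char*), const std::vector<double>& values,
                   int trials = 10, int iterations = 100) {
    using clock = std::chrono::steady_clock;
    double best = std::numeric_limits<double>::max();
    char buffer[32];
    for (int t = 0; t < trials; ++t) {
        auto start = clock::now();
        for (int i = 0; i < iterations; ++i)
            for (double v : values) dtoa(v, buffer);
        std::chrono::duration<double, std::nano> elapsed = clock::now() - start;
        best = std::min(best, elapsed.count());
    }
    return best / (static_cast<double>(iterations) * values.size());
}
```

Taking the minimum rather than the mean is a common way to reduce the influence of scheduling noise on short measurements.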
- Obtain premake4.
- Copy the premake4 executable to the `dtoa-benchmark/build` folder (or to a folder on the system path).
- Run `premake.bat` or `premake.sh` in `dtoa-benchmark/build`.
- On Windows, build the solution at `dtoa-benchmark/build/vs2008/` or `/vs2010/`.
- On other platforms, run GNU `make config=release32` (or `release64`) at `dtoa-benchmark/build/gmake/`.
- On success, run the `dtoa` executable generated at `dtoa-benchmark/`.
- The results in CSV format will be written to `dtoa-benchmark/result`.
- Run GNU `make` in `dtoa-benchmark/result` to generate results in HTML.
The following are results measured on a PC (Core i7 920 @ 2.67 GHz), where `dtoa()` is compiled by Visual C++ 2013 and run on 64-bit Windows. The speedup is relative to `sprintf()`.
Function | Time (ns) | Speedup |
---|---|---|
ostringstream | 2,778.748 | 0.45x |
ostrstream | 2,628.365 | 0.48x |
gay | 1,646.310 | 0.76x |
sprintf | 1,256.376 | 1.00x |
fpconv | 273.822 | 4.59x |
grisu2 | 220.251 | 5.70x |
doubleconv | 201.645 | 6.23x |
milo | 138.021 | 9.10x |
null | 2.146 | 585.58x |
Note that the `null` implementation does nothing. It measures the overhead of looping and function calls.
Results for various configurations are located at `dtoa-benchmark/result`. They can be accessed online, with interactivity provided by Google Charts:
- corei7920@2.67_win32_vc2013
- corei7920@2.67_win64_vc2013
- corei7920@2.67_cygwin32_gcc4.8
- corei7920@2.67_cygwin64_gcc4.8
Function | Description |
---|---|
ostringstream | `std::ostringstream` in C++ standard library with `setprecision(17)`. |
ostrstream | `std::ostrstream` in C++ standard library with `setprecision(17)`. |
sprintf | `sprintf()` in C standard library with `"%.17g"` format. |
gay | David M. Gay's dtoa() C implementation. |
grisu2 | Florian Loitsch's Grisu2 C implementation [1]. |
doubleconv | C++ implementation extracted from Google's V8 JavaScript Engine with `EcmaScriptConverter().ToShortest()` (based on Grisu3, falling back to a slower bignum algorithm when Grisu3 fails to produce the shortest representation). |
fpconv | night-shift's Grisu2 C implementation. |
milo | miloyip's Grisu2 C++ header-only implementation. |
null | Do nothing. |
Notes:
- `tostring()` is not tested as it does not fulfill the roundtrip requirement.
- Grisu2 is chosen because it generates more human-readable numbers and >99.9% of its results are the shortest representation. Grisu3 needs another `dtoa()` implementation as a fallback for the cases where it cannot produce the shortest result.
- How to add an implementation?

  You may clone an existing implementation file and modify it. Re-run `premake` to add it to the project or makefile. Note that it will be registered with the benchmark automatically by the macro `REGISTER_TEST(name)`.

  Pull requests for new implementations are welcome.

- Why not convert `double` to `std::string`?

  It may introduce heap allocation, which is a big overhead. Users can easily wrap these low-level functions to return `std::string` if needed.

- Why are fast `dtoa()` functions needed?

  They are very common operations when writing data in text format. The standard approaches, `sprintf()` and `std::stringstream`, often perform poorly. The author of this benchmark wanted to optimize the `sprintf` implementation in RapidJSON, and thus created this project.
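
As an illustration of wrapping a low-level function to return `std::string`, here is a minimal sketch. The `dtoa` below is only a stand-in that forwards to `sprintf()`; any implementation with the same prototype could be substituted. The name `dtoa_string` and the 32-byte buffer size are assumptions made for this example.

```cpp
#include <cstdio>
#include <string>

// Stand-in for any implementation with the benchmark's prototype.
void dtoa(double value, char* buffer) {
    std::sprintf(buffer, "%.17g", value);
}

// Convert into a stack buffer first, then copy into a std::string once.
// Hypothetical wrapper; the buffer size is an assumed upper bound.
std::string dtoa_string(double value) {
    char buffer[32];
    dtoa(value, buffer);
    return std::string(buffer);
}
```

Whether the returned string allocates on the heap depends on the standard library's small-string optimization and the length of the output, which is why the benchmark measures the raw `char*` interface instead.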
[1] Loitsch, Florian. "Printing floating-point numbers quickly and accurately with integers." ACM Sigplan Notices 45.6 (2010): 233-243.