scanf for modern C++

scnlib

#include <scn/scn.h>
#include <cstdio>

int main() {
    int i;
    // Read an integer from stdin
    // with an accompanying message
    scn::prompt("What's your favorite number? ", "{}", i);
    printf("Oh, cool, %d!", i);
}

// Example result:
// What's your favorite number? 42
// Oh, cool, 42!

What is this?

scnlib is a modern C++ library for replacing scanf and std::istream. It is an attempt to move us ever so slightly closer to replacing iostreams and C stdio altogether. It's faster than iostream (see Benchmarks) and type-safe, unlike scanf. Think {fmt}, but in the other direction.

This library (or rather, the v2 version of it) is the reference implementation of the ISO C++ standards proposal P1729 "Text Parsing".

The master-branch of the repository is currently in maintenance-only mode, and is unlikely to receive any large updates. The dev-branch targets the next major release, v2.0, which is under active development, and is incompatible with v1.

Documentation

The documentation can be found online at https://v1.scnlib.dev.

To build the docs yourself, build the doc and doc-sphinx targets generated by CMake. The doc target requires Doxygen, and doc-sphinx requires Python 3.8, Sphinx and Breathe.

Examples

Reading a std::string

#include <scn/scn.h>
#include <iostream>
#include <string>

int main() {
    std::string word;
    auto result = scn::scan("Hello world", "{}", word);

    std::cout << word << '\n'; // Will output "Hello"
    std::cout << result.range_as_string() << '\n';  // Will output " world"
}

Reading multiple values

#include <scn/scn.h>
#include <string>

int main() {
    int i, j;
    auto result = scn::scan("123 456 foo", "{} {}", i, j);
    // result == true
    // i == 123
    // j == 456

    // Continue scanning from where the previous call left off
    std::string str;
    result = scn::scan(result.range(), "{}", str);
    // result == true
    // str == "foo"
}

Using the tuple-return API

#include <scn/scn.h>
#include <scn/tuple_return.h>

int main() {
    auto [r, i] = scn::scan_tuple<int>("42", "{}");
    // r is a result object, contextually convertible to `bool`
    // i == 42
}

Error handling

#include <scn/scn.h>
#include <string_view>
#include <iostream>

int main() {
    int i;
    // "foo" is not a valid integer
    auto result = scn::scan("foo", "{}", i);
    if (!result) {
        // i is not touched (still uninitialized)
        // result.range() == "foo" (range not advanced)
        std::cout << "Integer parsing failed with message: " << result.error().msg() << '\n';
    }
}
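
The contract shown above (a result object that converts to `bool`, a value left untouched on failure, and a range that is not advanced) can be illustrated without scnlib. The following is a minimal sketch of that pattern built on `std::from_chars` (C++17). The names `scan_int_result`, `scan_int`, and `scan_int_example` are hypothetical and are not part of scnlib's API, and unlike `scn::scan`, this sketch does not skip leading whitespace.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical result type, for illustration only (not scnlib's real type).
struct scan_int_result {
    std::errc ec{};         // std::errc{} means success
    std::string_view rest;  // unparsed remainder of the input

    explicit operator bool() const { return ec == std::errc{}; }
};

// Parses one int from the front of `input`. On failure, `out` is left
// untouched and `rest` equals the original input (range not advanced).
inline scan_int_result scan_int(std::string_view input, int& out) {
    int tmp = 0;
    const char* first = input.data();
    const char* last = first + input.size();
    auto [ptr, ec] = std::from_chars(first, last, tmp);
    if (ec != std::errc{}) {
        return {ec, input};
    }
    out = tmp;
    return {std::errc{}, input.substr(static_cast<std::size_t>(ptr - first))};
}

// Usage, with the expected behavior spelled out as assertions.
inline void scan_int_example() {
    int i = -1;
    auto bad = scan_int("foo", i);
    assert(!bad && i == -1 && bad.rest == "foo");

    auto ok = scan_int("42 rest", i);
    assert(ok && i == 42 && ok.rest == " rest");
}
```

Returning the remainder alongside the error is what makes it possible to retry or continue scanning from the same position without exceptions.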

Features

  • Blazing-fast parsing of values (see benchmarks)
  • Modern C++ interface, featuring type safety (variadic templates), convenience (ranges) and customizability
    • No << chevron >> hell
    • Requires C++11 or newer
  • "{python}"-like format string syntax
  • Optionally header only
  • Minimal code size increase (see benchmarks)
  • No exceptions (supports building with -fno-exceptions -fno-rtti with minimal loss of functionality)
    • Localization requires exceptions, because of how std::locale is designed
  • Unicode-aware
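
To illustrate the type-safety point in the list above: with variadic templates, the type of each destination is taken from the argument itself, so there is no format specifier that can disagree with it (as `%d` versus `%ld` can in scanf). The sketch below is an illustration under stated assumptions, not scnlib's implementation. `read_one`, `scan_values`, and `scan_values_example` are hypothetical names, only integer types are handled, values are assumed to be separated by spaces, and C++17 is required for `std::from_chars` and fold expressions (scnlib itself only requires C++11).

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical helper: skip leading spaces, then parse one integer of type T.
// On failure, `out` and `input` are left as they were after space skipping.
template <typename T>
bool read_one(std::string_view& input, T& out) {
    while (!input.empty() && input.front() == ' ') {
        input.remove_prefix(1);
    }
    T tmp{};
    const char* first = input.data();
    auto [ptr, ec] = std::from_chars(first, first + input.size(), tmp);
    if (ec != std::errc{}) {
        return false;
    }
    out = tmp;
    input.remove_prefix(static_cast<std::size_t>(ptr - first));
    return true;
}

// Reads one value per argument, left to right, stopping at the first failure.
// Returns the number of values successfully read.
template <typename... Args>
std::size_t scan_values(std::string_view input, Args&... args) {
    std::size_t count = 0;
    bool ok = true;
    ((ok = ok && read_one(input, args), count += ok ? 1 : 0), ...);
    return count;
}

// Usage: the int and the long are each parsed with the right overload.
inline void scan_values_example() {
    int a = 0;
    long b = 0;
    std::size_t n = scan_values("123 456", a, b);
    assert(n == 2 && a == 123 && b == 456);
}
```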

Installing

scnlib uses CMake. If your project already uses CMake, integration is easy. First, clone, build, and install the library:

# Wherever you cloned scnlib to
$ mkdir build
$ cd build
$ cmake ..
$ make -j
$ make install

Then, in your project:

# Find scnlib package
find_package(scn CONFIG REQUIRED)

# Target which you'd like to use scnlib
# scn::scn-header-only to use the header-only version
add_executable(my_program ...)
target_link_libraries(my_program scn::scn)

Alternatively, if you have scnlib downloaded somewhere, or maybe even bundled inside your project (like a git submodule), you can use add_subdirectory:

add_subdirectory(path/to/scnlib)

# like above
add_executable(my_program ...)
target_link_libraries(my_program scn::scn)

See docs for usage without CMake.

Compiler support

Every commit in master is tested with

  • gcc 7 and newer (up to v13)
  • clang 6.0 and newer (up to v17)
  • Visual Studio 2019 and 2022

with very extreme warning flags (see cmake/flags.cmake) and with multiple build configurations for each compiler.

The following environments are also tested:

  • 32-bit and 64-bit builds on Windows
  • AppleClang on macOS 11 (Big Sur) and 12 (Monterey)
  • clang-cl with VS 2019 and 2022
  • MinGW
  • GCC on armv6, armv7, aarch64, riscv64, s390x, and ppc64le
  • Visual Studio 2022, cross compiling to arm64

Other compilers and compiler versions may work, but it is not guaranteed. If your compiler does not work, it may be a bug in the library. However, support will not be provided for:

  • GCC 4.9 (or earlier): C++11 support is too buggy
  • VS 2015 (or earlier): unable to handle templates

Benchmarks

Run-time performance

These benchmarks were run on an Ubuntu 21.10 machine running kernel version 5.13.0-30, with an Intel Core i7-8565U processor, and compiled with gcc version 11.2.0, with -O3 -DNDEBUG -march=native. The source code for the benchmarks can be seen in the benchmark directory.

You can run the benchmarks yourself by enabling SCN_BENCHMARKS. SCN_BENCHMARKS is enabled by default if scn is the root CMake project, and disabled otherwise.

$ cd build
$ cmake -DCMAKE_BUILD_TYPE=Release -DSCN_BENCHMARKS=ON -DSCN_USE_NATIVE_ARCH=ON -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON ..
$ make -j
# choose benchmark to run in ./benchmark/runtime/*/bench-*
$ ./benchmark/runtime/integer/bench-int

Times are in nanoseconds of CPU time. Lower is better.

Integer parsing (int)

| Test | std::stringstream | sscanf | scn::scan | scn::scan_default |
| --- | --- | --- | --- | --- |
| Test 1 | 344 | 127 | 65.1 | 55.3 |
| Test 2 | 81.2 | 651 | 68.3 | 64.8 |

Floating-point parsing (double)

| Test | std::stringstream | sscanf | scn::scan | scn::scan_default |
| --- | --- | --- | --- | --- |
| Test 1 | 612 | 211 | 69.5 | 69.1 |
| Test 2 | 200 | 510 | 83.4 | 75.3 |

Reading random whitespace-separated strings

| Character type | std::stringstream | scn::scan | scn::scan and string_view |
| --- | --- | --- | --- |
| char | 63.3 | 56.9 | 51.0 |
| wchar_t | 157 | 58.8 | 62.8 |

Conclusions

scn::scan is faster than the standard library offerings in all cases, sometimes over 8x faster.

Using scn::scan_default can sometimes have a slight performance benefit over scn::scan.
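
As a worked check of the "over 8x" figure, the speedup is simply the baseline time divided by the scn::scan time for the same test. The sketch below recomputes a few ratios at compile time from the numbers in the tables above. The helper name `speedup` is hypothetical and the inputs are copied from the tables, not measured by this code.

```cpp
// Speedup of `candidate` relative to `baseline`; both are times in
// nanoseconds, so a result above 1.0 means the candidate is faster.
constexpr double speedup(double baseline_ns, double candidate_ns) {
    return baseline_ns / candidate_ns;
}

// int, Test 2: sscanf (651) vs. scn::scan (68.3)
static_assert(speedup(651.0, 68.3) > 8.0, "sscanf, int, Test 2");

// double, Test 1: std::stringstream (612) vs. scn::scan (69.5)
static_assert(speedup(612.0, 69.5) > 8.0, "stringstream, double, Test 1");

// int, Test 2: std::stringstream (81.2) vs. scn::scan (68.3), a much smaller gap
static_assert(speedup(81.2, 68.3) > 1.0 && speedup(81.2, 68.3) < 1.5,
              "stringstream, int, Test 2");
```

The gap varies widely between rows, so the "over 8x" figure describes the best cases rather than the typical one.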

Test 1 vs. Test 2

In the above comparisons:

  • "Test 1" refers to parsing a single value from a string which only contains the string representation for that value. The time used for constructing parser state is included. For example, the source string could be "123". In this case, a parser is constructed, and a value (123) is parsed. This test is called "single" in the benchmark sources.
  • "Test 2" refers to the average time of parsing a value from a string containing multiple string representations separated by spaces. The time used for constructing parser state is not included. For example, the source string could be "123 456". In this case, a parser is constructed before the timer is started. Then, a single value is read from the source, and the source is advanced to the start of the next value. The time it took to parse a single value is averaged out. This test is called "repeated" in the benchmark sources.
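
The difference between the two measurements can be sketched as follows. This is not the code from the benchmark directory: it is a minimal illustration that uses `std::from_chars` as a stand-in parser and wall-clock time from `std::chrono::steady_clock` rather than CPU time. The names `timing`, `bench_single`, and `bench_repeated` are hypothetical.

```cpp
#include <cassert>
#include <charconv>
#include <chrono>
#include <cstddef>
#include <string_view>
#include <system_error>

using bench_clock = std::chrono::steady_clock;

struct timing {
    double ns_per_value;  // average wall-clock nanoseconds per parsed value
    long long checksum;   // sum of parsed values, keeps the work observable
};

// "Test 1" / "single": every iteration starts from a fresh source holding one
// value, so the per-iteration setup is inside the timed region.
inline timing bench_single(std::string_view source, int iterations) {
    assert(iterations > 0);
    long long sum = 0;
    auto start = bench_clock::now();
    for (int n = 0; n < iterations; ++n) {
        std::string_view view = source;  // stand-in for constructing parser state
        int value = 0;
        std::from_chars(view.data(), view.data() + view.size(), value);
        sum += value;
    }
    auto stop = bench_clock::now();
    double ns = std::chrono::duration<double, std::nano>(stop - start).count();
    return {ns / iterations, sum};
}

// "Test 2" / "repeated": state is set up before the timer starts, values are
// read one after another from the same source, and the time is averaged.
inline timing bench_repeated(std::string_view source) {
    const char* cur = source.data();  // set up before the timer starts
    const char* end = cur + source.size();
    long long sum = 0;
    std::size_t count = 0;
    auto start = bench_clock::now();
    while (cur != end) {
        while (cur != end && *cur == ' ') ++cur;  // advance to the next value
        int value = 0;
        auto [ptr, ec] = std::from_chars(cur, end, value);
        if (ec != std::errc{}) break;
        sum += value;
        ++count;
        cur = ptr;
    }
    auto stop = bench_clock::now();
    double ns = std::chrono::duration<double, std::nano>(stop - start).count();
    return {count != 0 ? ns / static_cast<double>(count) : 0.0, sum};
}
```

For example, `bench_single("123", 1000)` times 1000 independent parses of one value, while `bench_repeated("123 456 789")` times three consecutive parses from one source and reports the average.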

Executable size

The executable size benchmarks test generated code bloat for nontrivial projects. They generate 25 translation units and read values from stdin five times to simulate a medium-sized project. The resulting executable size is shown in the following tables. The "stripped size" metric shows the size of the executable after running strip.

The code was compiled on Ubuntu 21.10 with g++ 11.2.0. scnlib is linked dynamically to level out the playing field compared to already dynamically linked libc and libstdc++. See the directory benchmark/bloat for more information, e.g. templates for each TU.

To run these tests yourself:

$ cd build
# For Debug
$ cmake -DCMAKE_BUILD_TYPE=Debug -DSCN_BUILD_BLOAT=ON -DSCN_BUILD_BUILDTIME=OFF -DSCN_TESTS=OFF -DSCN_EXAMPLES=OFF -DBUILD_SHARED_LIBS=ON -DSCN_INSTALL=OFF ..
# For Release
$ cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON -DSCN_BUILD_BLOAT=ON -DSCN_BUILD_BUILDTIME=OFF -DSCN_TESTS=OFF -DSCN_EXAMPLES=OFF -DBUILD_SHARED_LIBS=ON -DSCN_INSTALL=OFF ..
# For Minimized Release
$ cmake -DCMAKE_BUILD_TYPE=MinSizeRel -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON -DSCN_BUILD_BLOAT=ON -DSCN_BUILD_BUILDTIME=OFF -DSCN_TESTS=OFF -DSCN_EXAMPLES=OFF -DBUILD_SHARED_LIBS=ON -DSCN_INSTALL=OFF ..

$ make -j
$ ./benchmark/bloat/run-bloat-tests.py ./benchmark/bloat

Sizes are in kibibytes (KiB). Lower is better.

Minimized build (-Os -DNDEBUG)

| Method | Executable size | Stripped size |
| --- | --- | --- |
| empty | 15.4 | 14.0 |
| std::scanf | 17.0 | 14.2 |
| std::istream | 18.6 | 14.2 |
| scn::input | 18.4 | 14.2 |
| scn::input (header-only) | 120 | 94.3 |
| scn::scan_value | 18.1 | 14.2 |
| scn::scan_value (header-only) | 100 | 78.3 |

Release build (-O3 -DNDEBUG)

| Method | Executable size | Stripped size |
| --- | --- | --- |
| empty | 15.4 | 14.0 |
| std::scanf | 17.0 | 14.2 |
| std::istream | 18.6 | 14.2 |
| scn::input | 18.2 | 14.2 |
| scn::input (header-only) | 161 | 138 |
| scn::scan_value | 18.6 | 14.2 |
| scn::scan_value (header-only) | 124 | 106 |

Debug build (-g)

| Method | Executable size | Stripped size |
| --- | --- | --- |
| empty | 27.5 | 14.0 |
| std::scanf | 605 | 22.2 |
| std::istream | 651 | 26.2 |
| scn::input | 1633 | 94.3 |
| scn::input (header-only) | 10533 | 1010 |
| scn::scan_value | 1765 | 90.3 |
| scn::scan_value (header-only) | 9289 | 698 |

Conclusions

When using optimizing build options, scnlib provides equal binary size to <iostream>, and a ~10% increase compared to scanf. If using strip, these differences go away.

In Debug mode, scnlib is ~3x bigger compared to <iostream> and scanf.

Header-only mode makes executable size ~6-7x bigger.

Build time

This test measures the time it takes to compile a binary when using different libraries. Note that the time it takes to compile the library itself is not taken into account (which would be an unfair measurement against precompiled standard libraries).

These tests were run on an Ubuntu 21.10 machine with an i7-8565U and 40 GB of RAM, using GCC 11.2.0. The compiler flags for a debug build were -g, and -O3 -DNDEBUG for a release build.

To run these tests yourself, enable CMake flag SCN_BUILD_BUILDTIME. In order for these tests to work, c++ must point to a gcc-compatible C++ compiler binary, and a POSIX-compatible /usr/bin/time must be present.

$ cd build
$ cmake -DSCN_BUILD_BUILDTIME=ON ..
$ make -j
$ ./benchmark/buildtime/run-buildtime-tests.sh

Time is in seconds of CPU time (user time + sys/kernel time). Lower is better.

| Method | Debug | Release |
| --- | --- | --- |
| empty | 0.07 | 0.03 |
| scanf | 0.20 | 0.19 |
| std::istream / std::cin | 0.26 | 0.24 |
| scn::input | 0.55 | 0.54 |
| scn::input (header only) | 1.88 | 3.69 |

Memory consumption

Memory is in mebibytes (MiB). Lower is better.

| Method | Debug | Release |
| --- | --- | --- |
| empty | 17.4 | 20.3 |
| scanf | 49.1 | 49.7 |
| std::istream / std::cin | 60.8 | 60.8 |
| scn::input | 96.0 | 92.7 |
| scn::input (header only) | 217 | 247 |

Conclusions

scnlib takes about 2x longer to compile compared to <iostream>, and uses about 50-60% more memory (92.7-96.0 MiB vs. 60.8 MiB).

Header-only mode can make compilation up to 7x slower, and use up to 3x as much memory.

Acknowledgements

The contents of this library are heavily influenced by {fmt} and its derivative works.
https://github.com/fmtlib/fmt

The bundled ranges implementation found in this library is based on NanoRange:
https://github.com/tcbrindle/NanoRange

The default floating-point parsing algorithm used by this library is implemented by fast_float:
https://github.com/fastfloat/fast_float

The Unicode-related parts of this library are based on utfcpp:
https://github.com/nemtrif/utfcpp

The design of this library is also inspired by the Python parse library:
https://github.com/r1chardj0n3s/parse

License

scnlib is licensed under the Apache License, version 2.0.
Copyright (c) 2017 Elias Kosunen
See LICENSE for further details.

See the directory licenses/ for third-party licensing information.