Structurator is a (non-single-header) library that makes it easy to map structured file formats to classes. It requires C++17 and has been tested on compliant versions of GCC, Clang and MSVC. Parsing is fast because the library does not build any intermediate dynamic structures first.
Features:
- Arbitrary nesting of objects without extra code
- Choosing alternative types based on a key's value
- Default values, re-naming and aliases for members
- Validation like range-checking
- Pre-definitions for STL-Containers
- Pre-defined error messages, but you can write your own too
Supported formats:
- JSON
- Your own
All content excluding the Catch2-source is licensed under the BSD-License.
All examples can be found within examples/.
For every user-defined class, the macro stc_declare_class must be invoked, either outside the class (within the same namespace or the global namespace) or inside the class. The latter has the advantage of accessing private members; the former lets you add support for external classes. In its most basic form, you pass the class's name and its members. Then obtain an input source that implements doc_input and use from_input to consume the class. The following complete example reads an object from JSON.
#include <string>
#include <memory>
#include <cassert>
#include <optional>
#include <string_view>
#include <structurator/json_input.hpp> //stc::json::input
#include <structurator/object_mapper.hpp> //stc::from_input
struct my_class
{
int a;
std::string b;
};
stc_declare_class(my_class, a, b); //this is important
int main()
{
std::string_view json_text = R"(
{
"a": 1,
"b": "text"
}
)";
//make input parser (see example 4 for error handling)
auto on_parse_error = [](const stc::json::parse_error&) {};
std::unique_ptr<stc::doc_input> input = stc::json::input(json_text, on_parse_error);
//commence the consuming
auto on_consume_error = [](const stc::doc_error&) {};
std::optional<my_class> my_object = stc::from_input<my_class>(*input, on_consume_error);
assert(my_object.has_value());
assert(my_object->a == 1);
assert(my_object->b == "text");
}
For now, the inputs don't accept (file) streams, only string_views. If you want to parse large files, consider using memory-mapped files, for example with mio. For moderately sized files, use read_file("path/to/file") from input_utilities.hpp to read everything from an input stream.
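If you prefer not to pull in input_utilities.hpp, reading a whole file into a std::string takes only a few lines of standard C++. The following is a minimal sketch and not the library's read_file; the name read_whole_file is hypothetical, and the real helper may differ in signature and error handling.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <optional>
#include <string>

// Hypothetical stand-in for illustration; NOT the library's read_file.
// Returns the file's bytes, or std::nullopt if the file cannot be opened.
std::optional<std::string> read_whole_file(const std::string& path)
{
    std::ifstream file(path, std::ios::binary);
    if (!file)
        return std::nullopt;
    // Copy every byte from the stream buffer into the string.
    return std::string(std::istreambuf_iterator<char>(file),
                       std::istreambuf_iterator<char>());
}
```

Whichever way the text is loaded, the string has to stay alive for as long as a string_view of it is in use.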
Reading behaviour can be altered with flags (member_flags) and attributes (member_*). To specify them, make the member declaration a pair of the member's name and its options. Combine multiple flags or attributes with the | operator.
struct my_class1
{
bool flag = false;
std::map<std::string, int> numbers;
};
struct my_class2
{
std::vector<my_class1> objects;
std::vector<std::string> options;
};
stc_declare_class(my_class1,
//key may be missing, key may be named "f"
(flag, stc::member_flag::maybe_default | stc::member_alias("f")),
//receives unknown keys
(numbers, stc::member_flag::additional_keys)
);
stc_declare_class(my_class2,
objects,
//collects multiple occurrences, keys must be named "opt"
(options, stc::member_flag::multiple | stc::member_short("opt"))
);
//...
std::string_view my_class2_json = R"(
{
"objects": [ { "f": true }, { "one": 1, "two": 2, "three": 3 } ],
"opt": "medium",
"opt": "with salami"
}
)";
//...
In many documents, the concrete data type to be read is specified dynamically in advance by a special key. The library lets you map values to types, which are then chosen for reading the actual data. These values can be of any type; usually they are integers or strings. There are two modes to choose from when using the member_alts attribute:
- alt_mode::nest: The data is put under a separate key.
- alt_mode::no_nesting: Use the remaining keys. Only useful for classes.
The type of the member for which alternatives are set can be of any compatible type; in the simplest case it is either std::variant or std::unique_ptr with a polymorphic class. For the latter, don't forget that the alternative types must be std::unique_ptrs too!
using namespace stc;
struct write_entry
{
std::string new_content;
};
struct delete_entry
{
bool immediatley;
};
struct log_entry
{
std::string file_name;
std::string author;
std::uint64_t timestamp;
std::variant<write_entry, delete_entry> payload;
};
stc_declare_class(write_entry, new_content);
stc_declare_class(delete_entry, immediatley);
//MSVC's IntelliSense may color this red for some reason
stc_declare_class(log_entry, file_name, author, timestamp, ( payload,
member_alts("type", alt_mode::nest, //key "type" decides how to proceed
alt<write_entry>("write"), alt<delete_entry>("delete")) )
);
//...
// note that "type" must appear before "payload", but it need not be immediately before it
std::string_view log_entry_json = R"(
{
"file_name": "README.md", "author": "Ben", "timestamp": 1234,
"type": "write",
"payload": { "new_content": "hello there" }
}
)";
/* with alt_mode::no_nesting it should look like this:
{
"file_name": "README.md", "author": "Ben", "timestamp": 1234,
"type": "write",
"new_content": "hello there"
}
*/
//...
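The requirement that the deciding key comes before the data is easier to see in a standard-library-only sketch of the idea: the value of the key selects which std::variant alternative to construct, and only then can the data be read into it. The types write_op and delete_op and the function make_alternative are hypothetical names for illustration; this is not how the library implements member_alts.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>
#include <variant>

// Hypothetical payload types, used only for this sketch.
struct write_op  { std::string content; };
struct delete_op { bool now = false; };
using payload_t = std::variant<write_op, delete_op>;

// Maps the value of the deciding key to an empty alternative of the
// matching type. An unknown value yields std::nullopt.
std::optional<payload_t> make_alternative(std::string_view type)
{
    if (type == "write")
        return payload_t{write_op{}};
    if (type == "delete")
        return payload_t{delete_op{}};
    return std::nullopt;
}
```

A reader built this way cannot interpret the payload until it has seen the deciding key, which is consistent with the ordering note in the example above.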
The parsers provide error information and, after an error, try to continue in order to detect further syntax errors. If any error occurs, from_input returns an empty optional. All error objects contain the exact location and a cause in the form of an enumeration value. If you don't want to spend time defining your own error messages, pre-defined ones are available.
#define STC_DEFINE_MESSAGES //define before #includes for built-in strings for enums
#include <structurator/simple_errors.hpp> //stc::error_string
//...
std::string_view json_text = u8R"(
[
{ "name": "Rölf", "height": 180, ???? },
{ "name": "Bert", "height" 170 }
]
)";
auto on_parse_error = [&](const stc::json::parse_error &err)
{
std::cerr << stc::error_string(json_text, err);
};
stc::json::input(json_text, on_parse_error);
//...
/* Output:
Line 3: Expected '"' here.
{ "name": "Rölf", "height": 180, ???? },
---------------------------------------------^
Line 4: Expected ':' here.
{ "name": "Bert", "height" 170 }
---------------------------------------^
*/
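If you write your own messages instead, you will probably need to turn an error location into a line number. The sketch below assumes the location is available as a byte index into the parsed text; the struct text_position and the function locate are hypothetical helpers, not part of the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper types for illustration only.
struct text_position
{
    std::size_t line;       // 1-based line number
    std::size_t line_start; // byte index of the first byte of that line
};

// Finds the line that contains byte_index by counting the newlines before it.
text_position locate(std::string_view text, std::size_t byte_index)
{
    byte_index = std::min(byte_index, text.size());
    text_position pos{1, 0};
    for (std::size_t i = 0; i < byte_index; ++i)
    {
        if (text[i] == '\n')
        {
            ++pos.line;
            pos.line_start = i + 1;
        }
    }
    return pos;
}
```

The line's text then starts at line_start, and byte_index - line_start is the byte offset within that line at which a caret could be placed.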
When reading objects, type-errors, out-of-bounds and other invalid values can occur. There are some pre-defined classes that wrap types and check their values. Note that multiple validation errors within the same document are not reported (yet).
#define STC_DEFINE_MESSAGES
#include <structurator/validation.hpp>
#include <structurator/size_bounded.hpp>
#include <structurator/range_bounded.hpp>
#include <structurator/simple_errors.hpp>
//...
struct my_validator
{
std::optional<stc::doc_error::kind> operator()(int i) const
{
if(i > 10)
return stc::doc_error::kind::value_too_big;
return std::nullopt;
}
};
struct person
{
stc::size_bounded<std::string, 1, 30> name; //at least one, at most 30 chars
stc::range_bounded<unsigned int, 30, 300> height; //integer between 30 and 300 inclusive
stc::validated_type<int, my_validator> custom; //customized
stc_declare_class(person, name, height, custom);
};
//...
auto on_consume_error = [&](const stc::doc_error &err)
{
std::cerr << stc::error_string(json_text, err);
};
//...
/* Possible output when calling stc::from_input:
Line 3: Value is too small.
{ "name": "Rölf", "height": 0, "custom": 9 }
----------------------------------------^
*/
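To make the idea behind such wrapper types concrete, here is a standard-library-only sketch of a range-checked integer. It is not the library's range_bounded; the names bounded_int and range_error are hypothetical, and the check follows the same convention as my_validator above, where std::nullopt means the value is acceptable.

```cpp
#include <cassert>
#include <optional>

// Hypothetical error enumeration for this sketch.
enum class range_error { too_small, too_big };

// Hypothetical wrapper: an int that is only valid within [Min, Max].
template <int Min, int Max>
struct bounded_int
{
    static_assert(Min <= Max, "the range must not be empty");

    int value = Min;

    // Returns std::nullopt when candidate lies within [Min, Max] inclusive.
    static std::optional<range_error> check(int candidate)
    {
        if (candidate < Min)
            return range_error::too_small;
        if (candidate > Max)
            return range_error::too_big;
        return std::nullopt;
    }
};
```

Because the bounds are template parameters, they are part of the member's type, so the class definition itself documents which values are acceptable.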
Adding new input sources is done by implementing doc_input from doc_input.hpp. The interface is fairly generic and must traverse the document depth-first.
In case you want your special class to be readable without using the stc_declare_class macro, write a function consume() and put it next to your class, so it can be found using argument-dependent lookup:
my_class consume(stc::type_wrap<my_class>, stc::doc_input::token_kind first_token, stc::doc_input &input, stc::doc_context &context)
{
//Use first_token and call input.next_token() to get more tokens.
//In case of errors, call context.error_handler and throw doc_consume_exception.
}
The type of context is doc_context by default when using from_input, but you may derive from it and pass your custom context around with from_input_with_context.
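The lookup mechanism that makes a free consume() function discoverable can be shown without the library. In the sketch below, everything in the namespaces sketch and app is hypothetical: an empty tag type carries the target type as a template argument, so an unqualified call also searches the namespace of that type.

```cpp
#include <cassert>
#include <string>

namespace sketch
{
    // Empty tag type; its template argument adds T's namespace to the lookup.
    template <class T>
    struct type_wrap {};

    // Generic entry point. The unqualified call to consume is resolved
    // by argument-dependent lookup when the template is instantiated.
    template <class T>
    T read(const std::string& raw)
    {
        return consume(type_wrap<T>{}, raw);
    }
}

namespace app
{
    struct celsius { double degrees = 0.0; };

    // Lives next to the class, so lookup through type_wrap<celsius> finds it.
    inline celsius consume(sketch::type_wrap<celsius>, const std::string& raw)
    {
        return celsius{std::stod(raw)};
    }
}
```

With these definitions, sketch::read<app::celsius>("21.5") reaches app::consume even though sketch knows nothing about app, which is why placing the function next to the class is sufficient.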
These are included with object_mapper.hpp:
- In native_consumers.hpp:
  - bool from boolean (JSON true/false)
  - (unsigned) short, int, long, long long + signed char and unsigned char from numbers without fractional digits
  - float, double, long double from numbers with possibly fractional digits
  - char from a string with exactly one UTF-8 code-unit
  - stc::ref_string from string
- In stdlib_consumers.hpp:
  - std::string from string
  - std::optional<T> from either T or null (JSON null)
  - std::unique_ptr<T> from T
  - std::array<T, N> from a list of exactly N elements of type T
  - std::vector<T> from a list of zero or more T
  - std::map<K, V> from key-value mapping of V with K being constructible from stc::ref_string
- In object_consumer.hpp:
  - Classes T for which the macro stc_declare_class was used. This macro basically just defines a function or method stc_class_info that returns stc::class_info, which then can be used to inspect T.
These must be included manually:
- In validation.hpp, range_bounded.hpp and size_bounded.hpp:
  - stc::validated_type<T, Validator> as T
  - stc::range_bounded<T, Min, Max> as T
  - stc::size_bounded<T, Min, Max> as a container-like T
The type stc::ref_string is a simple read-only class that contains either just a view of a non-owned string or an allocated, owned string. It is useful for passing strings around without unnecessarily copying them.
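A class with this view-or-owned behaviour can be sketched with std::variant. The following is only an illustration of the concept; the class view_or_owned and its members are hypothetical and say nothing about how stc::ref_string is actually implemented or what its interface looks like.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>
#include <variant>

// Hypothetical illustration of a read-only string that either borrows or owns.
class view_or_owned
{
public:
    // Non-owning: the caller must keep the characters alive.
    explicit view_or_owned(std::string_view borrowed)
        : storage_(std::in_place_type<std::string_view>, borrowed) {}

    // Owning: takes over an already allocated string without copying it.
    explicit view_or_owned(std::string&& owned)
        : storage_(std::in_place_type<std::string>, std::move(owned)) {}

    bool owns() const
    {
        return std::holds_alternative<std::string>(storage_);
    }

    // Uniform read access, regardless of who owns the characters.
    std::string_view view() const
    {
        if (const auto* owned = std::get_if<std::string>(&storage_))
            return *owned;
        return std::get<std::string_view>(storage_);
    }

private:
    std::variant<std::string_view, std::string> storage_;
};
```

A plausible use, stated here as an assumption, is that text which can be taken verbatim from the input is passed as a view, while text that had to be rewritten (for example because of escape sequences) is passed as an owned string.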
- stc_declare_class accepts up to 16 members.
- Classes must be default-constructible.
- Custom validation of entire objects is possible with validated_type, but there is no way of getting location information for single members.
- For now, sub-classes must also present all super-members to stc_declare_class.
- Only one validation error is reported.
Using CMake, add the library as a subfolder
add_subdirectory(path/to/structurator) # the directory that contains the library's CMakeLists.txt
and link the library to your target:
target_link_libraries(your-project-target PRIVATE structurator)
You can also build and install the library using CMake if you want to permanently share it with other projects on your machine:
mkdir build
cd build
cmake -DSTRUCTURATOR_INSTALL=ON ..
make install
g++ examples/example1.cpp --std=c++17 -lstructurator
In case you don't use CMake, you can just copy the files from src/ and compile them with your project.
There are some more complex tests within tests/ that use Catch2, of which a copy is already in extlib/. Unless the CMakeLists.txt of this library is the main project, you have to enable tests by setting the CMake variable STRUCTURATOR_TESTS to ON.
Similarly, examples are built when STRUCTURATOR_EXAMPLES is ON.
Use doxygen to build source documentation for the library.
The library uses UTF-8 everywhere. There are a handful of useful utilities in src/utf8.hpp for your own usage too. However, Unicode parsing is very complex. The provided simple functions don't, for example, correctly ignore zero-width code-points when counting the number of characters, but they are good enough for Latin texts. If you insist on correct handling, you can use ICU and extract information for error messages yourself using the byte index of the provided error object.
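The limitation is easy to reproduce with the simplest possible counting scheme, which treats every UTF-8 lead byte as one character. The function below is a hypothetical sketch written for this explanation and is not taken from src/utf8.hpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical sketch: counts code points by skipping UTF-8 continuation
// bytes, which always have the bit pattern 10xxxxxx.
std::size_t count_code_points(std::string_view utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8)
    {
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

This handles "Rölf" as four characters although it occupies five bytes, but a zero-width space (U+200B) is still counted as a character, so a caret placed by this count can end up one column too far to the right.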
- The library uses constexpr and templates extensively, so structure information declared with stc_declare_class is not built or evaluated dynamically.
- Documents are not parsed into separate data structures first.
- GCC prior to version 11, MSVC prior to version 19.24 and Clang don't support std::from_chars for floats, so std::strtof/d/ld is used, which is slower and might impact performance for documents with lots of floats.
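A fallback of this kind can be sketched as follows. This is an assumption about the general technique, not the library's code: the function parse_double is hypothetical, and the sketch uses the standard feature-test macro __cpp_lib_to_chars to decide whether the floating-point overloads of std::from_chars are available.

```cpp
#include <cassert>
#include <charconv>
#include <cstdlib>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical sketch: parses the whole of text as a double.
std::optional<double> parse_double(std::string_view text)
{
#if defined(__cpp_lib_to_chars)
    // Fast path: no allocation and no locale involved.
    double value = 0.0;
    const char* last = text.data() + text.size();
    auto [end, ec] = std::from_chars(text.data(), last, value);
    if (ec != std::errc() || end != last)
        return std::nullopt;
    return value;
#else
    // Fallback: std::strtod needs a null-terminated string, hence the copy.
    std::string copy(text);
    char* end = nullptr;
    double value = std::strtod(copy.c_str(), &end);
    if (end == copy.c_str() || *end != '\0')
        return std::nullopt;
    return value;
#endif
}
```

The two paths are not fully equivalent: std::strtod depends on the current C locale and accepts leading whitespace and a leading plus sign, so a real implementation would need to restrict its input before calling it.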