h5cpp-compiler

Source code transformation tool for H5CPP, the header-only library for the HDF5 data format

This source code transformation tool simplifies the otherwise time-consuming process of writing the shim code for HDF5 compound datatypes. It builds the AST of a given translation unit (TU) and identifies all POD datatypes referenced from H5CPP operators and functions. The result is seamless persistence, much like that of Python, Java, or other reflection-based languages.

The following excerpt shows the mechanism: the variable vec is marked by the h5::write operator. When the h5cpp tool is invoked, it builds the full AST of the translation unit, finds the referenced types, and then generates the HDF5 COMPOUND datatype descriptors in topological order. The generated file has include guards and is meant to be used with the H5CPP template library. POD struct types may be nested arbitrarily deep, embedded in C-style arrays, and referenced from STL containers. Currently only std::vector is supported; full container support is planned.

...
std::vector<sn::example::Record> vec 
    = h5::utils::get_test_data<sn::example::Record>(20);
// mark vec  with an h5:: operator and delegate 
// the details to h5cpp compiler
h5::write(fd, "orm/partial/vector one_shot", vec );
...

// some include files with complex POD types, embedded in arbitrary name space
namespace sn {
	namespace typecheck {
		struct Record { /*the types with direct mapping to HDF5*/
			char  _char; unsigned char _uchar; short _short; unsigned short _ushort; int _int; unsigned int _uint;
			long _long; unsigned long _ulong; long long int _llong; unsigned long long _ullong;
			float _float; double _double; long double _ldouble;
			bool _bool;
			// wide characters are not supported in HDF5
			// wchar_t _wchar; char16_t _wchar16; char32_t _wchar32;
		};
	}
	namespace other {
		struct Record {                    // POD struct with nested namespace
			MyUInt                    idx; // typedef type 
			MyUInt                     aa; // typedef type 
			double            field_02[3]; // const array mapped 
			typecheck::Record field_03[4]; //
		};
	}
	namespace example {
		struct Record {                    // POD struct with nested namespace
			MyUInt                    idx; // typedef type 
			float             field_02[7]; // const array mapped 
			sn::other::Record field_03[5]; // embedded Record
			sn::other::Record field_04[5]; // must be optimized out, same as previous
			other::Record  field_05[3][8]; // array of arrays 
		};
	}
	namespace not_supported_yet {
		// NON POD: not supported in phase 1
		// C++ Class -> PODstruct -> persistence[ HDF5 | ??? ] -> PODstruct -> C++ Class 
		struct Container {
			double                            idx; // 
			std::string                  field_05; // c++ object makes it non-POD
			std::vector<example::Record> field_02; // ditto
		};
	}
	/* BEGIN IGNORED STRUCTS */
	// these structs are not referenced by h5::read | h5::write | h5::create operators,
	// hence the compiler should ignore them.
	struct IgnoredRecord {
		signed long int   idx;
		float        field_0n;
	};
	/* END IGNORED STRUCTS */
} // namespace sn

Install:

Only LLVM 6.0 is supported. To compile from source you need both the LLVM and Clang development packages installed:

sudo apt install llvm-6.0 llvm-6.0-dev libclang-6.0-dev  # 640MB space needed
make && make install                                     # compile the source code transformation tool

Optionally, you can then remove the development libraries and install only the runtime dependencies:

sudo apt purge llvm-6.0 llvm-6.0-dev libclang-6.0-dev    # remove development libraries
sudo apt install libllvm6.0 libclang-common-6.0-dev      # install runtime dependencies

Caveat:

All LLVM versions other than 6.0 currently fail or crash, including the clang++ toolchain. This is being investigated, and this notice will be removed once the issue is resolved.

Usage:

h5cpp your_translation_unit.cpp -- -v $(CXXFLAGS) -Dgenerated.h

This runs the compiler front end on the specified input and outputs the necessary HDF5 type descriptors, or an error message if something goes wrong.