Panacea

Demo

Package name: Portable ANalytical data Aggregation and Coordination for database Entry and Access (panacea)

This package is currently a concept. See also the ISC proposal 1.

Panacea is aimed at making data streams in e.g. analytical laboratory settings more transparent and easily accessible. This is needed as closed sourced vendor-supplied software for analytical instruments often act as black boxes, thereby inhibiting access to raw data and full-disclosure of critical processing and transformations.

Panacea would constitute at it most fundamental level a parser for text based data in poorly structured formats (e.g. non-tabular). It detects variable, value, and, optionally, units triplets.

Ultimately, panacea could help establish fully integrated laboratories with centralised data management. Hence this solution contributes to the FAIR [2] guiding principles for data, thereby stimulating innovation, and inclusiveness through open science.

Configure and build panacea

After copying the source files, start the build process of panacea as follows:

./configure

JSON for Modern C++ is needed to enable parsing of results into the JSON format. This can be achieved by cloning the nlohmann/json repository and performing a complete build. Alternatively one can just copy the header file json.hpp to your desired location and do the following during the build process:

./configure CXXFLAGS="-I/path/to/nlohmann/json.hpp"

Note, that the nlohmann/json.hpp is not a breaking build requirement for panacea. Results can alternatively be streamed to the terminal.

Installation

To install panacea use the install target of the makefile, like so:

make install

To customise the installation query the GNU make manual [3].

Basic usage

Currenlty, panacea only has four control options, wich can be modified with the following flags:

--data: for the input text file (e.g. test.txt).
--output: for the output JSON file (e.g. test.json)
--white: the relative white level (a value in between 0 and 1) to detect tables (defaults to 0.7)
--verbose: whether input filename and output results should be streamed to the terminal (defaults to 1)

Example

The source files contain an example text file (extdata/test.txt) to demonstrate the core functionality of panacea.

panacea --data "extdata/test.txt" --output. "extdata/test.json"

Input text file:

Output json file:
Parsed as a table for convenience

Variable	Value	Unit	Text field	Line number	Character number
foo	a,b,c,d		2	5,6,7,8,9	1,1,1,1
bar	5,6,7,8		2	5,6,7,8,9	7,7,7,7
baz	x,y,z,z		2	5,6,7,8,9	13,13,13,13
qux	1,2,3,4	s	2	5,6,7,8,9	19,19,19,19
quz	x,z,,		2	5,6,7,8,9	29,29,32,32
x	42e-3,42e-3		3	12,13,14	1,1
y	4.3e-02,4.3e-02		3	12,13,14	9,9
z	4.4e-01,4.4e-01		3	12,13,14	19,19
numeric	42		4	24	1
numeric	42	um	5	25	1
foo	42		6	26	1
bar	-41		7	26	11
baz	40		8	26	23
foo	42		9	27	1
bar	-41		10	27	11
baz	40		11	27	22
foo	42		12	28	1
bar	-41		13	28	11
baz	40		14	28	23
x	-12761	um	15	29	23
y	-13469	um	16	29	59
z	3709	um	17	29	73
x	1	um	18	30	21
y	2	um	19	30	50
z	3	um	20	30	59
x	4	um	21	30	89
x	-12761	um	22	31	42
y	-13469	um	23	31	97
z	3709	um	24	31	111
numeric	42		25	32	14
LoremIpsum	42	numeric	26	33	1
LoremIpsum	42	numeric	27	34	1
LoremIpsum	42	numeric	28	35	1

References

[2]: Findable, Accessible, Interoperable, and Reusable
[3]: https://www.gnu.org/software/make/manual