This repository contains a C++ library to validate on-disk representations of Bioconductor objects used in ArtifactDB instances. The idea is to provide a cross-language method for validating the files. This is not quite as useful as a library for reading the files, but it's better than nothing.
See the general comments that apply to all objects' on-disk representations.
Currently, takane provides validators for the following objects:
atomic_vector_list: 1.0
atomic_vector: 1.0
bam_file: 1.0
bcf_file: 1.0
bed_file: 1.0
bigbed_file: 1.0
bigwig_file: 1.0
bumpy_atomic_array: 1.0
bumpy_data_frame_array: 1.0
compressed_sparse_matrix: 1.0
data_frame_factor: 1.0
data_frame_list: 1.0
data_frame: 1.0
dense_array: 1.0
fasta_file: 1.0
fastq_file: 1.0
genomic_ranges_list: 1.0
genomic_ranges: 1.0
gff_file: 1.0
gmt_file: 1.0
multi_sample_dataset: 1.0
ranged_summarized_experiment: 1.0
sequence_information: 1.0
sequence_string_set: 1.0
simple_list: 1.0, 1.1
single_cell_experiment: 1.0
spatial_experiment: 1.0, 1.1, 1.2
string_factor: 1.0
summarized_experiment: 1.0
vcf_experiment: 1.0
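An application may want to check whether a given object type and version appears in the list above before attempting validation. The following is a minimal sketch of such a lookup. The table is transcribed by hand from the list, and the names listed_versions and is_listed are hypothetical: they are not part of takane, and this is not how takane itself dispatches to its validators.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hand-transcribed copy of the list of object types and versions.
// Illustrative only: takane does not expose or use this table.
inline const std::unordered_map<std::string, std::vector<std::string>>& listed_versions() {
    static const std::unordered_map<std::string, std::vector<std::string>> table{
        {"atomic_vector_list", {"1.0"}},
        {"atomic_vector", {"1.0"}},
        {"bam_file", {"1.0"}},
        {"bcf_file", {"1.0"}},
        {"bed_file", {"1.0"}},
        {"bigbed_file", {"1.0"}},
        {"bigwig_file", {"1.0"}},
        {"bumpy_atomic_array", {"1.0"}},
        {"bumpy_data_frame_array", {"1.0"}},
        {"compressed_sparse_matrix", {"1.0"}},
        {"data_frame_factor", {"1.0"}},
        {"data_frame_list", {"1.0"}},
        {"data_frame", {"1.0"}},
        {"dense_array", {"1.0"}},
        {"fasta_file", {"1.0"}},
        {"fastq_file", {"1.0"}},
        {"genomic_ranges_list", {"1.0"}},
        {"genomic_ranges", {"1.0"}},
        {"gff_file", {"1.0"}},
        {"gmt_file", {"1.0"}},
        {"multi_sample_dataset", {"1.0"}},
        {"ranged_summarized_experiment", {"1.0"}},
        {"sequence_information", {"1.0"}},
        {"sequence_string_set", {"1.0"}},
        {"simple_list", {"1.0", "1.1"}},
        {"single_cell_experiment", {"1.0"}},
        {"spatial_experiment", {"1.0", "1.1", "1.2"}},
        {"string_factor", {"1.0"}},
        {"summarized_experiment", {"1.0"}},
        {"vcf_experiment", {"1.0"}}
    };
    return table;
}

// Returns true if the (type, version) pair appears in the table above.
inline bool is_listed(const std::string& type, const std::string& version) {
    assert(!type.empty()); // an empty type name is a caller error
    const auto& table = listed_versions();
    const auto it = table.find(type);
    if (it == table.end()) {
        return false;
    }
    const auto& versions = it->second;
    return std::find(versions.begin(), versions.end(), version) != versions.end();
}
```

A hand-maintained table like this can drift out of date as new versions are added, so it is best treated as a convenience check and not as a substitute for running the validator.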
The takane::validate() function inspects the object's directory and validates its contents, throwing an error if the contents are not valid.
#include "takane/takane.hpp"
takane::validate(dir);
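Because validation failures are reported by throwing, callers will usually want to wrap the call in a try/catch block. The following is a minimal sketch of that pattern, not something provided by takane. ValidationResult, check_directory and stand_in_validate are hypothetical names, and the validator is passed in as a callable so that the example compiles without the library. The sketch also assumes that the thrown errors derive from std::exception.

```cpp
#include <cassert>
#include <exception>
#include <filesystem>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical result type; not part of takane.
struct ValidationResult {
    bool ok;
    std::string message; // empty when ok is true
};

// Runs a throwing validator on 'dir' and converts any exception into a result.
// In real use, 'validator' would be a callable that invokes takane::validate().
inline ValidationResult check_directory(
    const std::filesystem::path& dir,
    const std::function<void(const std::filesystem::path&)>& validator)
{
    assert(validator); // an empty std::function would be a programming error
    try {
        validator(dir);
        return { true, "" };
    } catch (const std::exception& e) {
        return { false, e.what() };
    } catch (...) {
        return { false, "unknown error" };
    }
}

// Stand-in for takane::validate(), used only to keep this sketch self-contained.
// It rejects any path that is not an existing directory.
inline void stand_in_validate(const std::filesystem::path& dir) {
    if (!std::filesystem::is_directory(dir)) {
        throw std::runtime_error("'" + dir.string() + "' is not a directory");
    }
}
```

With takane available, the stand-in would be replaced by a lambda that forwards its argument to takane::validate(), and the returned message could be logged or shown to the user.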
The idea is to bind to the takane library in application-specific frameworks, e.g., via R/Python's foreign function interfaces. This consistently enforces the format expectations for each object, regardless of how the saving was performed by each application. For example, we might use the alabaster framework to save Bioconductor objects to disk:
library(alabaster.base)
tmp <- tempfile()
df <- DataFrame(X=1:10, Y=letters[1:10])
saveObject(df, tmp)
validateObject(tmp) # calls takane::validate()
If the validation passes, we can be confident that the same object can be reconstructed in different frameworks, e.g., with dolomite packages in Python.
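Binding a throwing C++ function through a foreign function interface usually requires a small shim, since exceptions should not propagate across a C boundary. The following is a minimal sketch of one such shim, under stated assumptions: validate_c, copy_message and stand_in_validate are hypothetical names, the stand-in replaces takane::validate() so that the example compiles on its own, and this is not a description of how alabaster or dolomite actually bind to takane.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <exception>
#include <stdexcept>
#include <string>

// Stand-in for takane::validate() so that the sketch compiles on its own.
// It only rejects empty paths.
inline void stand_in_validate(const std::string& dir) {
    if (dir.empty()) {
        throw std::runtime_error("no directory supplied");
    }
}

// Copies 'msg' into a caller-owned buffer, truncating if necessary and always
// null-terminating when the buffer has any capacity.
inline void copy_message(const char* msg, char* buffer, std::size_t capacity) {
    assert(msg != nullptr);
    if (buffer == nullptr || capacity == 0) {
        return;
    }
    std::strncpy(buffer, msg, capacity - 1);
    buffer[capacity - 1] = '\0';
}

// Hypothetical C-compatible entry point: returns 0 on success and 1 on failure.
// All exceptions are caught here and reported through 'error_buffer'.
extern "C" int validate_c(const char* dir, char* error_buffer, std::size_t capacity) {
    try {
        stand_in_validate(dir == nullptr ? std::string() : std::string(dir));
        copy_message("", error_buffer, capacity);
        return 0;
    } catch (const std::exception& e) {
        copy_message(e.what(), error_buffer, capacity);
    } catch (...) {
        copy_message("unknown error", error_buffer, capacity);
    }
    return 1;
}
```

A function with this shape can be called from any language with a C-compatible FFI. Higher-level binding tools typically handle the exception translation themselves, in which case a hand-written shim like this is unnecessary.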
Check out the reference documentation for more details.
If you're using CMake, you just need to add something like this to your CMakeLists.txt:
include(FetchContent)
FetchContent_Declare(
takane
GIT_REPOSITORY https://github.com/ArtifactDB/takane
GIT_TAG master # or any version of interest
)
FetchContent_MakeAvailable(takane)
Then you can link to takane to make the headers available during compilation:
# For executables:
target_link_libraries(myexe takane)
# For libraries:
target_link_libraries(mylib INTERFACE takane)
You can install the library by cloning a suitable version of this repository and running the following commands:
mkdir build && cd build
cmake .. -DTAKANE_TESTS=OFF
cmake --build . --target install
Then you can use find_package() as usual:
find_package(artifactdb_takane CONFIG REQUIRED)
target_link_libraries(mylib INTERFACE artifactdb::takane)
If you're not using CMake, the simple approach is to just copy the files in the include/ subdirectory - either directly or with Git submodules - and include their path during compilation with, e.g., GCC's -I.
You will also need to link to the dependencies listed in the extern/CMakeLists.txt file, along with the HDF5 library.
This library is named after Takane Shijou, continuing my trend of naming C++ libraries after iDOLM@STER characters. Not really sure why I picked Takane but she's nice enough.