Save and load data in the HDF5 file format from Julia

HDF5.jl

HDF5 interface for the Julia language

HDF5 is a file format and library for storing and accessing data, commonly used for scientific data. HDF5 files can be created and read by numerous programming languages. This package provides an interface to the HDF5 library for the Julia language.

Changelog

Please see HISTORY.md for the changelog. Most changes come with deprecation warnings and thus may not be listed in that file.

Installation

julia> ]
pkg> add HDF5

Starting from Julia 1.3, the HDF5 binaries are by default downloaded using the HDF5_jll package.

To use system-provided HDF5 binaries instead, set the environment variable JULIA_HDF5_PATH to the top-level HDF5 installation directory, i.e. the library should be located in ${JULIA_HDF5_PATH}/lib. Then run Pkg.build("HDF5"). In particular, this is required if you need parallel HDF5 support, which is not provided by the HDF5_jll binaries.

For example, to use HDF5 (libhdf5-mpich-dev) with MPI using system libraries on Ubuntu 20.04, you would run:

$ sudo apt install mpich libhdf5-mpich-dev
$ export JULIA_HDF5_PATH=/usr/lib/x86_64-linux-gnu/hdf5/mpich/
$ export JULIA_MPI_BINARY=system

Then in Julia, run:

pkg> build
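After rebuilding, it can be useful to confirm that the library Julia actually loaded has MPI support. The following is a minimal sketch, assuming your installed HDF5.jl version provides `HDF5.has_parallel()`; the printed messages are illustrative only.

```julia
using HDF5

# HDF5.has_parallel() reports whether the loaded libhdf5 exposes the MPI file driver
if HDF5.has_parallel()
    println("Parallel (MPI) HDF5 support is available")
else
    println("Serial HDF5 only; check JULIA_HDF5_PATH and rebuild")
end
```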

Quickstart

Begin your code with

using HDF5

To write a variable to a file and read it back, one approach is to use the filename:

A = collect(reshape(1:120, 15, 8))
h5write("/tmp/test2.h5", "mygroup2/A", A)
data = h5read("/tmp/test2.h5", "mygroup2/A", (2:3:15, 3:5))

where the last line reads back just A[2:3:15, 3:5] from the dataset.
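As a quick check of the slicing behavior described above, the sketch below writes the same matrix and compares the partial read against ordinary array indexing. The scratch path from `tempname()` is an assumption made here to avoid overwriting an existing file.

```julia
using HDF5

# A 15×8 matrix holding 1:120 in column-major order
A = collect(reshape(1:120, 15, 8))

# Hypothetical scratch file; h5write creates the file and the intermediate group
fname = tempname() * ".h5"
h5write(fname, "mygroup2/A", A)

# Read only rows 2, 5, 8, 11, 14 and columns 3 to 5 from disk
data = h5read(fname, "mygroup2/A", (2:3:15, 3:5))

# The partial read should agree with indexing the in-memory array
println(size(data))
println(data == A[2:3:15, 3:5])
```

Only the requested hyperslab is read from the file, which matters when the dataset is too large to load in full.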

More fine-grained control can be obtained using functional syntax:

h5open("mydata.h5", "w") do file
    write(file, "A", A)  # alternatively, say "@write file A"
end

c = h5open("mydata.h5", "r") do file
    read(file, "A")
end

This allows you to add variables to an open HDF5 file as they are generated. You don't have to use the do syntax (file = h5open("mydata.h5", "w") works just fine), but the advantage of the do block is that it automatically closes the file (close(file)) for you, even if an error occurs.
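For comparison, here is a minimal sketch of the non-do form with explicit cleanup. The dataset names `x` and `n` and the scratch path are hypothetical, and `keys` on a file assumes a reasonably recent HDF5.jl release.

```julia
using HDF5

fname = tempname() * ".h5"   # hypothetical scratch path

# Open without a do block; we are now responsible for closing the file
file = h5open(fname, "w")
try
    # Add variables one at a time as they become available
    write(file, "x", [1.0, 2.0, 3.0])
    write(file, "n", 42)
finally
    close(file)   # runs even if one of the writes throws
end

# Reopen read-only, list the stored names, and read one dataset back
names_in_file, x = h5open(fname, "r") do f
    sort(collect(keys(f))), read(f, "x")
end
println(names_in_file)
```

The try/finally pair reproduces by hand what the do block provides automatically.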

Julia's high-level wrapper, providing a dictionary-like interface, may also be of interest:

using HDF5

h5open("test.h5", "w") do file
    g = create_group(file, "mygroup") # create a group
    g["dset1"] = 3.2                  # create a scalar dataset inside the group
    attributes(g)["Description"] = "This group contains only a single dataset" # an attribute
end
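Reading goes through the same dictionary-like indexing. The sketch below recreates the file above at a hypothetical scratch path and then reads the dataset, the attribute, and the group's member names back; it assumes the `attributes` and `keys` functions behave as in recent HDF5.jl releases.

```julia
using HDF5

fname = tempname() * ".h5"   # hypothetical scratch path

h5open(fname, "w") do file
    g = create_group(file, "mygroup")
    g["dset1"] = 3.2
    attributes(g)["Description"] = "This group contains only a single dataset"
end

value, desc, members = h5open(fname, "r") do file
    g = file["mygroup"]                       # index by name to get the group
    v = read(g["dset1"])                      # read the scalar dataset
    d = read(attributes(g)["Description"])    # read the attribute's value
    (v, d, collect(keys(g)))
end
println(members)
```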

Convenience functions for attributes attached to datasets are also provided:

A = Vector{Int}(1:10)
h5write("bar.h5", "foo", A)
h5writeattr("bar.h5", "foo", Dict("c"=>"value for metadata parameter c","d"=>"metadata d"))
h5readattr("bar.h5", "foo")
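The result of `h5readattr` can be used like an ordinary dictionary keyed by attribute name. A short sketch, using a hypothetical scratch path instead of bar.h5:

```julia
using HDF5

fname = tempname() * ".h5"   # hypothetical scratch path
h5write(fname, "foo", Vector{Int}(1:10))
h5writeattr(fname, "foo", Dict("c" => "value for metadata parameter c",
                               "d" => "metadata d"))

# Read all attributes attached to the dataset "foo"
attrs = h5readattr(fname, "foo")

println(attrs["c"])
println(sort(collect(keys(attrs))))
```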

Specific file formats

There is no conflict in having multiple modules (HDF5, JLD, and MAT) available simultaneously; the formatting of the file is determined by the open command.

Complete documentation

The HDF5 API is much more extensive than this brief introduction suggests. The package documentation covers it in more detail.

The test directory contains a number of test scripts that also demonstrate usage.

Credits

  • Konrad Hinsen initiated Julia's support for HDF5

  • Tim Holy and Simon Kornblith (co-maintainers and primary authors)

  • Tom Short contributed code and ideas to the dictionary-like interface

  • Blake Johnson made several improvements, such as support for iterating over attributes

  • Isaiah Norton and Elliot Saba improved installation on Windows and OSX

  • Steve Johnson contributed the do syntax and Blosc compression

  • Mike Nolta and Jameson Nash contributed code or suggestions for improving the handling of HDF5's constants

  • Thanks also to the users who have reported bugs and tested fixes