Linux x86_64 | Windows x86_64 | MacOs x86_64 | MacOs ARM64 |
---|---|---|---|
Linux x86_64 | Windows x86_64 | MacOs x86_64 |
---|---|---|
A NASA's CDF modern C++ library. This is not a C++ wrapper but a full C++ implementation. Why? CDF files are still used for space physics missions but few implementations are available. The main one is NASA's C implementation available here but it lacks multi-threads support (global shared state), has an old C interface and has a license which isn't compatible with most Linux distributions policy. There are also Java and Python implementations which are not usable in C++.
List of features and roadmap:
- CDF reading
- read files from cdf version 2.2 to 3.x
- read uncompressed file headers
- read uncompressed attributes
- read uncompressed variables
- read variable attributes
- loads cdf files from memory (std::vector or char*)
- handles both row and column major files
- read variables with nested VXRs
- read compressed files (GZip, RLE)
- read compressed variables (GZip, RLE)
- read UTF-8 encoded files
- read ISO 8859-1(Latin-1) encoded files (converts to UTF-8 on the fly)
- variables values lazy loading
- decode DEC's floating point encoding (Itanium, ALPHA and VAX)
- pad values
- CDF writing
- write uncompressed headers
- write uncompressed attributes
- write uncompressed variables
- write compressed variables
- write compressed files
- pad values
- General features
- uses libdeflate for faster GZip decompression
- highly optimized CDF reads (up to ~4GB/s read speed from disk)
- handle leap seconds
- Python wrappers
- Documentation
- Examples (see below)
- Benchmarks
If you want to understand how it works, how to use the code or what works, you may have to read tests.
python3 -m pip install --user pycdfpp
meson build
cd build
ninja
sudo ninja install
Or if youl want to build a Python wheel:
python -m build .
# resulting wheel will be located into dist folder
Basic example from a local file:
import pycdfpp
cdf = pycdfpp.load("some_cdf.cdf")
cdf_var_data = cdf["var_name"].values #builds a numpy view or a list of strings
attribute_name_first_value = cdf.attributes['attribute_name'][0]
Note that you can also load in memory files:
import pycdfpp
import requests
import matplotlib.pyplot as plt
tha_l2_fgm = pycdfpp.load(requests.get("https://spdf.gsfc.nasa.gov/pub/data/themis/tha/l2/fgm/2016/tha_l2_fgm_20160101_v01.cdf").content)
plt.plot(tha_l2_fgm["tha_fgl_gsm"])
plt.show()
Buffer protocol support:
import pycdfpp
import requests
import xarray as xr
import matplotlib.pyplot as plt
tha_l2_fgm = pycdfpp.load(requests.get("https://spdf.gsfc.nasa.gov/pub/data/themis/tha/l2/fgm/2016/tha_l2_fgm_20160101_v01.cdf").content)
xr.DataArray(tha_l2_fgm['tha_fgl_gsm'], dims=['time', 'components'], coords={'time':tha_l2_fgm['tha_fgl_time'].values, 'components':['x', 'y', 'z']}).plot.line(x='time')
plt.show()
# Works with matplotlib directly too
plt.plot(tha_l2_fgm['tha_fgl_time'], tha_l2_fgm['tha_fgl_gsm'])
plt.show()
Datetimes handling:
import pycdfpp
import os
# Due to an issue with pybind11 you have to force your timezone to UTC for
# datetime conversion (not necessary for numpy datetime64)
os.environ['TZ']='UTC'
mms2_fgm_srvy = pycdfpp.load("mms2_fgm_srvy_l2_20200201_v5.230.0.cdf")
# to convert any CDF variable holding any time type to python datetime:
epoch_dt = pycdfpp.to_datetime(mms2_fgm_srvy["Epoch"])
# same with numpy datetime64:
epoch_dt64 = pycdfpp.to_datetime64(mms2_fgm_srvy["Epoch"])
# note that using datetime64 is ~100x faster than datetime (~2ns/element on an average laptop)
Creating a basic CDF file:
import pycdfpp
import numpy as np
from datetime import datetime
cdf = pycdfpp.CDF()
cdf.add_attribute("some attribute", [[1,2,3], [datetime(2018,1,1), datetime(2018,1,2)], "hello\nworld"])
cdf.add_variable(f"some variable", values=np.ones((10),dtype=np.float64))
pycdfpp.save(cdf, "some_cdf.cdf")
#include "cdf-io/cdf-io.hpp"
#include <iostream>
std::ostream& operator<<(std::ostream& os, const cdf::Variable::shape_t& shape)
{
os << "(";
for (auto i = 0; i < static_cast<int>(std::size(shape)) - 1; i++)
os << shape[i] << ',';
if (std::size(shape) >= 1)
os << shape[std::size(shape) - 1];
os << ")";
return os;
}
int main(int argc, char** argv)
{
auto path = std::string(DATA_PATH) + "/a_cdf.cdf";
// cdf::io::load returns a optional<CDF>
if (const auto my_cdf = cdf::io::load(path); my_cdf)
{
std::cout << "Attribute list:" << std::endl;
for (const auto& [name, attribute] : my_cdf->attributes)
{
std::cout << "\t" << name << std::endl;
}
std::cout << "Variable list:" << std::endl;
for (const auto& [name, variable] : my_cdf->variables)
{
std::cout << "\t" << name << " shape:" << variable.shape() << std::endl;
}
return 0;
}
return -1;
}
- NRV variables shape, in order to expose a consistent shape, PyCDFpp exposes the reccord count as first dimension and thus its value will be either 0 or 1 (0 mean empty variable).