uproot (or μproot, for "micro-Python ROOT") is a demonstration of how little is needed to read data from a ROOT file. Only about a thousand lines of Python code can convert ROOT TTrees into Numpy arrays.
It is important to note that uproot is not maintained by the ROOT project team and it is not a fully featured ROOT replacement. Think of it as a file format library, analogous to h5py, parquet-python, or PyFITS. It just reads (and someday writes) files.
Post bug reports in the issues tab here, not on the ROOT forum!
uproot requires only Python and Numpy. Install it with
```bash
pip install uproot --user
```
The other packages listed below are merely recommended (and may be installed at any time in the future), as they unlock special features:
- Python 2.6, 2.7 or 3.4+ (required)
- Numpy 1.4+ (required)
- python-lzma (pip, conda) if you want to read ROOT files compressed with lzma and you're using Python 2 (lzma is part of Python 3's standard library)
- python-lz4 (pip, conda) if you want to read ROOT files compressed with lz4
- python-futures (pip, conda) if you want to read and/or decompress basket data in parallel and you're using Python 2 (futures is part of Python 3's standard library)
- pyxrootd (no pip; conda or source) if you want to access files with the XRootD (`root://`) protocol; see the sketch after this list. (Hint: if you install from source, you may have to set `PYTHONPATH` and `LD_LIBRARY_PATH`.)
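The XRootD case is worth a quick illustration. This is a minimal sketch, assuming pyxrootd is installed and that `uproot.open` dispatches on `root://` URLs; the server and path below are placeholders, not a real dataset:

```python
import uproot

# hypothetical remote file: replace the URL with a server and path you can reach
tree = uproot.open("root://some-xrootd-server.cern.ch//store/user/somebody/Zmumu.root")["events"]

# from here on, the tree behaves exactly like one opened from a local file
print(tree.array("M")[:5])
```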
You do not need C++ ROOT to run uproot.
Load a tree from a file:
```python
>>> import uproot
>>> tree = uproot.open("tests/Zmumu.root")["events"]
>>> tree
<TTree 'events' len=2304 at 0x73c8a1191450>
```
Note that this one-liner segfaults in PyROOT because of a mismatch between ROOT's memory management and Python's: the temporary TFile owns the TTree and is garbage-collected out from under it. In uproot, there's only one memory manager: Python's.
```python
>>> import ROOT
>>> tree = ROOT.TFile("tests/Zmumu.root").Get("events")
>>> tree
```
Next, get all the data as arrays (if you have enough memory):
```python
>>> for branchname, array in tree.arrays().items():
... print("{}\t{}".format(branchname, array))
...
b'Q2' [-1 1 1 ..., -1 -1 -1]
b'pz1' [-68.96496181 -48.77524654 -48.77524654 ..., -74.53243061 -74.53243061
-74.80837247]
b'Q1' [ 1 -1 -1 ..., 1 1 1]
b'py1' [ 17.4332439 -16.57036233 -16.57036233 ..., 1.19940578 1.19940578
1.2013503 ]
b'E2' [ 60.62187459 82.20186639 81.58277833 ..., 168.78012134 170.58313243
170.58313243]
b'Run' [148031 148031 148031 ..., 148029 148029 148029]
b'eta2' [-1.05139 -1.21769 -1.21769 ..., -1.4827 -1.4827 -1.4827 ]
b'Type' [ 2 71 84 ..., 2 71 71]
b'pt2' [ 38.8311 44.7322 44.7322 ..., 72.8781 72.8781 72.8781]
b'E1' [ 82.20186639 62.34492895 62.34492895 ..., 81.27013558 81.27013558
81.56621735]
b'pz2' [ -47.42698439 -68.96496181 -68.44725519 ..., -152.2350181 -153.84760383
-153.84760383]
b'pt1' [ 44.7322 38.8311 38.8311 ..., 32.3997 32.3997 32.3997]
b'M' [ 82.46269156 83.62620401 83.30846467 ..., 95.96547966 96.49594381
96.65672765]
b'phi2' [-0.440873 2.74126 2.74126 ..., -2.77524 -2.77524 -2.77524 ]
b'px1' [-41.19528764 35.11804977 35.11804977 ..., 32.37749196 32.37749196
32.48539387]
b'px2' [ 34.14443725 -41.19528764 -40.88332344 ..., -68.04191497 -68.79413604
-68.79413604]
b'Event' [10507008 10507008 10507008 ..., 99991333 99991333 99991333]
b'py2' [-16.11952457 17.4332439 17.29929704 ..., -26.10584737 -26.39840043
-26.39840043]
b'eta1' [-1.21769 -1.05139 -1.05139 ..., -1.57044 -1.57044 -1.57044]
b'phi1' [ 2.74126 -0.440873 -0.440873 ..., 0.0370275 0.0370275 0.0370275]
```
Or just get one or a few arrays:
```python
>>> tree.array("M")
array([ 82.46269156, 83.62620401, 83.30846467, ..., 95.96547966,
96.49594381, 96.65672765])
>>>
>>> tree.arrays(["px1", "py1", "pz1"])
{'px1': array([-41.19528764, 35.11804977, 35.11804977, ..., 32.37749196,
32.37749196, 32.48539387]),
'py1': array([ 17.4332439 , -16.57036233, -16.57036233, ..., 1.19940578,
1.19940578, 1.2013503 ]),
'pz1': array([-68.96496181, -48.77524654, -48.77524654, ..., -74.53243061,
-74.53243061, -74.80837247])}
```
Or iterate over arrays with a fixed number of entries each:
```python
>>> import numpy
>>> for px1, py1 in tree.iterator(2000, ["px1", "py1"], outputtype=tuple):
... print(numpy.sqrt(px1**2 + py1**2))
...
[ 44.7322 38.8311 38.8311 ..., 73.6439 73.6052 33.716 ]
[ 45.9761 45.9761 45.8338 ..., 32.3997 32.3997 32.5076 ]
...
```
(Strings and other variable-length data are handled appropriately.)
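As a hedged illustration of the string case (the branch name below is hypothetical, chosen only for the example): a string branch comes back as a Numpy object array of Python `bytes` objects, which you can decode as needed.

```python
# hypothetical: suppose the tree had a string branch called "label"
labels = tree.array("label")                      # Numpy object array of bytes objects
decoded = [x.decode("ascii") for x in labels[:5]]
```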
>>> Find more examples on GitHub Gists! <<<
Normally, you'd think that a library written in Python would be slow. If you're loading gigabytes from a ROOT file, however, most of the time is spent in operating system calls, decompression routines, or Numpy. These external calls don't suffer from the dynamic checks performed between each Python bytecode instruction, so they run as fast as compiled code.
So uproot doesn't pay a performance penalty for being written in Python. As it turns out, it's also faster than C++ ROOT because it does much less work.
Jakob Blomer's ACAT 2017 talk evaluates ROOT performance for analysis (and other formats); I repeated his procedure with the same data on an otherwise idle physical machine (techlab-gpu-nvidiak20-04.cern.ch). This is the uncompressed file (B2HHH.root), read with a warmed virtual memory cache to emphasize time spent by the library rather than disk-reading or decompression.
The first comparison is time spent opening the file and loading the TTree. This is relevant if you are executing a procedure on a large set of files (TChain). uproot is about 16 times faster.
|  | Time to open file |
|---|---|
| C++ ROOT | 0.50 sec |
| uproot | 0.03 sec |
Next is the time to read nearly all branches, exactly the same as the test on page 11 of Jakob's talk. uproot is about 5 times faster.
|  | Time to read file | Event rate | Data rate |
|---|---|---|---|
| C++ ROOT | 4.62 sec | 1.9 MHz | 230 MB/sec |
| uproot | 0.93 sec | 9.2 MHz | 1160 MB/sec |
Finally, we load only one branch (replacing `TTree::GetEntry` with `TBranch::GetEntry` in C++ ROOT to avoid a performance penalty). uproot is about 4 times faster.
|  | Time to read 1 branch | Event rate | Data rate |
|---|---|---|---|
| C++ ROOT | 0.256 sec | 33 MHz | 260 MB/sec |
| uproot | 0.064 sec | 133 MHz | 1020 MB/sec |
All of the above tests are single-threaded, and you'd be right to wonder if uproot scales poorly because of Python's Global Interpreter Lock (GIL). However, most external calls release the GIL, allowing parallel threads to be effective.
The tests below were performed on a Knights Landing processor, which has dozens of physical cores (and hundreds of hardware threads). The left plot is the whole-file test and the right plot is the single-branch test, which shows some interesting structure that gets smoothed over when all branches share the same thread pool. Threads were pinned to cores, and this file is compressed with lzma, which requires the most CPU time to decompress (to emphasize parallel processing).
C++ ROOT is not on this plot because decompressing baskets in parallel is an upcoming feature.
uproot scales to about 30 threads, which is more cores than many machines have. Beyond that, split your job into multiple processes (with Python's `multiprocessing` module), as sketched below.
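Here is a minimal sketch of that approach, using only calls shown earlier (`uproot.open`, `tree.array`) plus the standard library; the choice of branches and the one-branch-per-worker division of labor are just an illustration:

```python
import multiprocessing
import uproot

def branch_sum(branchname):
    # each worker process opens the file independently and reads one branch
    tree = uproot.open("tests/Zmumu.root")["events"]
    return branchname, tree.array(branchname).sum()

if __name__ == "__main__":
    pool = multiprocessing.Pool(4)
    for name, total in pool.map(branch_sum, ["E1", "E2", "pt1", "pt2"]):
        print(name, total)
```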
| ROOT type | Numpy type | Status |
|---|---|---|
| single-leaf numeric branches (boolean, signed/unsigned integers, floating point) | corresponding Numpy arrays with big-endian `dtype`; if native-endian arrays are important to you (e.g. for Numba), cast on the fly by passing an explicit `dtype` | done |
| single-leaf string branches like `TLeafC` or `TString` | Numpy object array of Python strings (`bytes` objects); if performance is important and you can handle a `uint8` array of raw bytes, pass `uint8` as the `dtype` | done |
| multi-leaf branches (a.k.a. "leaflist") | Numpy record arrays | TODO! |
| fixed-length array of items per entry, declared as `myleaf[10]` | Numpy array with a multidimensional shape | done |
| variable-length array of items per entry, declared as `myleaf[N]` | Numpy array with no boundaries between entries; use the corresponding counter (`N` in this example) to distinguish events | done |
| fully split C++ class objects, stored in the TTree as one branch per class member | one array per class member, following the rules above | done |
| `TClonesArray` of several fully split C++ class objects per entry | same as above; use counters (e.g. a leaf named `nMuons`) to determine how many objects per event | done |
| unsplit class objects | much too complicated and not useful to view as Numpy; always save your ROOT files with maximum split level | will not do |
| `std::vector` and `std::string` | complicated encoding, would be slow to navigate in Python, yet frequently used in analysis | worthwhile? |
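Two rows in this table lend themselves to a short, hedged illustration: getting a native-endian copy of a numeric branch and splitting a flattened variable-length array by its counter. The conversion below is done in Numpy after the read (rather than with uproot's on-the-fly `dtype` cast), and the variable-length part uses synthetic stand-in arrays rather than branches from the example file:

```python
import numpy
import uproot

tree = uproot.open("tests/Zmumu.root")["events"]

# native-endian copy of a big-endian branch (useful for Numba and friends)
m_native = tree.array("M").astype(numpy.float64)

# rebuilding per-event boundaries from a counter branch, with stand-in data:
flat = numpy.array([1.1, 2.2, 3.3, 4.4, 5.5, 6.6])   # stand-in for a flattened myleaf[N]
counts = numpy.array([2, 1, 3])                       # stand-in for the counter N
per_event = numpy.split(flat, numpy.cumsum(counts)[:-1])
# per_event -> [array([1.1, 2.2]), array([3.3]), array([4.4, 5.5, 6.6])]
```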
If you need to work with split objects or variable-length arrays and don't want to deal with the flattened array format and counters, consider using PLUR, which allows you to write code as though they were Python objects.
Planned features:
- the TODO items in the above table;
- reading a few basic types of non-TTree objects relevant for analysis, such as histograms and graphs;
- writing TTrees: flat, no structure, and no reading and writing to the same file;
- import-on-demand connections to Pandas, Keras, TensorFlow, PySpark, etc.
Conversations with Philippe Canal were essential for understanding ROOT I/O. Working with Brian Bockelman on his BulkIO additions to C++ ROOT helped to clarify the distinction between what is I/O and what is user interface. Finally, Sebastien Binet's go-hep provided a clean implementation to ~~pillage~~ imitate, resolving many questions about byte-by-byte interpretations and serialized class versions.