Cache the creation of RecordDescriptors in `extend_record()`
yunzheng opened this issue · 2 comments
yunzheng commented
The method `extend_record()` generates a new RecordDescriptor every time it is called, which is an expensive task. The creation of the RecordDescriptors should be cached.
This will directly improve the speed wherever `extend_record()` is used, such as the `--multi-timestamp` option in `rdump`.
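A rough sketch of the idea, not the actual flow.record code: `Descriptor`, `cached_descriptor()` and `extend_descriptor()` are hypothetical stand-ins, and the use of `functools.lru_cache` with a size of 4096 is an assumption made for illustration. The point is that the descriptor is keyed on its name and field tuple, so extending many records with the same extra fields builds the descriptor only once.

```python
from functools import lru_cache


class Descriptor:
    """Hypothetical stand-in for a record descriptor (not the real class)."""

    created = 0  # counts how many descriptors were actually constructed

    def __init__(self, name, fields):
        Descriptor.created += 1
        self.name = name
        self.fields = fields


@lru_cache(maxsize=4096)  # cache size is an arbitrary choice for this sketch
def cached_descriptor(name, fields):
    # `fields` has to be hashable, so callers pass a tuple of (type, name) pairs.
    return Descriptor(name, fields)


def extend_descriptor(base, extra_fields):
    """Return a descriptor for `base` plus `extra_fields`, reusing cached ones."""
    return cached_descriptor(base.name, base.fields + tuple(extra_fields))


base = cached_descriptor("example/record", (("string", "path"),))

# Extending 1000 times with the same extra field hits the cache after the first call.
extended = [extend_descriptor(base, [("datetime", "ts")]) for _ in range(1000)]
```

With this shape, the expensive construction runs once per distinct combination of name and fields, and every later call is a dictionary lookup.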
Zawadidone commented
@yunzheng wow, this makes `rdump --multi-timestamp` very fast!
Before this issue was fixed it took roughly 17 minutes (see #46 (comment)); now it takes about 3 minutes.
time find export/plugins -type f -print0 | xargs -r0I {} -P 14 sh -c 'rdump {} --multi-timestamp -w jsonfile://export/$(basename {} .jsonl).jsonl?descriptors=True'
real 2m55.554s
[...]
yunzheng commented
@Zawadidone wow, nice gainz! :)
Thanks for benchmarking!