Record Caliper configuration in output

Question

DavidPoliakoff opened this issue 7 years ago · 4 comments

I'm writing processing pipelines for Caliper profiles. One of the tricky things is that a lot of my current processing assumes the profiles were collected with a serial-trace configuration, which is quite possibly wrong. What I can't currently determine is which configuration produced the Caliper profiles I'm looking at. It would be nice if there were a way to see this in the profiles themselves. Specifically, pipelines need to know whether the output they're examining is a trace or a profile, as this determines whether we should do aggregation in cases like this

Where we have trace entries which are essentially "intermediate" ones, created as metric Annotations are started, stopped, and set.

Answer 1 · 2017-11-13T19:02:51.000Z

Alternatively, a system that looks at the distributions of entries and NaNs across different metric combinations, and learns the likely configuration from characteristics of the produced dataframes, would also be really cool, but is probably a bad idea.
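To make the idea concrete, here is a rough sketch of the simplest possible version of such a heuristic. It assumes (and this is only an assumption) that an aggregated profile has at most one record per key combination, while a trace repeats keys. The `guess_mode` helper, the `function` attribute, and the sample rows are all hypothetical, with plain Python dicts standing in for dataframe rows.

```python
from collections import Counter

def guess_mode(rows, key_attributes):
    """Guess 'trace' if any key combination repeats, else 'profile'.

    This is a heuristic only: a short trace with no repeated keys
    would be misclassified as a profile.
    """
    counts = Counter(tuple(r.get(a) for a in key_attributes) for r in rows)
    return "trace" if any(n > 1 for n in counts.values()) else "profile"

# Hypothetical records: the trace-like data repeats the key "foo".
trace_like = [{"function": "main"}, {"function": "foo"}, {"function": "foo"}]
profile_like = [{"function": "main"}, {"function": "foo"}]

trace_guess = guess_mode(trace_like, ["function"])
profile_guess = guess_mode(profile_like, ["function"])
print(trace_guess, profile_guess)
```

The misclassification case noted in the docstring is a good part of why recording the configuration explicitly seems more robust than guessing it.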

Answer 2 · 2017-11-13T22:50:37.000Z

Hm, would additional attribute metadata help, e.g. metadata that tells whether an attribute is an aggregate one? Overall, I'd like to see some more specific use cases here to learn which information would be useful. In the example above, it seems that a WHERE run_size clause would help to filter out the unneeded records.

Answer 3 · 2017-11-14T01:20:25.000Z

On specifics: that wouldn't help in every instance. If I have two different metrics being collected, you'd have one record with a NaN in time.inclusive.duration and one with a NaN in the other metric, each with run_size defined. The problem becomes steadily less tractable as more metrics and Annotations get added.
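A small sketch of that pattern, using made-up records rather than real Caliper output: plain Python dicts stand in for dataframe rows, `None` stands in for NaN, and `other.metric` is a hypothetical name for the second metric.

```python
# Hypothetical trace records: each one carries only the metric that was
# being updated when the record was written; the other metric is missing.
records = [
    {"run_size": 100, "time.inclusive.duration": 12.0, "other.metric": None},
    {"run_size": 100, "time.inclusive.duration": None, "other.metric": 3.0},
    {"run_size": 100, "time.inclusive.duration": 8.0, "other.metric": None},
]

def where_defined(rows, attribute):
    """Keep rows in which `attribute` is defined, like a WHERE <attribute> filter."""
    return [r for r in rows if r.get(attribute) is not None]

def rows_with_gaps(rows, metrics):
    """Return rows that are missing at least one of the given metrics."""
    return [r for r in rows if any(r.get(m) is None for m in metrics)]

metrics = ["time.inclusive.duration", "other.metric"]
filtered = where_defined(records, "run_size")   # filter on run_size
gappy = rows_with_gaps(filtered, metrics)       # rows still missing a metric
print(len(filtered), len(gappy))
```

Every record passes the run_size filter, and every record that passes is still missing one of the two metrics, so the filter alone doesn't say which rows need aggregating.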

"Is an aggregated attribute" metadata would certainly help, but it seems to me that it would require some fairly active checking (though I think it's an excellent idea regardless of the other path). I also like the idea of being able to "select config_details" *.cali (conceptually) and see that all were run with the same configuration.
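Conceptually, the check I have in mind looks something like the sketch below. It assumes each profile's configuration has already been read into a flat dict of strings; how (or whether) that information would be stored in a .cali file is exactly the open question, so the `profiles` mapping, the `services` key, and the file names are all hypothetical.

```python
def group_by_config(profiles):
    """Group profile names by configuration.

    `profiles` maps a profile name to a flat dict of configuration
    settings. Profiles with identical settings end up in the same group.
    """
    groups = {}
    for name, config in profiles.items():
        key = tuple(sorted(config.items()))  # hashable, order-independent
        groups.setdefault(key, []).append(name)
    return groups

# Hypothetical configuration details for three profiles.
profiles = {
    "a.cali": {"services": "event,trace,recorder"},
    "b.cali": {"services": "event,trace,recorder"},
    "c.cali": {"services": "aggregate,event,recorder"},
}

groups = group_by_config(profiles)
same_config = len(groups) == 1
print(same_config, sorted(groups.values()))
```

With the sample data the profiles fall into two groups, so a pipeline could refuse to combine them or process each group separately.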

I don't know. I'd be okay with aggregatable attributes (heck, I could probably make reasonable inferences from matrix structure; the more I think about that joke, the better the idea seems), but I wanted you thinking about this option. I'll think on it as well.

Answer 4 · 2019-08-12T14:55:35.000Z

I don't think we're/you're continuing the Caliper/Jupyter pipelines in quite the way I was doing them then, so closing.