eclipse-tracecompass/org.eclipse.tracecompass

LTTng trace types, what can we support out of the box?

MatthewKhouzam opened this issue · 8 comments

LTTng kernel traces are great! We get a lot of information from them and can build many views.

LTTng UST has a few wrappers that help: we can get memory usage and call stacks.

Are there any default fields in the Java agent, java.util.logging, log4j, Python, or LTTng logger that can be visualized out of the box?
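For context, here is a minimal sketch of how such events get produced, assuming the lttng-ust Java agent for log4j 1.x and its LttngLogAppender class (the Hello class is made up, and the session/enable-event setup happens outside the program; the vpid/procname/ip/vtid fields presumably come from LTTng contexts rather than from log4j itself):

    import java.io.IOException;

    import org.apache.log4j.Logger;
    import org.lttng.ust.agent.log4j.LttngLogAppender;

    public class Hello {
        public static void main(String[] args) throws IOException {
            // Attach the LTTng appender: from then on, events logged through
            // log4j are forwarded into the LTTng session, carrying the logger
            // name, location info (class/method/file/line), level and message.
            Logger logger = Logger.getLogger(Hello.class);
            LttngLogAppender lttngAppender = new LttngLogAppender();
            logger.addAppender(lttngAppender);

            logger.info("Hello from log4j");

            lttngAppender.close();
        }
    }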

Here is an example of what we can get with log4j:

  • vpid
  • procname
  • ip
  • vtid
  • msg (which can itself embed another structured payload)
  • logger_name
  • class_name
  • method_name
  • filename
  • line_number
  • timestamp (milliseconds since the Unix epoch)
  • thread_name
  • int_loglevel

{ vpid = 328, procname = "spark-listener-", ip = 0x7B48B92AE8D2, vtid = 466 },
{ msg = "eventType=JobEnd; jobId=0; jobResult=JobSucceeded; endTime=1714058265331",
  logger_name = "org.apache.spark.examples.MyCustomSparkListener",
  class_name = "org.apache.spark.examples.MyCustomSparkListener",
  method_name = "logData",
  filename = "MyCustomSparkListener.java",
  line_number = 280,
  timestamp = 1714058265331,
  int_loglevel = 20000,
  thread_name = "spark-listener-group-shared" }
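Note that the msg field itself embeds a semi-structured payload (eventType=JobEnd; jobId=0; ...), so any view over it would likely have to parse that too. A hypothetical sketch, assuming the key=value pairs separated by semicolons seen above (MsgFieldParser is made up, not an existing class):

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class MsgFieldParser {

        /** Split a "key=value; key=value; ..." payload into an ordered map. */
        static Map<String, String> parse(String msg) {
            Map<String, String> fields = new LinkedHashMap<>();
            for (String part : msg.split(";")) {
                int eq = part.indexOf('=');
                if (eq > 0) {
                    fields.put(part.substring(0, eq).trim(), part.substring(eq + 1).trim());
                }
            }
            return fields;
        }

        public static void main(String[] args) {
            String msg = "eventType=JobEnd; jobId=0; jobResult=JobSucceeded; endTime=1714058265331";
            // Prints: {eventType=JobEnd, jobId=0, jobResult=JobSucceeded, endTime=1714058265331}
            System.out.println(parse(msg));
        }
    }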

What views can be made with this partial info?

Are there any default fields in the Java agent, java.util.logging, log4j, Python, or LTTng logger that can be visualized out of the box?

What views can be made with this partial info?

I think it would be better to ask "what is useful?" rather than asking "what is possible?".

More specifically, to identify which views would be helpful, my reflex would be to start by checking whether real Trace Compass users troubleshoot or analyze problems that would be made easier by this additional data. If needed, I could help reach out to folks if you have contacts.

It seems risky to start from the data and jump straight to ways of visualizing it without grounding them in real use cases. The main risk is that the views go unused in practice and the development work has no impact.

OK, I agree with the sentiment. Now I would ask: what was the rationale for adding these fields to the tracepoints? The tracepoints are there, and I assume there was a reason for putting them there.

what was the rationale for adding these fields to the tracepoints? The tracepoints are there, and I assume there was a reason for putting them there.

That's a good point! I don't know who did the instrumentation off the top of my head, but I'll briefly ask around to see if there was a real user-based rationale.

We're working on making those explicit user connections on the instrumentation side as well.

Perfect. My reasoning was: the tracepoints/fields are there for a reason, so why not finish the loop?

why not finish the loop

I would argue that without additional user context, any views we add risk not actually closing the loop, i.e. not being helpful or used.

I did confirm that the instrumentation was created in response to a user request (a while ago). That said, we don't know which fields are used in analyses, what information users extract from them, or how. There are many possibilities for what to visualize and how, and I feel the risk of choosing wrong is high.

I think it's worth connecting with users to answer questions like the following:

  • How do you use the available log4j information to solve problems? Is it correlated with other data sources? What symptoms do you look for?
  • Would having this information available in a graphical tool actually facilitate your work? For example, if you already have a customized automated analysis, a graphical representation may not integrate into your existing workflow.

Right now, we are looking into bringing in a Spark trace type based on @Rezix93's great work (a rough sketch of what this could look like follows below). It would extend the UST trace type and be the first FOSS example of a log4j view. We wondered whether there are any other needs at the moment. Reza's work is aimed at general-purpose cloud utilization problems, not just Apache Spark issues.
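For reference, a rough sketch of what such a trace type could look like, assuming it subclasses Trace Compass's LttngUstTrace and narrows its validate() heuristic (the SparkTrace name and the heuristic are hypothetical, not the actual design):

    import org.eclipse.core.resources.IProject;
    import org.eclipse.core.runtime.IStatus;
    import org.eclipse.tracecompass.lttng2.ust.core.trace.LttngUstTrace;

    /** Hypothetical trace type: an LTTng UST trace carrying Spark log4j events. */
    public class SparkTrace extends LttngUstTrace {

        @Override
        public IStatus validate(IProject project, String path) {
            // Reuse the stock UST validation first.
            IStatus status = super.validate(project, path);
            if (!status.isOK()) {
                return status;
            }
            // A Spark-specific heuristic would go here, e.g. checking that the
            // trace contains log4j events from org.apache.spark loggers, and
            // returning a confidence-based status accordingly.
            return status;
        }
    }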

we are looking into bringing in a Spark trace type

Ok.

We wondered whether there are any other needs at the moment.

We're not aware of any particular needs/gaps. (I'm sure they exist, but we're not in close communication with folks who use these data sources at the moment.)

I am leaving this issue open in case any needs arise. This was the first time we opened a log4j UST trace, and the extra fields were a surprise.