facebookarchive/profilo

Can i ask for the meaning of THREAD_CPU_TIME in Syscounters?

Quang0309 opened this issue ยท 11 comments

The thread cpu time is consumed in ms:

void threadCountersCallback(
  uint32_t tid,
  util::ThreadStatInfo& prevInfo,
  util::ThreadStatInfo& currInfo) {

  if (prevInfo.highPrecisionCpuTimeMs != 0 &&
      (currInfo.availableStatsMask & StatType::HIGH_PRECISION_CPU_TIME)) {
    // Don't log the initial value
    logCPUTimeCounter(
        prevInfo.highPrecisionCpuTimeMs, currInfo.highPrecisionCpuTimeMs, tid);
  } else if (
      prevInfo.cpuTimeMs != 0 &&
      (currInfo.availableStatsMask & StatType::CPU_TIME)) {
    // Don't log the initial value
    logCPUTimeCounter(prevInfo.cpuTimeMs, currInfo.cpuTimeMs, tid);
  }
...

inline void logCPUTimeCounter(long prevCpuTime, long currCpuTime, int tid) {
  const auto kThresholdMs = 1;
  logMonotonicCounter(
    prevCpuTime,
    currCpuTime,
    tid,
    QuickLogConstants::THREAD_CPU_TIME,
    kThresholdMs);
}

When i used it, i see that it does not only provide the cpu time used in the thread that i call method Start() only but it provides the cpu time used in the whole application, am i right?

I guess it depends on what you mean by "when I used it".

THREAD_CPU_TIME logs the time spent along with the thread id (tid), and it is the per-thread time. PROC_CPU_TIME is the time spent by the whole process.

If by "using it" you mean something like modifying test_counters from the workflow_demo.py script to include THREAD_CPU_TIME, then you're aggregating the time from all the threads, which would be the time spent by the whole application. This is because the current version of workflow_demo.py just aggregates on counterType, which basically ignores the fact that some counters are per-thread.

You could somewhat easily modify the script to make it work for you, though. The tid lives inside the unit object, in unit.properties.customProps['tid']. You could modify the for unit in trace.executionUnits.values(): loop inside the syscounters() function to aggregate not only on counterType, but on (counterType, tid) tuples, and that would give you the per-thread granularity that you're looking for.

Yes i modified the test_counters to include THREAD_CPU_TIME, but how can i know the thread id of the thread i am tracing?

I'm not sure I fully understand the question, but here goes.

You could do something like this:

    for unit in trace.executionUnits.values():
        curr_tid = unit.properties.customProps['tid']
        for block_id in unit.blocks:
            block = trace.blocks[block_id]
            for point in block.points:
                counter = point.properties.counterProps.get(tt.CounterUnit.ITEMS, None)
                if not counter:
                    continue
                for counterType, count in counter.items():
                    counters[(curr_tid, counterType)].append((point.timestamp, count))

Your counters map will now be indexed by (tid, counterType) tuples, essentially giving you the per-thread granularity.

Is this what you were asking?

My question is that how can i know which threads in the application correspond to which tid in the trace.

Ah, I think I see what you mean.

The threadmetadata provider, which is enabled by default on the sample app, logs TRACE_THREAD_NAME entries, which should give us the information that we need. However, the current version of the parser does not handle these entries, so all that we have is the tid. I'll start working on a fix for this and let you know when it's ready =)

Commit d8e21bb addresses this problem.

Now the thread's name is available under unit.properties.coreProps['name']. We also added the thread's priority under unit.properties.coreProps['priority'].

Please close this issue if you think it addresses your original concern.

thank you for your quick response!!

Just test it and now i can get the thread's name instead of thread id, but i notice that profilo can only give me the name i set in the thread's constructor if the thread is still running when stopTrace() is called, if the thread dies before calling the stopTrace(), profilo give me something like Thread_3443.

The mechanism we use to find a thread's name consists on reading /proc/self/task/<tid>/comm. The /proc/self/task directory only contains entries for threads that are currently alive, so as soon as a thread dies there's no way for us to get its name.

Now, this information is obtained by Profilo by means of the threadmetadata provider. This provider is called only when a trace stops, and the reason for this is that the process mentioned above is expensive (and there might be a pretty large number of threads running), so we don't want to do it too often. So, the "tradeoff" here is not hitting performance - by calling this provider only at the end of a trace - at the cost of not having the name of every single thread that was spawned.

However, you can change this behavior if you decide that you're OK with the performance hit. For instance, you can spawn a thread that calls this mechanism in some sort of loop (see https://github.com/facebookincubator/profilo/blob/master/java/main/com/facebook/profilo/provider/stacktrace/StackFrameThread.java#L143-L174 and https://github.com/facebookincubator/profilo/blob/master/cpp/profiler/SamplingProfiler.cpp#L404-L438 for a way to do this) so that you get the names of all threads.