EmbarkStudios/puffin

Reduce bandwidth with `scope_id`s


Problem

Currently each profile scope sends an id and a location.

The id is either the function name (for profile_function!()) or a user-specified static string (profile_scope!("calc_normals")).
The location is the file name (or the file path and line number, once #165 is merged).

There is also the data field, which is e.g. the mesh name in fn load_mesh(mesh_name: &str) { profile_function!(mesh_name); … }. This can change on each invocation, while the id and location do not.

All of these fields are sent on each invocation of the profiling macro. This can add up to quite a lot of bytes if the id and/or location is long (and they get longer with #165).
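For a rough sense of the cost, each invocation currently has to carry something like the following (the struct and field names here are only illustrative, not puffin's actual wire format):

struct PerInvocationRecord {
    id: &'static str,       // function name or user-given label, repeated on every call
    location: &'static str, // file name (file path + line number once #165 lands), repeated on every call
    data: String,           // dynamic per-call data, e.g. the mesh name
    start_ns: i64,          // start/stop timestamps
    stop_ns: i64,
}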

The compressor mitigates this problem, but at the cost of CPU time.

Solution

Let's introduce:

struct ScopeInfo {
    /// Scope name or function name (previously called "id").
    name: String,

    /// Path to the file containing the profiling macro.
    file: String,

    /// The line number of the profiling macro.
    line_nr: u32,
}

/// A unique id for each scope and `ScopeInfo`.
pub struct ScopeId(u32);

The first time a profiling scope is executed, it is assigned a unique ScopeId and sends its ScopeInfo as a special message on the data stream.
On each subsequent call, it sends only its ScopeId, a timestamp, and any additional data (e.g. the mesh name, which can change on each invocation).
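A minimal sketch of what the sender side could look like, reusing the ScopeId/ScopeInfo types above. The names register_scope, send_scope_info, send_sample and now_ns, and the per-callsite static, are assumptions about what the profiling macro could expand to, not puffin's actual API:

use std::sync::OnceLock;
use std::sync::atomic::{AtomicU32, Ordering};

static NEXT_SCOPE_ID: AtomicU32 = AtomicU32::new(0);

/// One-time registration: grab the next ScopeId and emit the ScopeInfo
/// as a special "new scope" message on the data stream.
fn register_scope(info: ScopeInfo) -> ScopeId {
    let id = ScopeId(NEXT_SCOPE_ID.fetch_add(1, Ordering::Relaxed));
    send_scope_info(&id, &info);
    id
}

/// Roughly what the profiling macro could expand to at a single call site.
fn load_mesh(mesh_name: &str) {
    static SCOPE: OnceLock<ScopeId> = OnceLock::new();
    let scope_id = SCOPE.get_or_init(|| register_scope(ScopeInfo {
        name: "load_mesh".to_owned(),
        file: "src/mesh.rs".to_owned(),
        line_nr: 42,
    }));

    // Every invocation after the first sends only the compact id,
    // a timestamp, and the per-call data:
    send_sample(scope_id, now_ns(), mesh_name);
}

// Stubs standing in for the actual stream writers:
fn send_scope_info(_id: &ScopeId, _info: &ScopeInfo) {}
fn send_sample(_id: &ScopeId, _start_ns: i64, _data: &str) {}
fn now_ns() -> i64 { 0 }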

This means the scope info is only sent once, saving bandwidth on each repeat invocation.
This would also allow us to send more info for each scope, e.g. the full file path instead of just a short version of it.

This requires a stateful receiver that keeps a lookup table of the scopes. If the ScopeId is just an incrementing counter, that lookup can be as simple as a Vec<ScopeInfo> (which works as long as we are only looking at profiling scopes from one process at a time).
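On the receiver side, a sketch of such a lookup table could look like this, again reusing the types above; the registry and its method names are assumptions, not an existing API:

#[derive(Default)]
struct ScopeRegistry {
    /// The index into the Vec is the ScopeId, so lookup is a plain array index.
    scopes: Vec<ScopeInfo>,
}

impl ScopeRegistry {
    /// Handle the one-time "new scope" message.
    fn register(&mut self, id: ScopeId, info: ScopeInfo) {
        debug_assert_eq!(id.0 as usize, self.scopes.len(), "ScopeIds are expected to arrive in order");
        self.scopes.push(info);
    }

    /// Resolve a compact id back to the full scope info for display.
    fn lookup(&self, id: &ScopeId) -> Option<&ScopeInfo> {
        self.scopes.get(id.0 as usize)
    }
}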

@NiklasNummelin and I have talked a bunch about doing exactly this. I think it is the right approach and highly important, since the overhead of writing redundant strings for every invocation of a profiler timer really does add up.

Closed by #169