O[pinionate]d Observability Extensions for .NET.
To use the Observability libraries you need to initialize the Observability Host. Currently, only ASP.NET Core hosts are supported.
Add the O9d.Observability.Hosting.AspNet package from NuGet:
dotnet add package O9d.Observability.Hosting.AspNet
You can then update your Startup.cs file to initialize the host:
services.AddObservability(builder =>
{
});
Internally this will initialize an ASP.NET Core Hosted Service that keeps track of all registered instrumentation components.
To start instrumenting your application, add one of the relevant instrumentation packages (discussed in more detail below). For example, to add ASP.NET Core metrics (using Prometheus), add the O9d.Metrics.AspNet package:
dotnet add package O9d.Metrics.AspNet
Then update your Observability startup code:
services.AddObservability(builder =>
{
    builder.AddAspNetMetrics(options => {});
});
One of the design goals of this library is that it should be as unobtrusive as possible, leveraging the built-in diagnostic and activity components of the Core CLR so that adding instrumentation doesn't interfere with other application code or middleware.
The O9d.Metrics.AspNet package adds specific Prometheus metrics that we have found to be the most useful when operationalising HTTP services in production.
After installing the Observability Hosting and ASP.NET Metrics packages in your application, update your Startup.cs as follows:
services.AddObservability(builder =>
{
    builder.AddAspNetMetrics(options => {});
});
By default the library adds the following Prometheus metrics:
A histogram (default) or summary that tracks the duration in seconds that HTTP requests take to process.
Labels:
Name | Description | Example |
---|---|---|
operation | A descriptor for the operation and endpoint that was requested | get_customers |
status_code | The status code returned by your service | 200 |
A gauge that tracks the number of requests in progress.
Labels:
Name | Description | Example |
---|---|---|
operation | A descriptor for the operation and endpoint that was requested | get_customers |
A counter that tracks the number of HTTP requests resulting in an error.
Labels:
Name | Description | Example |
---|---|---|
operation | A descriptor for the operation and endpoint that was requested | get_customers |
sli_error_type | The service level indicator error type | external_dependency |
sli_dependency | For dependency error types, the name of the causing dependency | skynet |
With these metrics we can easily calculate both internal and external service availability. To calculate our client facing availability:
Availability = successful_requests / (total_requests - client_failures)
For example:
Given 100 requests
of which
70 returned HTTP 200
10 returned HTTP 500 (Server Error)
20 returned HTTP 422 (Invalid Client Request)
Availability = (100 - 30) / (100 - 20)
= 87.5%
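The arithmetic above can be sketched in C# as follows. This is a standalone illustration and not part of the O9d libraries: the class and method names (AvailabilityCalculator, Calculate) are hypothetical, and treating a window with no eligible requests as fully available is an assumption.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the O9d libraries.
public static class AvailabilityCalculator
{
    // Availability = successful_requests / (total_requests - client_failures)
    public static double Calculate(long totalRequests, long serverErrors, long clientErrors)
    {
        if (totalRequests < 0 || serverErrors < 0 || clientErrors < 0)
            throw new ArgumentOutOfRangeException(nameof(totalRequests), "Counts must be non-negative.");
        if (serverErrors + clientErrors > totalRequests)
            throw new ArgumentException("Errors cannot exceed the total number of requests.");

        long successful = totalRequests - serverErrors - clientErrors;

        // Client failures are excluded from the denominator because they are
        // not the service's fault.
        long eligible = totalRequests - clientErrors;

        // Assumption: with no eligible requests there was nothing to fail, so report 100%.
        return eligible == 0 ? 1.0 : (double)successful / eligible;
    }
}
```

For the example above, AvailabilityCalculator.Calculate(100, 10, 20) divides 70 successful requests by 80 eligible requests and returns 0.875.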
To calculate this in Prometheus/Grafana:
(sum(rate(http_server_request_duration_seconds_count[10m])) - sum(rate(http_server_errors_total[10m]) OR on() vector(0))) /
(
sum(rate(http_server_request_duration_seconds_count[10m])) -
sum(rate(http_server_errors_total{sli_error_type="invalid_request"}[10m]) OR on() vector(0))
)
The default Prometheus libraries for ASP.NET are quite verbose and can result in a large number of series or high-cardinality labels.
By design this library only tracks genuine endpoints of your application, since metrics about non-existent endpoints generally offer little value (e.g. bots trying to hit /phpmyadmin). Note that a metric for unmatched paths is something we're thinking about.
By default the library uses the following approach to resolve the operation name:
- The name of the route if set on your controller action, for example:
  [HttpGet("status/{code:int}", Name = "get_status")]
- Otherwise, a combination of the HTTP verb and route template, e.g. PUT /customers/{id}
In general we recommend explicitly naming your route to avoid your metrics changing if your URI structure is updated.
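The fallback order described above can be sketched as a small function. This is only an illustration of the rule, not the library's actual implementation; OperationNameResolver, Resolve, and its parameters are hypothetical names.

```csharp
#nullable enable
using System;

// Hypothetical sketch of the resolution order; not the library's actual code.
public static class OperationNameResolver
{
    public static string Resolve(string? routeName, string httpMethod, string routeTemplate)
    {
        if (httpMethod is null) throw new ArgumentNullException(nameof(httpMethod));
        if (routeTemplate is null) throw new ArgumentNullException(nameof(routeTemplate));

        // 1. Prefer the explicit route name, which stays stable if the URI changes.
        if (!string.IsNullOrWhiteSpace(routeName))
            return routeName;

        // 2. Otherwise fall back to "VERB /route/template".
        return $"{httpMethod.ToUpperInvariant()} /{routeTemplate.TrimStart('/')}";
    }
}
```

Because the fallback embeds the route template, renaming a path segment changes the operation label, which is why an explicit route name is the safer choice.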
By default the following status codes are determined to be an error:
- 400 - 499 - Error Type: Invalid Request
- >= 500 - Error Type: Internal
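As a rough sketch, the default mapping from status code to error type could be expressed like this. The enum and class below are hypothetical stand-ins rather than the library's own types, and the sketch assumes that 500 itself counts as an internal error, in line with the availability example that treats HTTP 500 as a server error.

```csharp
// Hypothetical stand-in for the library's error type; names are assumptions.
public enum SketchErrorType
{
    None,
    InvalidRequest,
    Internal
}

// Hypothetical sketch of the default status code classification.
public static class StatusCodeClassifier
{
    public static SketchErrorType Classify(int statusCode)
    {
        // 4xx: the client sent something the service could not accept.
        if (statusCode >= 400 && statusCode <= 499)
            return SketchErrorType.InvalidRequest;

        // Assumption: 500 and above are all treated as internal errors.
        if (statusCode >= 500)
            return SketchErrorType.Internal;

        // Everything else is not counted as an error.
        return SketchErrorType.None;
    }
}
```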
What we can't track automatically are errors that are the result of internal or external dependencies. For these you have two options:
- Set the SLI error using HttpContext.SetSliError(), for example:
  HttpContext.SetSliError(ErrorType.ExternalDependency, "skynet");
- Throw an SliException (or any derived type), for example:
  throw new SliException(ErrorType.ExternalDependency, "skynet");
The AspNetMetricsOptions class includes a number of options to customize the metrics created by the library. Each metric listed above has an associated ConfigureX property that can be used to customize the underlying Prometheus metric configuration. For example, to set the buckets used by the Request Duration Histogram metric:
services.AddObservability(builder =>
    builder.AddAspNetMetrics(options =>
        options.ConfigureRequestDurationHistogram = histogram =>
        {
            histogram.Buckets = new[] { 0.1, 0.2, 0.5, 0.75, 1, 2 };
        }
    )
);
Using a summary instead of a histogram to track request duration
We recommend using histograms (the default) if you are running multiple instances of your application since they can be aggregated. If you are happy with the trade-offs of using Summary metrics, you can switch the request duration metric type like so:
services.AddObservability(builder =>
    builder.AddAspNetMetrics(options =>
        options.RequestDurationMetricType = ObserverMetricType.Summary
    )
);
We've created a Grafana Dashboard that leverages the metrics generated by O9d.Metrics.AspNet. You can see it in action by running the examples, and you can install it from Grafana Labs.
For the dashboard to work you should add an app label with the name of your application. This can be done by your agent or directly within your application using static labels:
Prometheus.Metrics.DefaultRegistry.SetStaticLabels(new Dictionary<string, string>
{
    { "app", "aspnet-example" },
    { "env", "prod" }
});
This project was heavily inspired by the OpenTelemetry libraries for .NET.
We wanted to make it easy to plug in additional instrumentation without a lot of ceremony. Suppose you want to instrument operations in the DazzleDB .NET client. Fortunately the client already emits events to a Diagnostic Source and the Observability library makes it easy to tap into them.
Create a class that implements IObserver<KeyValuePair<string, object?>> to receive Diagnostic Listener events:
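internal class DazzleDbMetricsObserver : IObserver<KeyValuePair<string, object?>>
{
    public void OnCompleted() { }
    public void OnError(Exception error) { }
    public void OnNext(KeyValuePair<string, object?> value) { /* record metrics for the event here */ }
}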
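Next, add the O9d.Observability package:
dotnet add package O9d.Observability
Then create an extension method on IObservabilityBuilder that registers your observer: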
public static class DazzleDbObservabilityBuilderExtensions
{
    public static IObservabilityBuilder AddDazzleDbMetrics(this IObservabilityBuilder builder)
    {
        if (builder is null) throw new ArgumentNullException(nameof(builder));
        return builder.AddDiagnosticSource("DazzleDb", new DazzleDbMetricsObserver());
    }
}
The above code makes use of the AddDiagnosticSource extension to handle the boilerplate DiagnosticSource subscription logic and ensure subscribers are tracked.
Finally, register the instrumentation in your startup code:
services.AddObservability(builder =>
{
    builder.AddDazzleDbMetrics();
});
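To give a feel for the boilerplate that AddDiagnosticSource saves you from writing, here is a minimal sketch of subscribing an observer to a named DiagnosticListener using only System.Diagnostics. The library may well implement this differently; DiagnosticSourceSubscriber and CountingObserver are hypothetical names used for illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical sketch: attaches an event observer to every DiagnosticListener
// with a given name and tracks the subscriptions so they can be disposed.
public sealed class DiagnosticSourceSubscriber : IObserver<DiagnosticListener>, IDisposable
{
    private readonly string _sourceName;
    private readonly IObserver<KeyValuePair<string, object?>> _observer;
    private readonly List<IDisposable> _subscriptions = new List<IDisposable>();
    private readonly object _lock = new object();

    public DiagnosticSourceSubscriber(string sourceName, IObserver<KeyValuePair<string, object?>> observer)
    {
        _sourceName = sourceName ?? throw new ArgumentNullException(nameof(sourceName));
        _observer = observer ?? throw new ArgumentNullException(nameof(observer));
    }

    // Start watching for listeners. Listeners that already exist are
    // replayed to OnNext as part of the Subscribe call.
    public void Start()
    {
        IDisposable all = DiagnosticListener.AllListeners.Subscribe(this);
        lock (_lock) _subscriptions.Add(all);
    }

    // Called once per DiagnosticListener; subscribe only to the one we care about.
    public void OnNext(DiagnosticListener listener)
    {
        if (listener.Name != _sourceName) return;
        IDisposable subscription = listener.Subscribe(_observer);
        lock (_lock) _subscriptions.Add(subscription);
    }

    public void OnCompleted() { }
    public void OnError(Exception error) { }

    // Detach from all listeners so no further events are delivered.
    public void Dispose()
    {
        lock (_lock)
        {
            foreach (IDisposable subscription in _subscriptions) subscription.Dispose();
            _subscriptions.Clear();
        }
    }
}

// Minimal observer used to exercise the subscriber; it only counts events.
public sealed class CountingObserver : IObserver<KeyValuePair<string, object?>>
{
    public int Count { get; private set; }
    public void OnNext(KeyValuePair<string, object?> value) => Count++;
    public void OnCompleted() { }
    public void OnError(Exception error) { }
}
```

To try it, create a DiagnosticSourceSubscriber for "DazzleDb" with a CountingObserver, call Start(), and write an event through a DiagnosticListener named "DazzleDb". Events from listeners with other names are ignored, and disposing the subscriber stops delivery.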