/obs-opentelemetry

An extension library of OpenTelemetry for Kitex

Primary LanguageGoApache License 2.0Apache-2.0

opentelemetry (This is a community driven project)

English | 中文

Opentelemetry for Kitex

OpenTelemetry is an open source observability framework from CNCF that consist of a series of tools, APIs and SDKs, and it enables IT teams to detect, generate, collect, and export remote monitoring data for analysis and understanding of software performance and behavior.

The obs-opentelemetry extension is available in the kitex-contrib, which allows kitex to integrate OpenTelemetry with a simple setup.

Feature

Provider

  • Out-of-the-box default opentelemetry provider
  • Support setting via environment variables

Instrumentation

Tracing

  • Support server and client kitex rpc tracing
  • Support automatic transparent transmission of peer service through meta info

Metrics

  • Support kitex rpc metrics [R.E.D]
  • Support service topology map metrics [Service Topology Map]
  • Support go runtime metrics

Logging

  • Extend kitex logger based on logrus and zap
  • Implement tracing auto associated logs

Configuration via environment variables

Server usage

import (
    ...
    "github.com/kitex-contrib/obs-opentelemetry/provider"
    "github.com/kitex-contrib/obs-opentelemetry/tracing"
)


func main()  {
    serviceName := "echo"
	
    p := provider.NewOpenTelemetryProvider(
        provider.WithServiceName(serviceName),
        provider.WithExportEndpoint("localhost:4317"),
        provider.WithInsecure(),
    )
    defer p.Shutdown(context.Background())

    svr := echo.NewServer(
        new(EchoImpl),
        server.WithSuite(tracing.NewServerSuite()),
        // Please keep the same as provider.WithServiceName
        server.WithServerBasicInfo(&rpcinfo.EndpointBasicInfo{ServiceName: serviceName}),
    )
    if err := svr.Run(); err != nil {
        klog.Fatalf("server stopped with error:", err)
    } 	
}

Client usage

import (
    ...
    "github.com/kitex-contrib/obs-opentelemetry/provider"
    "github.com/kitex-contrib/obs-opentelemetry/tracing"
)

func main(){
    serviceName := "echo-client"
	
    p := provider.NewOpenTelemetryProvider(
        provider.WithServiceName(serviceName),
        provider.WithExportEndpoint("localhost:4317"),
        provider.WithInsecure(),
    )
    defer p.Shutdown(context.Background())
    
    c, err := echo.NewClient(
        "echo",
        client.WithSuite(tracing.NewClientSuite()),
        // Please keep the same as provider.WithServiceName
        client.WithClientBasicInfo(&rpcinfo.EndpointBasicInfo{ServiceName: serviceName}),
    )
    if err != nil {
        klog.Fatal(err)
    }
	
}

Tracing associated Logs

set logger impl

import (
    kitexlogrus "github.com/kitex-contrib/obs-opentelemetry/logging/logrus"
)

func init()  {
    klog.SetLogger(kitexlogrus.NewLogger())
    klog.SetLevel(klog.LevelDebug)

}

log with context

// Echo implements the Echo interface.
func (s *EchoImpl) Echo(ctx context.Context, req *api.Request) (resp *api.Response, err error) {
	klog.CtxDebugf(ctx, "echo called: %s", req.GetMessage())
	return &api.Response{Message: req.Message}, nil
}

view log

{"level":"debug","msg":"echo called: my request","span_id":"056e0cf9a8b2cec3","time":"2022-03-09T02:47:28+08:00","trace_flags":"01","trace_id":"33bdd3c81c9eb6cbc0fbb59c57ce088b"}

Example

Executable Example

Supported Metrics

RPC Metrics

Kitex Server

Below is a table of RPC server metric instruments.

Name Instrument Unit Unit (UCUM) Description Status Streaming
rpc.server.duration Histogram milliseconds ms measures duration of inbound RPC Recommended N/A. While streaming RPCs may record this metric as start-of-batch to end-of-batch, it's hard to interpret in practice.

Kitex Client

Below is a table of RPC client metric instruments. These apply to traditional RPC usage, not streaming RPCs.

Name Instrument Unit Unit (UCUM) Description Status Streaming
rpc.client.duration Histogram milliseconds ms measures duration of outbound RPC Recommended N/A. While streaming RPCs may record this metric as start-of-batch to end-of-batch, it's hard to interpret in practice.

R.E.D

The RED Method defines the three key metrics you should measure for every microservice in your architecture. We can calculate RED based on rpc.server.duration.

Rate

the number of requests, per second, you services are serving.

eg: QPS

sum(rate(rpc_server_duration_count{}[5m])) by (service_name, rpc_method)

Errors

the number of failed requests per second.

eg: Error ratio

sum(rate(rpc_server_duration_count{status_code="Error"}[5m])) by (service_name, rpc_method) / sum(rate(rpc_server_duration_count{}[5m])) by (service_name, rpc_method)

Duration

distributions of the amount of time each request takes

eg: P99 Latency

histogram_quantile(0.99, sum(rate(rpc_server_duration_bucket{}[5m])) by (le, service_name, rpc_method))

Service Topology Map

The rpc.server.duration will record the peer service and the current service dimension. Based on this dimension, we can aggregate the service topology map

sum(rate(rpc_server_duration_count{}[5m])) by (service_name, peer_service)

Runtime Metrics

Name Instrument Unit Unit (UCUM)) Description
process.runtime.go.cgo.calls Sum - - Number of cgo calls made by the current process.
process.runtime.go.gc.count Sum - - Number of completed garbage collection cycles.
process.runtime.go.gc.pause_ns Histogram nanosecond ns Amount of nanoseconds in GC stop-the-world pauses.
process.runtime.go.gc.pause_total_ns Histogram nanosecond ns Cumulative nanoseconds in GC stop-the-world pauses since the program started.
process.runtime.go.goroutines Gauge - - measures duration of outbound RPC.
process.runtime.go.lookups Sum - - Number of pointer lookups performed by the runtime.
process.runtime.go.mem.heap_alloc Gauge bytes bytes Bytes of allocated heap objects.
process.runtime.go.mem.heap_idle Gauge bytes bytes Bytes in idle (unused) spans.
process.runtime.go.mem.heap_inuse Gauge bytes bytes Bytes in in-use spans.
process.runtime.go.mem.heap_objects Gauge - - Number of allocated heap objects.
process.runtime.go.mem.live_objects Gauge - - Number of live objects is the number of cumulative Mallocs - Frees.
process.runtime.go.mem.heap_released Gauge bytes bytes Bytes of idle spans whose physical memory has been returned to the OS.
process.runtime.go.mem.heap_sys Gauge bytes bytes Bytes of idle spans whose physical memory has been returned to the OS.
runtime.uptime Sum ms ms Milliseconds since application was initialized.

Compatibility

The sdk of OpenTelemetry is fully compatible with 1.X opentelemetry-go. see

maintained by: CoderPoet

Dependencies

Library/Framework Versions Notes
go.opentelemetry.io/otel v1.19.0
go.opentelemetry.io/otel/trace v1.19.0
go.opentelemetry.io/otel/metric v1.19.0
go.opentelemetry.io/contrib/instrumentation/runtime v0.45.0
kitex v0.7.3