PC: prometheus block time metrics are incorrectly reported

Question

Closed this issue a year ago · 8 comments

Looking at the histograms, we have too many samples per 30s. Since each parachain can include a block at most once every 12s, we should see 2-3 blocks per 30s on average. In the screenshot below we see more than 4x.
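
The expected count is simple arithmetic. A quick sketch, assuming the 12s minimum inclusion interval stated above (the function name is illustrative, not part of the tool):

```python
def expected_blocks(window_s: float, block_interval_s: float = 12.0) -> float:
    """Upper bound on the average number of blocks one parachain can
    include in a window, given the minimum interval between blocks."""
    return window_s / block_interval_s


# A 30s window with one block every 12s gives 2.5, i.e. 2-3 blocks.
print(expected_blocks(30))
```

Any panel showing noticeably more than this per 30s window for a single parachain is counting something other than included blocks.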

Answer 1 · 2023-03-29T11:59:50.000Z

I have manually checked the returned results and they seem quite sane (meaning the relation between the count and the actual buckets). Is the graph showing the count or the sum?

Answer 2 · 2023-03-29T12:22:26.000Z

avg(increase(introspector_pc_para_block_time_bucket{chain="$chain", parachain_id="$parachain_id"}[$__rate_interval])) by (le)

Answer 3 · 2023-03-29T17:50:51.000Z

@AndreiEres does our deployment subscribe to finalized or best blocks? We should follow the finalized chain for monitoring.
If it's best blocks, then the extra blocks are explained by parachains building on all forks.

Answer 4 · 2023-03-30T07:01:44.000Z

"does our deployment subscribe to finalized or best blocks?"

Yep, it subscribes to finalized blocks.

I also tried to reproduce the bug but didn't manage to. The block time I got is about 12s.

Answer 5 · 2023-03-30T07:17:28.000Z

The block time values are OK, but the issue is that too many blocks are reported, i.e. there are too many samples in a bucket.

Answer 6 · 2023-03-30T08:21:23.000Z

Also, maybe the issue is fixed in the latest version?

Answer 7 · 2023-03-31T10:43:50.000Z

"Also, maybe the issue is fixed in the latest version?"

I'm not sure we made any relevant changes that could have accidentally fixed it.

Answer 8 · 2023-05-11T13:53:06.000Z

Found that the problem was with the queries.
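
It is not recorded here which part of the queries was wrong. One well-known way a histogram query can overcount, offered only as a possible illustration, is that Prometheus `_bucket` series are cumulative: each `le` bucket also contains every sample from the smaller buckets. Reading a per-`le` result as if the buckets were independent inflates the totals. A minimal sketch with hypothetical bucket boundaries and values (not taken from the real metric):

```python
def per_bucket_counts(cumulative: dict[str, float]) -> dict[str, float]:
    """Convert cumulative {le: count} values into per-bucket counts
    by subtracting the previous (smaller) bucket from each one."""
    counts = {}
    prev = 0.0
    for le in sorted(cumulative, key=float):  # float("+Inf") sorts last
        counts[le] = cumulative[le] - prev
        prev = cumulative[le]
    return counts


# Hypothetical increase() result for one 30s window:
# 2 blocks took <= 12s, 1 block took between 12s and 18s.
cumulative = {"6": 0, "12": 2, "18": 3, "24": 3, "+Inf": 3}

naive_total = sum(cumulative.values())        # treats buckets as independent
true_total = cumulative["+Inf"]               # +Inf bucket equals the count
separated = per_bucket_counts(cumulative)

print(naive_total, true_total, separated)
```

In this made-up case the naive total is 11 while only 3 blocks were observed, which is the same kind of inflation as described in the question.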