paritytech/polkadot-introspector

PC: prometheus block time metrics are incorrectly reported

Closed this issue · 8 comments

Looking at the histograms, we have too many samples per 30s. Since each parachain can include a block at most once every 12s, we should see 2-3 blocks per 30s on average. In the screenshot below we see more than 4x that.
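The expected sample count can be checked with a quick calculation. This is only a sketch of the arithmetic; the function name `expected_blocks` is made up for illustration and is not part of polkadot-introspector.

```python
def expected_blocks(window_s: float, min_block_time_s: float) -> float:
    """Upper bound on the average number of blocks a single parachain
    can include in a window, given its minimum block time."""
    return window_s / min_block_time_s


# 30s window, one block at most every 12s
print(expected_blocks(30, 12))
```

Anything well above this value per 30s window for a single parachain suggests the samples are being counted more than once somewhere.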

[Screenshot 2023-03-20 at 19 10 59]

I have manually checked the returned results and they seem quite sane (in terms of the relation between the count and the actual buckets). Is the graph showing the count or the sum?

avg(increase(introspector_pc_para_block_time_bucket{chain="$chain", parachain_id="$parachain_id"}[$__rate_interval])) by (le)
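For context on how the `le` series in a query like this relate to the sample count: Prometheus histogram buckets are cumulative, so each `le` bucket already contains every sample from the smaller buckets. The sketch below uses made-up numbers (three blocks in one window) to show how reading the cumulative values as independent per-bucket counts inflates the apparent number of samples. It only illustrates the bucket/count relation and is not a confirmed diagnosis of this issue.

```python
import math


def cumulative_to_per_bucket(cumulative):
    """Convert cumulative `le` bucket counts into per-bucket counts.

    `cumulative` maps an upper bound (le) to the number of samples
    that are less than or equal to that bound.
    """
    per_bucket = {}
    previous = 0
    for le in sorted(cumulative):
        per_bucket[le] = cumulative[le] - previous
        previous = cumulative[le]
    return per_bucket


# Hypothetical window with 3 blocks whose block times are 11.9s, 12.0s and 12.1s.
# The bucket bounds are assumptions, not the introspector's real bounds.
window = {6.0: 0, 12.0: 2, 18.0: 3, math.inf: 3}

per_bucket = cumulative_to_per_bucket(window)
true_count = window[math.inf]        # the +Inf bucket equals the sample count
naive_total = sum(window.values())   # adding cumulative buckets overcounts

print(per_bucket)
print(true_count, naive_total)
```

In this example the true count is 3, while adding up the cumulative buckets gives 8, so a panel that stacks or sums the `le` series can show far more "samples" than were actually observed.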

@AndreiEres does our deployment subscribe to finalized or best blocks? We should follow the finalized chain for monitoring.
If it's best blocks, then the extra blocks are explained by parachains building on all forks.
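To illustrate the fork hypothesis, here is a minimal sketch with an invented relay chain block tree. The `RelayBlock` type, the hashes, and the helper names are all hypothetical and do not reflect the introspector's data model.

```python
from collections import namedtuple

# Hypothetical relay chain block: its hash, its parent hash, and whether
# it includes a block of the parachain we are monitoring.
RelayBlock = namedtuple("RelayBlock", ["hash", "parent", "includes_para_block"])


def count_included(blocks):
    """Count relay blocks that include a parachain block."""
    return sum(1 for b in blocks if b.includes_para_block)


def finalized_chain(blocks, finalized_head):
    """Walk parent links from the finalized head back to the oldest known block."""
    by_hash = {b.hash: b for b in blocks}
    chain = []
    cursor = finalized_head
    while cursor in by_hash:
        chain.append(by_hash[cursor])
        cursor = by_hash[cursor].parent
    return chain


# "b1" and "b2" are competing forks at the same height; only "b1" gets finalized.
blocks = [
    RelayBlock("a", None, False),
    RelayBlock("b1", "a", True),
    RelayBlock("b2", "a", True),
    RelayBlock("c1", "b1", False),
]

seen_on_all_forks = count_included(blocks)
seen_on_finalized = count_included(finalized_chain(blocks, "c1"))
print(seen_on_all_forks, seen_on_finalized)
```

Following every fork counts the parachain block twice here, while following only the finalized chain counts it once.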

> does our deployment subscribe to finalized or best blocks

Yep, it subscribes to finalized blocks.

I also tried to reproduce the bug but didn't manage to. The block time I got is about 12s.

The block time values are OK, but the issue is that too many blocks are reported, i.e. there are too many samples in a bucket.

Also, maybe the issue is fixed in the latest version?

> Also, maybe the issue is fixed in latest version ?

I don't think we made any relevant changes that could have fixed it accidentally.

Found that the problem was with the queries.