Define a global entropy measurement for strings and literals

Question

maxfisher-g opened this issue a year ago · 1 comment

Entropy calculations currently use per-file character frequency counts to define the expected probabilities for each character. It would be better to measure character frequencies on a large dataset of source files and then use the same frequency counts to analyse all packages.
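
To make the proposal concrete, here is a minimal Python sketch of scoring a string against a shared frequency table instead of one built per file. It is not the project's implementation. The function names, the add-one smoothing, and the `alphabet_size` of 256 are all assumptions made for illustration. The score is the average number of bits per character of the string under the global model (a cross-entropy).

```python
import math
from collections import Counter


def build_frequency_table(texts):
    """Count character occurrences across a corpus of source texts.

    Hypothetical helper: in practice the corpus would be a large
    dataset of source files, and the table would be computed once.
    """
    counts = Counter()
    for text in texts:
        counts.update(text)
    return counts


def cross_entropy(s, counts, alphabet_size=256):
    """Average bits per character of `s` under the global model.

    Add-one smoothing (an assumption of this sketch) gives characters
    that never appear in the corpus a small non-zero probability.
    `alphabet_size` is an assumed bound on the number of distinct
    characters.
    """
    if not s:
        return 0.0
    total = sum(counts.values()) + alphabet_size
    bits = 0.0
    for ch in s:
        p = (counts.get(ch, 0) + 1) / total
        bits -= math.log2(p)
    return bits / len(s)


# The table is built once and reused for every string that is scored.
global_counts = build_frequency_table(["aaaa", "aaab"])
common = cross_entropy("a", global_counts)
rare = cross_entropy("z", global_counts)
```

Because every package is scored against the same table, the values for two strings from different files can be compared directly. A character that is rare in the corpus contributes more bits than a common one, whatever file it appears in.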

Answer 1 · 2023-10-24T03:18:55.000Z

It will be easier to measure character frequencies once we have static analysis data in BigQuery.