awslabs/duvet

Report generation is not deterministic

Closed · 5 comments

eagr commented

The motivation is to avoid unnecessary report uploads/publishing.

Right now, h3 publishes a compliance report on every push to the main branch, by pushing to the gh-pages branch. Most of the time, the reports are identical.

Can we somehow get the data that is fed into the reporter in a deterministically serialized form? Then we could hash the data and compare the hashes to decide whether to proceed with uploading/publishing.
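A minimal sketch of the hash-and-compare idea, assuming the serialized report data is available as bytes. `should_publish` is a hypothetical helper, not part of duvet, and FNV-1a is only a dependency-free stand-in for a real content hash such as SHA-256. The standard library's `DefaultHasher` is avoided because its algorithm is not guaranteed to stay the same across Rust releases.

```rust
/// FNV-1a (64-bit): a small, fixed hash function, used here only as a
/// dependency-free stand-in for a real content hash such as SHA-256.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;

    let mut hash = OFFSET_BASIS;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Hypothetical helper: returns true when the report data differs from
/// what was published last time (or when nothing was published yet).
fn should_publish(previous_hash: Option<u64>, report_data: &[u8]) -> bool {
    previous_hash != Some(fnv1a_64(report_data))
}
```

In CI, the previous hash could be stored next to the published report and passed in as `previous_hash`. This only works if identical inputs always serialize to identical bytes.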

I believe report generation is deterministic... If not, it should be 😁. And since you're publishing to GH pages, the contents should already be hashed, since it's using git.

Ah, looks like there's a HashMap in there:

pub targets: HashMap<&'a Target, TargetReport<'a>>,

Should be easy to replace with a BTreeMap, which iterates in sorted key order.
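A rough sketch of why that swap helps. `HashMap` iteration order depends on a hasher that is randomly seeded by default, so the order can differ between runs, while `BTreeMap` always iterates in key order. The `&str` keys, `u32` values, and `serialize_targets` below are hypothetical stand-ins for duvet's `&'a Target` and `TargetReport<'a>`, and the swap assumes `Target` implements (or can derive) `Ord`.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for serializing the report's targets.
/// BTreeMap iterates in sorted key order, so the output depends only on
/// the map's contents, never on insertion order or hasher seed.
fn serialize_targets(targets: &BTreeMap<&str, u32>) -> String {
    let mut out = String::new();
    for (target, count) in targets {
        out.push_str(target);
        out.push('=');
        out.push_str(&count.to_string());
        out.push('\n');
    }
    out
}
```

Two maps built with the same entries in a different insertion order serialize to the same string, which is what hash comparison of the report data needs.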

eagr commented

I had kind of assumed that the overall output could not be deterministic, since there can be many moving pieces in JS land (class name generator, transpiler, minifier) that may not promise deterministic output.

Idk, maybe it's a false assumption.

The JS is currently built statically and published with the crate, so it'll also be the same between runs of the same duvet version.

eagr commented

So when we run duvet report, it basically just inserts serialized data into some <script> tag in an HTML file? Nice, that'll work.