KirillOsenkov/MSBuildStructuredLog

Diff two binlogs

KirillOsenkov opened this issue · 18 comments

I actually am in need of this functionality which is why I forked the project! What direction were you thinking of going to support this?

For my use case, it would be cool if both log trees were shown with filters. The filters could behave similarly to diffing two folders in Araxis (All, Diff). When Diff is selected, only the MSBuild targets that differ are shown in each tree, and when a node is selected or double-clicked, a standard diff window is shown with green for additions and red for missing items on each side (not sure if a new window is best, or if using the standard tabbed details view is good - I really like that one, but screen real estate could be a problem).

I'm happy to help in any way I can :)

Yeah, it's tricky to get right, and I haven't thought it through yet.

For now you have two quick workarounds: use the Save as XML feature to get two .xml files for logs, and diff those using your favorite XML diffing tool (some of them are really good).

Or you can right-click on any node and say Copy subtree, which will copy the text contents of the subtree to clipboard. You can diff those as well.
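Once you have two text exports (from Save as XML or Copy subtree), any text diff will do. As a minimal sketch, assuming the two exports are already available as strings, Python's standard difflib can produce a unified diff. The function name `diff_text` and the sample lines in the usage note are made up for illustration and are not real viewer output.

```python
import difflib


def diff_text(left, right, left_name="left", right_name="right"):
    """Return the unified diff of two text blobs as a list of lines."""
    return list(
        difflib.unified_diff(
            left.splitlines(),
            right.splitlines(),
            fromfile=left_name,
            tofile=right_name,
            lineterm="",  # inputs were split without line endings
        )
    )
```

For example, `diff_text("Target Build\n  Task Csc\n", "Target Build\n  Task Csc\n  Task Copy\n")` reports `+  Task Copy` as the only added line.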

As for my plan, I was thinking of a new menu item, "Diff 2 .binlogs", where you select two .binlog files and it opens a single merged tree, where nodes are as follows:

  • node that exactly matches in both (including children) is semi-transparent with Opacity 30% (so it's easy to ignore things that haven't changed)
  • node that matches but has differences underneath is normal
  • node that is only on the left is red
  • node that is only on the right is green

I would also add heuristics to ignore non-important changes. For instance, for all nodes where the order of children doesn't matter, children will be sorted before diffing.
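The merged tree described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the viewer's implementation: `Node`, `DiffNode`, the `unordered` flag, and the status names are hypothetical, nodes are matched by their text alone, and Python's difflib.SequenceMatcher stands in for whatever sequence diff the real feature would use. A UI could then render "same" at 30% opacity, "left_only" in red, and "right_only" in green.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a log tree node: a label plus children."""
    text: str
    children: List["Node"] = field(default_factory=list)
    unordered: bool = False  # True when the order of children carries no meaning


@dataclass
class DiffNode:
    text: str
    status: str  # "same", "changed", "left_only" or "right_only"
    children: List["DiffNode"] = field(default_factory=list)


def mark(node: Node, status: str) -> DiffNode:
    """Copy a subtree that exists on one side only, tagging every node."""
    return DiffNode(node.text, status, [mark(c, status) for c in node.children])


def merge(left: Node, right: Node) -> DiffNode:
    """Merge two nodes that already match by text into one diff node."""
    lkids, rkids = list(left.children), list(right.children)
    if left.unordered and right.unordered:
        # Heuristic from the thread: sort children when order is irrelevant.
        lkids.sort(key=lambda n: n.text)
        rkids.sort(key=lambda n: n.text)

    matcher = SequenceMatcher(
        a=[c.text for c in lkids], b=[c.text for c in rkids], autojunk=False
    )
    merged: List[DiffNode] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            merged.extend(merge(l, r) for l, r in zip(lkids[i1:i2], rkids[j1:j2]))
        else:  # "delete", "insert" or "replace"
            merged.extend(mark(c, "left_only") for c in lkids[i1:i2])
            merged.extend(mark(c, "right_only") for c in rkids[j1:j2])

    # A node is "same" only if everything underneath it is "same" too.
    status = "same" if all(c.status == "same" for c in merged) else "changed"
    return DiffNode(left.text, status, merged)


left = Node("Build", [Node("Target A", [Node("Message x")]), Node("Target B"), Node("Target C")])
right = Node("Build", [Node("Target A", [Node("Message y")]), Node("Target C"), Node("Target D")])
diff = merge(left, right)
print([c.status for c in diff.children])  # ['changed', 'left_only', 'same', 'right_only']
```

Matching by text alone is the weakest part of this sketch; real log nodes would likely need a richer notion of identity before the diff is useful.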

That will be a good start. I hope you are working on it already, but I can help with it if you need any help.

I'm waiting for Visual Studio 16.6 to ship. MSBuild there has a really important new feature that allows parenting a project under the MSBuild task that started it. This will greatly improve the diffing experience. After that is available I'll start on the feature.

That is good to know. I was wondering why it is not already doing that. Do you have a link where I can check the status of that change?

It is already in MSBuild master and will be available in Visual Studio 16.6. I believe it is already available if you install a Preview of Visual Studio 16.6.

Here's the PR where this was fixed: dotnet/msbuild#5013

One can use this to diff two lists: https://github.com/praeclarum/ListDiff

Unfortunately there's still this bug that would make diffs a bit complicated:
dotnet/msbuild#5473

I would absolutely love this feature for comparing changes between two builds. In particular, the ability to compare specific targets with path normalization (if I built the project in one repo and then built it in another repo on a different commit).
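The path normalization mentioned here could look something like the sketch below: replace each build's repository root with a shared placeholder before comparing text. The function name `normalize_paths`, the `<repo>` placeholder, and the sample paths in the usage note are all assumptions for illustration. Case-insensitive matching is assumed because Windows paths are typically case-insensitive.

```python
import re


def normalize_paths(text, repo_root, placeholder="<repo>"):
    """Replace occurrences of repo_root in text with a neutral placeholder."""
    root = repo_root.rstrip("\\/")
    # The lookahead avoids rewriting a longer sibling path such as "repoA2".
    pattern = re.compile(re.escape(root) + r"(?![\w.\-])", re.IGNORECASE)
    # A function replacement keeps backslashes in the placeholder literal.
    return pattern.sub(lambda match: placeholder, text)
```

With this, `normalize_paths(r"Copying C:\src\repoA\bin\app.dll", r"C:\src\repoA")` and `normalize_paths(r"Copying D:\work\repoB\bin\app.dll", r"D:\work\repoB")` both yield `Copying <repo>\bin\app.dll`, so the two lines no longer show up as a difference.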

@KirillOsenkov Did you ever try this to get a diff and did it work? https://github.com/praeclarum/ListDiff

Yes, I’m using ListDiff for comparing flat sequences and it’s great. However, the complication when diffing binlogs is elsewhere: we need to unify the two logs and remove a lot of noise to highlight the signal.

I've noticed there's some in-progress prototype work in the branches of this fork:
https://github.com/pchaurasia14/MSBuildStructuredLog/branches

Also, here's a helper I have for diffing sorted sequences in O(n) time with no extra memory:
https://gist.github.com/KirillOsenkov/72dc4b0b0a864c461b8c54dee61846d5
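For readers who want the idea without opening the gist, here is a minimal Python sketch of a linear-time diff over two sorted sequences. It is not the code from the gist; the function name `diff_sorted` and the tag names are made up. It assumes both inputs are sorted ascending and walks them in lockstep, so it needs only constant extra memory.

```python
from typing import Iterable, Iterator, Tuple


def diff_sorted(left: Iterable, right: Iterable) -> Iterator[Tuple[str, object]]:
    """Yield ("same" | "left_only" | "right_only", item) for two sorted inputs."""
    done = object()  # sentinel marking an exhausted iterator
    li, ri = iter(left), iter(right)
    l, r = next(li, done), next(ri, done)

    while l is not done and r is not done:
        if l == r:
            yield ("same", l)
            l, r = next(li, done), next(ri, done)
        elif l < r:
            yield ("left_only", l)  # the right side has already moved past l
            l = next(li, done)
        else:
            yield ("right_only", r)
            r = next(ri, done)

    # Whatever remains on one side has no counterpart on the other.
    while l is not done:
        yield ("left_only", l)
        l = next(li, done)
    while r is not done:
        yield ("right_only", r)
        r = next(ri, done)
```

This pairs naturally with the idea of sorting children whose order does not matter: once both child lists are sorted, a single pass like this is enough to find what was added or removed.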

It would also be nice to be able to diff two subtrees in the same binlog, e.g., when a project is built twice.

As a workaround, you can right-click any node, choose either Copy Children or Copy Subtree, and then diff the two texts.

I do understand this is suboptimal.

@KirillOsenkov, do you still consider it as a valuable piece of work?
I find myself struggling without this feature more and more often :)

Yes, it is much needed but also super hard. I've been mustering the courage for it.