Experiment with Mutation Summary library
gunesacar opened this issue · 9 comments
The Mutation Summary library claims to make DOM monitoring easier and more efficient. We decided to give it a try to see whether it would be useful.
Short video explaining how Mutation Summary works.
We decided to integrate this library into OpenWPM and test on some product pages with social proof messages.
I created a few JavaScript examples to help determine how useful Mutation Summary might be in our case.
I integrated MutationSummary into OpenWPM and started processing mutation summary events:
https://github.com/gunesacar/OpenWPM/blob/661439178861217c398bd31aa07fd3ce55103508/automation/Extension/firefox/data/content.js#L2048-L2105
For the moment I prepended the mutation library code to content.js. I could not immediately use the library from content.js when I added it separately, perhaps due to timing issues.
Here are the logs for the slightly modified version of the jsbin you shared:
https://gist.github.com/gunesacar/f5f9b43670bb63dd4fd8c94639116041
I log the node type, text content (and wholeText), the ID assigned by the library, and the old value of attributes and text content when available. See the console.log calls in content.js for the details.
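For readers who don't want to dig through content.js, here is a minimal sketch of this kind of logging. It is not the code from the linked commit: `describeSummary` and the record fields are hypothetical names, and the library-assigned ID is left out. It assumes the summary shape the library produces for an `{ all: true }` query (`added`, `removed`, `characterDataChanged`, `attributeChanged`, plus `getOldCharacterData` and `getOldAttribute`).

```javascript
// Sketch only: describeSummary and its record format are hypothetical,
// not the code in content.js.
function describeSummary(summary) {
  const records = [];

  // Fields that are logged for every node, whatever the change was.
  const describeNode = (node) => ({
    nodeType: node.nodeType,
    nodeName: node.nodeName,
    textContent: node.textContent,
    // wholeText only exists on Text nodes (nodeType 3).
    wholeText: node.nodeType === 3 ? node.wholeText : undefined,
  });

  for (const node of summary.added || []) {
    records.push({ change: 'added', ...describeNode(node) });
  }
  for (const node of summary.removed || []) {
    records.push({ change: 'removed', ...describeNode(node) });
  }
  for (const node of summary.characterDataChanged || []) {
    records.push({
      change: 'textChanged',
      ...describeNode(node),
      oldValue: summary.getOldCharacterData(node),
    });
  }
  // attributeChanged maps an attribute name to the elements where it changed.
  const attributeChanged = summary.attributeChanged || {};
  for (const attrName of Object.keys(attributeChanged)) {
    for (const el of attributeChanged[attrName]) {
      records.push({
        change: 'attributeChanged',
        ...describeNode(el),
        attribute: attrName,
        oldValue: summary.getOldAttribute(el, attrName),
      });
    }
  }
  return records;
}

// Only runs where the library and a DOM are available (e.g. a content script).
if (typeof MutationSummary !== 'undefined') {
  new MutationSummary({
    callback: (summaries) =>
      summaries.forEach((s) => console.log(describeSummary(s))),
    queries: [{ all: true }],
  });
}
```

Keeping the formatting in a plain function makes it possible to exercise it with fake summary objects outside the browser.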
I also included the logs from loading the princeton.edu homepage, which give an idea of what we should filter.
I started filtering summaries by element type and extracting relevant info, including visibility and computed style:
https://github.com/gunesacar/OpenWPM/blob/027c460ddc98cb0eee9701070dcebe71faa0ef5d/automation/Extension/firefox/data/content.js#L2099
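Roughly, the filtering and extraction step could look like the sketch below. This is an illustration rather than the code at the link above: the ignored tag list, the function names and the returned fields are all assumptions, and the visibility check is a rough heuristic.

```javascript
// Hypothetical helpers; the tag list and field names are assumptions.
const IGNORED_TAGS = new Set(['SCRIPT', 'STYLE', 'LINK', 'META', 'NOSCRIPT']);

// Keep only element nodes (nodeType 1) that can render visible content.
function isRelevantElement(node) {
  return node.nodeType === 1 && !IGNORED_TAGS.has(node.nodeName.toUpperCase());
}

// getStyle is injected so the function can be tested without a browser;
// in a content script pass (el) => window.getComputedStyle(el).
function extractElementInfo(el, getStyle) {
  const style = getStyle(el);
  const rect = el.getBoundingClientRect();
  // Rough heuristic: rendered, not hidden, not fully transparent, non-empty box.
  const visible =
    style.display !== 'none' &&
    style.visibility !== 'hidden' &&
    parseFloat(style.opacity) !== 0 &&
    rect.width > 0 &&
    rect.height > 0;
  return {
    tag: el.nodeName.toLowerCase(),
    text: (el.textContent || '').trim(),
    visible,
    style: {
      display: style.display,
      visibility: style.visibility,
      opacity: style.opacity,
      position: style.position,
      zIndex: style.zIndex,
    },
    rect: { x: rect.left, y: rect.top, width: rect.width, height: rect.height },
  };
}
```

In the summary callback this would be used along the lines of `summary.added.filter(isRelevantElement).map((el) => extractElementInfo(el, (e) => window.getComputedStyle(e)))`.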
I used this JSfiddle for development: http://jsfiddle.net/xdqncyrp/76/
Feel free to fork.
Updated logs I got by loading princeton.edu homepage:
https://gist.github.com/gunesacar/b13f23769e75b62896fab5e43f978f4e
@aruneshmathur do we have examples using CSS Transitions? I wonder how those will show up in the summaries.
We don't currently, but I can create some. I'm not hopeful we'll see them through Mutation Summary unless they are injected via JS into the page.
It appears that the Mutation Summary library detects transient elements fairly reliably and works well within OpenWPM as a content script.
The challenge we have is to figure out how to integrate mutations into the data collection pipeline, especially segmenting and clustering. For instance, would we re-segment every time an element is added to or removed from the DOM? Elements with dynamically changing text (e.g. countdown timers) present another challenge: how to represent features when they change rapidly for some elements and do not change at all for others.
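As one possible starting point for the rapidly changing text problem (purely a sketch, not something we have implemented), the pipeline could tally how often each node's text changes and treat frequently changing nodes separately. `TextChangeTracker` and the `minChanges` threshold are hypothetical names.

```javascript
// Hypothetical sketch: count text changes per node across summaries.
class TextChangeTracker {
  constructor() {
    // Keyed by node. A Map keeps nodes alive, so this is only suitable
    // for short-lived page visits.
    this.counts = new Map();
  }

  // Feed every summary delivered to the MutationSummary callback.
  record(summary) {
    for (const node of summary.characterDataChanged || []) {
      this.counts.set(node, (this.counts.get(node) || 0) + 1);
    }
  }

  // Nodes whose text changed at least minChanges times.
  volatileNodes(minChanges) {
    return [...this.counts]
      .filter(([, count]) => count >= minChanges)
      .map(([node]) => node);
  }
}
```

One caveat: depending on how a page updates the text, the change may be reported under `added` and `removed` (a replaced text node) rather than `characterDataChanged`, so a real version would probably need to look at both.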
To focus on setting up the data collection and processing pipeline, we've decided to ignore transient events for the time being.
The code lives here: https://github.com/gunesacar/OpenWPM/blob/mutation_summary/automation/Extension/firefox/data/content.js