wingman-jr-addon/wingman_jr

Failure to block images on Telegram web

TheCranky opened this issue · 5 comments

Wingman Jr. fails to block images sent through Telegram Web.

@TheCranky I'll have to look into this further. The first thing I have to do when it is not blocking is to determine whether it's a failure of the AI or a failure of the scanning mechanism - e.g. the image isn't even scanned. Can you provide some more specific details around what you tried to do and how it failed? I myself haven't used Telegram so I'm not terribly familiar with what you'd like me to be checking for. Thanks!

Hi, I believe the extension is outright failing to scan media from web.telegram.org, because it neither indicates scanning progress nor switches zones in automatic mode. This is likely due to a unique feature of the website (perhaps for security) and not Wingman itself, as the functionality of some other extensions tends to break on it as well. However, I believe it should still be possible for Wingman to be coded to function properly, as there are Telegram Tampermonkey userscripts which work on it (see https://greasyfork.org/en/scripts/379355-telegram-ad-filter as an example). There is also another script that fails to work (https://greasyfork.org/en/scripts/420808-hide-globalsearch). The main difference I found is that one activates on https://web.telegram.org/, which does not work, while the other activates on https://web.telegram.org/k/*, which does work. Hopefully this helps :)

@TheCranky Yes, that does help.
As a reference point, I'll explain how the addon works briefly and how that interacted in another case, Google Images.

The addon works - for the most part - by hooking the HTTP/S request/response streams for media types using webRequest. This works fairly well, but there are some holes. For example, Google Images seemed to work fairly well as you scrolled through results, but the first results - which were often quite bad - seemed to slip through. It took a while, but after some investigation I realized that they were coming through as base64-encoded images embedded in the HTML, which were not getting scanned by webRequest. So I needed to carve out a path where HTML was scanned for base64 images (with a little extra magic).
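To make the base64 case concrete, here is a minimal sketch of pulling inline images out of an HTML string. It is not the addon's actual code: `extractBase64Images` and `decodeBase64` are hypothetical names, and the regex is deliberately simplified (it ignores extra data: URL parameters and percent-encoded payloads).

```javascript
// Hypothetical helper (not Wingman Jr. code): find base64-encoded images
// embedded as data: URLs in an HTML string.
function extractBase64Images(html) {
  // Simplified pattern: "data:image/<subtype>;base64,<payload>"
  const pattern = /data:(image\/[a-z0-9.+-]+);base64,([A-Za-z0-9+/]+={0,2})/gi;
  const found = [];
  for (const match of html.matchAll(pattern)) {
    found.push({ mimeType: match[1].toLowerCase(), base64: match[2] });
  }
  return found;
}

// Hypothetical helper: turn a base64 payload into raw bytes that an
// image classifier could consume.
function decodeBase64(base64) {
  const binary = atob(base64);
  return Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
}
```

Each result could then be decoded with `decodeBase64(image.base64)` and handed to whatever does the scanning; how the HTML is rewritten afterwards is outside this sketch.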

So, in the case of Telegram, it could be something like that, or it could be that the actual delivery of the images themselves is encrypted. I don't have any fancy domain/URL filtering (other than the optional Cloudflare DNS-based blocking), so it is likely something in the delivery mechanism.
As a side note, I had considered the "look at the img element as it gets put in the DOM" approach, like nsfw-filter now uses, but there are some big tradeoffs there that I think they haven't solved yet. So it'll be interesting to see if we find an edge case here.

@TheCranky I had a chance to look at this a bit, and you're right - Wingman isn't scanning at all.
When I looked at the network monitor, Telegram seems to work by communicating everything encrypted over WebSocket connections - great for privacy, less great for Wingman.
Unfortunately, as noted in the last response, in order to be able to catch these I'd have to do a decent bit of rearchitecture - basically I'd have to start working with content scripts and handle things with MutationObserver, then ship filtering off to the processing scripts. I wish that webRequest.filterResponseData also supported data: URLs, as that would make this and e.g. Google Images seamless, but it does not, even though I've asked.
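For reference, the content-script approach described above might look roughly like the sketch below. This is an assumption-laden illustration, not a planned implementation: `collectImages`, `requestScan`, and the `"scan-image"` message type are made-up names, and only `MutationObserver` and `browser.runtime.sendMessage` are real APIs.

```javascript
// Hypothetical content-script sketch; none of these names come from Wingman Jr.

// Collect <img> elements from a list of added nodes, including nested ones.
function collectImages(addedNodes) {
  const images = [];
  for (const node of addedNodes) {
    if (node.nodeType !== 1) continue; // 1 = ELEMENT_NODE; skip text and comments
    if (node.tagName === "IMG") images.push(node);
    if (typeof node.querySelectorAll === "function") {
      images.push(...node.querySelectorAll("img"));
    }
  }
  return images;
}

// Hypothetical hook: ask the background (processing) script to scan an image.
// "scan-image" is an invented message type for illustration only.
function requestScan(img) {
  return browser.runtime.sendMessage({ type: "scan-image", src: img.src });
}

// Only wire up the observer when running inside a browser extension context.
if (typeof MutationObserver !== "undefined" && typeof browser !== "undefined") {
  const observer = new MutationObserver((mutations) => {
    for (const mutation of mutations) {
      for (const img of collectImages(mutation.addedNodes)) {
        requestScan(img);
      }
    }
  });
  // Watch the whole document for newly inserted nodes.
  observer.observe(document.documentElement, { childList: true, subtree: true });
}
```

As written this only notices newly inserted elements; an existing img whose src changes later would need attribute observation as well, and hiding the image until a verdict comes back is one of the tradeoffs this sketch does not address.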

Maybe I'll think of some other strategy, but for now I'm probably going to leave this as unsupported. 😞

Closing for now.