HTTPArchive/wappalyzer

Defer the technology definitions, categories, and groups to a central repo for all to contribute to and benefit from?

MattMartin1919 opened this issue · 1 comment

Is your feature request related to a problem? Please describe.

This is not really a problem but an idea for the betterment of the old Wappalyzer community.
I have been a longtime fan of Wappalyzer and contributed / corrected a few technologies when it was open source. Now that it's private, many forks are popping up with no universal "source of truth" for the technologies. This is fragmenting the community and keeping it from working together towards a common goal.

Describe the solution you'd like

I think it would be beneficial for everyone if the Wappalyzer community can align on a good "source of truth" that can house the "technologies", "categories", and "groups" files. This will allow not only the JS forks to benefit, but also the repos that are ported into other languages.

This is one example repo that is actively being maintained and could be an option: https://github.com/enthec/webappanalyzer

Describe alternatives you've considered

Given the HTTPArchive's reputation, it may actually make sense for this "source of truth" to be under your GitHub organization so people know it's trustworthy and never going to be made private like before 🤦

Additional context

TBH, I don't actually care if this happens but I see people actively contributing to forks all over the place so I thought why not direct everyone to a single place to work together.

Hi @MattMartin1919, thanks for raising this. We spent a lot of time thinking about this problem when Wappalyzer shut down their public repo last year and collected our thoughts in this doc. Tracking a community fork was definitely one of the options we considered, but at the time it was unclear who would step up or how reliable it'd be.

> Given the HTTPArchive's reputation, it may actually make sense for this "source of truth" to be under your GitHub organization so people know it's trustworthy and never going to be made private like before 🤦

Maintaining our own set of detections was never our goal, so if there are reliable community repos out there, we should sync with those. However, it might still be too early to tell how reliable those other repos are. Maybe someone creates a repo but abandons it after they see how much work it is to maintain. Or maybe the quality of detections goes down. Not saying the specific repo you linked is in either of these camps, just that time will tell.

So let's leave this issue open for a few months to let things settle a bit more before we reevaluate.