This is a Ruby port of Wappalyzer, which identifies technologies on websites. It detects content management systems, ecommerce platforms, JavaScript frameworks, analytics tools and much more.
This port provides some additional options and nicer parsing of the technologies listed by Wappalyzer.
I found a few Ruby ports, but Wappalyzer now uses the Chrome browser to test technologies, and most of those ports were not using a browser for their tests. This port uses Ferrum to drive Chrome, which is very fast.
Furthermore, most ports ship a cached copy of the technologies.json provided by Wappalyzer, which becomes outdated after a while. With this port, this JSON file is downloaded and cached for the user (and can be updated on demand).
Add this line to your application's Gemfile:
gem 'wappalyzer', github: 'nikhgupta/wappalyzer', branch: 'master'
And then execute:
bundle install
To analyze a website, use:
Wappalyzer.analyze "https://codewithsense.com/" # a web page
Wappalyzer.analyze "codewithsense.com" # or, just the domain
Note that technologies data is cached in memory to speed up queries after the first run (rather than reading the JSON file from disk each time). To force the technologies data to be re-read from disk for an analysis, use:
Wappalyzer.analyze "codewithsense.com", refresh: true
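The caching behaviour described above can be pictured with a small sketch. This is not the gem's implementation: the TechnologyCache class, its fetch method and the disk_reads counter are hypothetical names introduced for illustration, and a temporary file stands in for the cached technologies.json.

```ruby
require "json"
require "tempfile"

# Hypothetical illustration of in-memory caching; not the gem's actual code.
class TechnologyCache
  attr_reader :disk_reads

  def initialize(path)
    @path = path
    @data = nil
    @disk_reads = 0
  end

  # Returns the parsed JSON, reading from disk only on the first call
  # or when refresh: true is passed.
  def fetch(refresh: false)
    if @data.nil? || refresh
      @data = JSON.parse(File.read(@path))
      @disk_reads += 1
    end
    @data
  end
end

# A temporary file stands in for the cached technologies.json.
file = Tempfile.new(["technologies", ".json"])
file.write('{"technologies":{"WordPress":{"cats":[1]}}}')
file.flush

cache = TechnologyCache.new(file.path)
cache.fetch                      # first call reads from disk
cache.fetch                      # second call is served from memory
reads_before_refresh = cache.disk_reads
cache.fetch(refresh: true)       # forces another read from disk
reads_after_refresh = cache.disk_reads
```

In this sketch the refresh flag only bypasses the in-memory copy; it does not download anything.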
If you already have the data for a page, you can pass it in directly:
# This does not spawn a browser process, so technologies on the page
# are not scored via JavaScript detection.
Wappalyzer.analyze "codewithsense.com", page: {
  meta: { generator: ['WordPress'] },
  headers: { server: ['Nginx'] },
  scripts: ['jquery-3.0.0.js'],
  cookies: { awselb: [''] },
  html: '<div ng-app="">'
}
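To see why pre-collected page data is enough for detection without a browser, here is a rough sketch of pattern matching against such a hash. The rules hash, the value_matches? helper and the detect method are all hypothetical; the real rules live in technologies.json and the gem's own matching logic may differ considerably.

```ruby
# Hypothetical rules, loosely inspired by the kind of patterns found in
# technologies.json. These are illustrative only.
rules = {
  "WordPress" => { meta: { generator: /wordpress/i } },
  "Nginx"     => { headers: { server: /nginx/i } },
  "jQuery"    => { scripts: /jquery/i },
  "AngularJS" => { html: /<[^>]+\sng-app/i },
  "Drupal"    => { meta: { generator: /drupal/i } }
}

# True if any of the given values (a string, an array, or nil) matches.
def value_matches?(pattern, values)
  Array(values).any? { |value| pattern.match?(value.to_s) }
end

# Returns the names of all rules with at least one matching field.
def detect(page, rules)
  rules.select do |_name, rule|
    rule.any? do |field, matcher|
      if matcher.is_a?(Hash)
        # Keyed fields such as meta and headers: match per key.
        matcher.any? { |key, pattern| value_matches?(pattern, page.dig(field, key)) }
      else
        # Flat fields such as scripts and html: match the value(s) directly.
        value_matches?(matcher, page[field])
      end
    end
  end.keys
end

page = {
  meta: { generator: ['WordPress'] },
  headers: { server: ['Nginx'] },
  scripts: ['jquery-3.0.0.js'],
  cookies: { awselb: [''] },
  html: '<div ng-app="">'
}

detected = detect(page, rules)
```

Everything here works on static strings, which is why no browser process is needed; anything that depends on evaluating JavaScript in the page is out of reach for this kind of matching.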
If you want to score technologies on the page with JavaScript detection, you should pass in a Ferrum instance for the browser (or page):
page.goto("codewithsense.com") # inside your pipeline
# The browser will navigate to the URL being analyzed if that's not the current page.
Wappalyzer.analyze "codewithsense.com", page: page
The same option can also be used to provide your own browser object with custom configuration options, for example:
# will navigate to codewithsense.com
Wappalyzer.analyze "codewithsense.com", page: Ferrum::Browser.new(
  timeout: 30, js_errors: false, window_size: [1366, 768],
  browser_options: {
    'ignore-certificate-errors' => nil,
    'no-sandbox' => nil,
    'disable-gpu' => nil,
    'allow-running-insecure-content' => nil,
    'disable-web-security' => nil,
    'user-data-dir' => ENV.fetch('CHROMIUM_DATA_DIR', '/tmp/chromium')
  }
) # default options
To get data regarding all available technologies, use:
Wappalyzer.data
technologies.json will be downloaded and cached automatically on first analysis.
You can manually update the cached JSON using:
Wappalyzer.update! # caches to ~/.wappalyzer.json
You can customize the rules before they are cached to disk by passing a block:
# add the last category to all technologies
# NOTE: this block will also be run for `ours` option.
Wappalyzer.update! do |name, data, categories|
  data[:categories] += categories.last
  data # do not forget to return this back
end
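The reason the block must return the data can be seen in a minimal sketch of how such a hook might be applied. The customize method and the sample data below are hypothetical and are not taken from the gem; the sketch only assumes that the block's return value replaces the technology's entry.

```ruby
# Hypothetical helper: runs the block once per technology and stores
# whatever the block returns as the new entry for that technology.
def customize(technologies, categories)
  technologies.each_with_object({}) do |(name, data), result|
    result[name] = block_given? ? yield(name, data, categories) : data
  end
end

# Made-up sample data for illustration.
technologies = {
  "WordPress" => { categories: [1] },
  "Nginx"     => { categories: [22] }
}
categories = [1, 22, 62]

# Add the last category to every technology, returning the updated hash.
customized = customize(technologies, categories) do |_name, data, cats|
  data.merge(categories: data[:categories] | [cats.last])
end
```

If the block returned nil instead (for example by ending with a puts call), every entry in this sketch would become nil, which is the failure mode the "do not forget to return this back" comment warns about.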
If you prefer to store the downloaded technologies.json at a custom path, pass that path to update!, and then to every later data or analyze call:
Wappalyzer.update!(cache: "/path/to/some.json")
# after:
Wappalyzer.data(cache: "/path/to/some.json")
Wappalyzer.analyze("codewithsense.com", cache: "/path/to/some.json")
If Wappalyzer changes the URL for this JSON file, or if you want to use an alternate technologies.json, you can use:
Wappalyzer.update!(remote: "https://url.to/a/different.json")
You can also provide additional rules, either as a path to a local JSON file or as a JSON string:
Wappalyzer.update!(ours: "/path/to/additional/rules.json")
json = '{"technologies":{"DAN Domains":{"cats":[30],"meta":{"author":"DAN.COM"},"website":"https://dan.com","description":"DAN.COM"}}}'
Wappalyzer.update!(ours: json)
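As a rough picture of what accepting either a path or a JSON string could involve, the sketch below merges extra rules into an upstream set. The load_rules and merge_rules helpers and the upstream sample are hypothetical; how the gem actually distinguishes paths from strings and resolves conflicts is not documented here, so treat this as an assumption.

```ruby
require "json"

# Hypothetical helper: treats the argument as a file path if such a file
# exists, otherwise as a raw JSON string.
def load_rules(source)
  raw = File.file?(source) ? File.read(source) : source
  JSON.parse(raw).fetch("technologies", {})
end

# Hypothetical helper: extra rules win when a name exists in both sets.
def merge_rules(upstream, ours)
  upstream.merge(load_rules(ours))
end

# Made-up upstream sample for illustration.
upstream = { "WordPress" => { "cats" => [1], "website" => "https://wordpress.org" } }

json = '{"technologies":{"DAN Domains":{"cats":[30],"meta":{"author":"DAN.COM"},"website":"https://dan.com","description":"DAN.COM"}}}'
merged = merge_rules(upstream, json)
```

Here a simple Hash#merge lets the additional rules override upstream entries with the same name; a real implementation might prefer a deep merge instead.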
Finally, you can combine all of the options above in a single call to the update! method. Remember to pass the cache option to the analyze and data methods too if you are using a custom path.
After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install.
To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
Bug reports and pull requests are welcome on GitHub at https://github.com/nikhgupta/wappalyzer. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct.
The gem is available as open source under the terms of the MIT License.
Everyone interacting in the Wappalyzer project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.