New APIs to be added.
Edu4rdSHL opened this issue · 9 comments
Dear users, please comment with APIs that you think should be added to findomain; it will help me a lot to improve the tool.
Note: what makes findomain unique is that it only uses APIs and doesn't do searches in Google, etc.; that's the secret of why it's so fast. I have no plans to add that to findomain, so please suggest only APIs (POST or GET) here, even if they're not directly related to Certificate Transparency logs but can be used to discover subdomains.
The following APIs are already implemented; please make sure the API you want is not in this list:
Pull requests are more than welcome.
- BinaryEdge
- AlienVault
- WaybackMachine
- CommonCrawl
- PassiveTotal
- ThreatCrowd
- GoogleCT
- Riddler
- Censys
- HackerTarget
- ArchiveToday
- ArchiveIt
Hey @dtnml, thanks a lot! Can you give reference links to the API documentation? That will make it easy for me to just read the docs and start working on the implementations instead of having to look for them.
Hi, Report URI just announced Certificate Transparency monitoring as well.
It might be worth taking a look to see how it can be added here.
https://scotthelme.co.uk/announcing-ct-monitoring-for-report-uri/
- Shodan
  - https://developer.shodan.io/
  - https://developer.shodan.io/api
  - https://github.com/achillean/shodan-python
  - https://github.com/random-robbie/My-Shodan-Scripts
- SecurityTrails
  - https://securitytrails.com/corp/api
  - https://docs.securitytrails.com/docs
  - https://github.com/secops4thewin/securitytrails-python
  - https://github.com/dev-cyprium/ExTrails
  - https://github.com/FuckAllWorld/securitytrails
- AnubisDB
  - https://jonlu.ca/anubis/subdomains/google.com (implemented in e3f052c)
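To illustrate the Shodan suggestion, here is a minimal sketch of querying its domain-information endpoint, assuming the `/dns/domain/{domain}` route from the docs above and a valid API key (`SHODAN_API_KEY` is a placeholder):

```python
import requests

SHODAN_API_KEY = "YOUR_API_KEY"  # placeholder; get a real key from your Shodan account

def shodan_subdomains(domain):
    # The endpoint is assumed to return JSON with a "subdomains" list of
    # labels, e.g. ["www", "mail"], which we join back onto the base domain.
    url = f"https://api.shodan.io/dns/domain/{domain}"
    data = requests.get(url, params={"key": SHODAN_API_KEY}, timeout=10).json()
    return [f"{label}.{domain}" for label in data.get("subdomains", [])]

print(shodan_subdomains("example.com"))
```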
Hi @Edu4rdSHL,
Add Support for: threatcrowd.org
https://www.threatcrowd.org/searchApi/v2/domain/report/?domain=google.com
Add Support for: certificatedetails.com
https://certificatedetails.com/api/list/google.com
Add support for: transparencyreport.google.com
https://transparencyreport.google.com/transparencyreport/api/v3/httpsreport/ct/certsearch/page?p=google.com
https://transparencyreport.google.com/transparencyreport/api/v3/httpsreport/ct/certsearch?include_expired=true&include_subdomains=true&domain=google.com
https://www.google.com/transparencyreport/api/v3/httpsreport/ct/certsearch?domain=google.com
Add Support for: github.com
https://api.github.com/search/repositories?q=google.com
https://gist.github.com/search?utf8=%E2%9C%93&q=google.com
Add Support for: netcraft.com
https://searchdns.netcraft.com/?restriction=site+ends+with&host=google.com
```python
# Excerpted from Sublist3r; depends on its enumratorBaseThreaded base class
# and the R/W terminal colour constants defined in that project.
import re
import hashlib
import threading
from urllib.parse import unquote, urlparse


class NetcraftEnum(enumratorBaseThreaded):
    def __init__(self, domain, subdomains=None, q=None, silent=False, verbose=True):
        subdomains = subdomains or []
        self.base_url = 'https://searchdns.netcraft.com/?restriction=site+ends+with&host={domain}'
        self.engine_name = "Netcraft"
        self.lock = threading.Lock()
        super(NetcraftEnum, self).__init__(self.base_url, self.engine_name, domain, subdomains, q=q, silent=silent, verbose=verbose)
        self.q = q

    def req(self, url, cookies=None):
        cookies = cookies or {}
        try:
            resp = self.session.get(url, headers=self.headers, timeout=self.timeout, cookies=cookies)
        except Exception as e:
            self.print_(e)
            resp = None
        return resp

    def get_next(self, resp):
        # Follow the "Next page" link to paginate through the results.
        link_regx = re.compile('<A href="(.*?)"><b>Next page</b></a>')
        link = link_regx.findall(resp)
        link = re.sub('host=.*?%s' % self.domain, 'host=%s' % self.domain, link[0])
        url = 'http://searchdns.netcraft.com' + link
        return url

    def create_cookies(self, cookie):
        # Netcraft sets a JS-verification challenge cookie; the expected
        # response is the SHA-1 of the unquoted challenge value.
        cookies = dict()
        cookies_list = cookie[0:cookie.find(';')].split("=")
        cookies[cookies_list[0]] = cookies_list[1]
        # hashlib.sha1 requires a utf-8 encoded str
        cookies['netcraft_js_verification_response'] = hashlib.sha1(unquote(cookies_list[1]).encode('utf-8')).hexdigest()
        return cookies

    def get_cookies(self, headers):
        if 'set-cookie' in headers:
            cookies = self.create_cookies(headers['set-cookie'])
        else:
            cookies = {}
        return cookies

    def enumerate(self):
        # Hit the search page once with a throwaway domain to obtain the
        # JS-verification cookie before querying the real target.
        start_url = self.base_url.format(domain='example.com')
        resp = self.req(start_url)
        cookies = self.get_cookies(resp.headers)
        url = self.base_url.format(domain=self.domain)
        while True:
            resp = self.get_response(self.req(url, cookies))
            self.extract_domains(resp)
            if 'Next page' not in resp:
                return self.subdomains
            url = self.get_next(resp)

    def extract_domains(self, resp):
        # Pull subdomains out of the site_report links in the results page.
        links_list = list()
        link_regx = re.compile(r'<a href="http://toolbar.netcraft.com/site_report\?url=(.*)">')
        try:
            links_list = link_regx.findall(resp)
            for link in links_list:
                subdomain = urlparse(link).netloc
                if not subdomain.endswith(self.domain):
                    continue
                if subdomain and subdomain not in self.subdomains and subdomain != self.domain:
                    if self.verbose:
                        self.print_("%s%s: %s%s" % (R, self.engine_name, W, subdomain))
                    self.subdomains.append(subdomain.strip())
        except Exception:
            pass
        return links_list
```
Hello, ThreatCrowd is already implemented: https://github.com/Edu4rdSHL/findomain/blob/eda29344fe014f1a0034ededfc74b4daa941aa8e/src/lib.rs#L492-L500
certificatedetails.com doesn't provide valid JSON.
transparencyreport.google.com doesn't provide valid JSON either (the first URL does return JSON, but it says error).
GitHub and Netcraft also don't provide valid JSON.
I will ONLY add APIs that reply with a proper JSON output like https://jonlu.ca/anubis/subdomains/google.com.
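For illustration, that Anubis endpoint is straightforward to consume; a minimal sketch, assuming the response is a plain JSON array of hostnames:

```python
import requests

def anubis_subdomains(domain):
    # The endpoint is expected to return a JSON array of hostnames,
    # e.g. ["www.google.com", "mail.google.com", ...]
    resp = requests.get(f"https://jonlu.ca/anubis/subdomains/{domain}", timeout=10)
    resp.raise_for_status()
    return resp.json()

print(anubis_subdomains("google.com"))
```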
I just looked at the WaybackMachine API; it should work with the following URLs:
http://web.archive.org/cdx/search/cdx?url=example.com*&output=txt
http://web.archive.org/cdx/search/cdx?url=example.com*&output=json
http://web.archive.org/cdx/search/cdx?url=example.com*&output=txt&limit=999999
https://github.com/internetarchive/wayback/tree/master/wayback-cdx-server#filtering
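As a rough sketch of how those CDX results could be turned into subdomains, assuming the output=json format (where the first row is a header and the third column is the original archived URL):

```python
import requests
from urllib.parse import urlparse

def wayback_subdomains(domain):
    url = "http://web.archive.org/cdx/search/cdx"
    params = {"url": f"{domain}*", "output": "json", "limit": 10000}
    rows = requests.get(url, params=params, timeout=30).json()
    # Skip the header row; extract the hostname from each archived URL
    # and keep only hosts under the target domain.
    hosts = set()
    for row in rows[1:]:
        host = urlparse(row[2]).netloc.split(":")[0]
        if host.endswith(domain):
            hosts.add(host)
    return sorted(hosts)

print(wayback_subdomains("example.com"))
```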
Closing; new API requests should go in new issues for easier management.