/Darkweb-search-engine

Dark Web & Deep Web Search Engine. Data Crawler and indexer for Darkweb , OSINT Tools for the Dark Web

Primary LanguagePythonGNU Affero General Public License v3.0AGPL-3.0

Nexvision search engine

Features

  • Crawls the darknet looking for new hidden service
  • Find hidden services from a number of clear net sources
  • Optional full-text Elasticsearch support
  • Marks clone sites of the /r/darknet super list
  • Finds SSH fingerprints across hidden services
  • Finds email addresses across hidden services
  • Finds bitcoin addresses across hidden services
  • Shows incoming / outgoing links to onion domains
  • Up-to-date alive/dead hidden service status
  • Portscanner
  • Search for "interesting" URL paths, useful 404 detection
  • Automatic language detection
  • Fuzzy clone detection (requires Elasticsearch, more advanced than super list clone detection)

Components

Elasticsearch

Elasticsearch cluster consists of 2 Elasticsearch instance for HA and load balancing. The scrapped page data is stored and searched.

Kibana

It runs on port 5601 and can be used to check the data in Elasticsearch

Web-General

The web interface for domain search engine. It runs on port 7000

MySQL

It stores the domains, page urls, bitcoin addresses, etc.

TOR Proxy

Used to access the onion pages. There are 10 proxy containers deployed and HAProxy is used to distribute the traffic.

Scraper

It gets the domain list from MySQL DB, harvest pages and new domains from onion domains through TOR proxies and stores the domains and page data in Elasticsearch and MySQL. Based on Python Scrapy framework.

Installation

Clone the project and build docker images involved in docker-compose.

docker-compose build
docker-compose up -d

Build and run the scraper.

docker build --tag scraper_crawler ./

Run the scraper.

docker run -d --name darkweb-search-engine-onion-crawler --network=darkweb-search-engine_default scraper_crawler /opt/torscraper/scripts/start_onion_scrapy.sh

After first deployment, need to initialize the indexes on Elasticsearch.

docker exec darkweb-search-engine-onion-crawler /opt/torscraper/scripts/elasticsearch_migrate.sh

Import initial domain list

docker exec darkweb-search-engine-onion-crawler /opt/torscraper/scripts/push_list.sh /opt/torscraper/onions_list/onions.txt &