docker-NorconexHttp

Norconex HTTP Collector with the SQL Committer

Primary language: Shell. License: Apache License 2.0 (Apache-2.0).

Includes the following JDBC drivers:

  • PostgreSQL version 42.2.5
  • MySQL version 8
  • Apache Cassandra (also works for ScyllaDB), maintained by DBSchema

The Norconex HTTP Collector is a spider/crawler written in Java. It has a fairly low footprint and is highly customizable: you can extend the provided Java classes. With the SQL Committer, the data found by the collector can be imported into any database that supports JDBC.

Feel free to fork the repository and create your own docker image from: https://github.com/rhessing/docker-NorconexHttp
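As a rough sketch of building your own image from a clone or fork, the helper below may serve as a starting point. It assumes the Dockerfile sits at the root of the repository, which is not confirmed here. The function name `build_image`, the target directory, and the image tag are all hypothetical; replace the URL with your fork's URL if you have one.

```shell
# Sketch only: assumes the Dockerfile is at the repository root.
# build_image <target-dir> <image-tag>  (both arguments are placeholders you choose)
build_image() {
  git clone https://github.com/rhessing/docker-NorconexHttp.git "$1" &&
    docker build -t "$2" "$1"
}
```

For example, `build_image ./docker-NorconexHttp norconex-http:local` would clone into `./docker-NorconexHttp` and tag the result `norconex-http:local`.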

This repo is used by Docker Hub for automated builds.

The default configuration is generated on the fly when the container starts, unless the default.xml configuration file already exists.

When using this Docker image, mount your own classes directory on top of /opt/collector-http/classes. To change the settings, mount your own configuration file on top of /opt/collector-http/config/default.xml.
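The two mounts described above could be expressed with bind mounts roughly as follows. This is a minimal sketch: the function name `run_collector` and the default image tag `norconex-http:local` are hypothetical, so substitute the image you actually pulled or built. Only the two container paths come from this README.

```shell
# Hypothetical image tag; override by setting IMAGE before sourcing this snippet.
IMAGE="${IMAGE:-norconex-http:local}"

# run_collector <host-classes-dir> <host-config-file>
# Use absolute host paths: with -v, a relative name is treated as a named volume.
run_collector() {
  docker run --rm \
    -v "$1:/opt/collector-http/classes" \
    -v "$2:/opt/collector-http/config/default.xml" \
    "$IMAGE"
}
```

For example: `run_collector "$PWD/classes" "$PWD/default.xml"`. Because the mounted default.xml already exists when the container starts, the on-the-fly default configuration should not be generated.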