Single-server/laptop-grade file-observatory

Primary language: Java. License: Apache License 2.0 (Apache-2.0).

File Observatory

This repo hosts backend development code that supports data ingestion into an Elasticsearch index for the SafeDocs File Observatory app.

This repo contains pre-alpha code for demonstration purposes only.

Some capabilities demonstrated within have been integrated into Apache Tika. Some have been spun off into standalone projects, e.g. commoncrawl-fetcher-lite.

Attribution

The commoncrawl-fetcher module includes code that relies on GeoLite2 data created by MaxMind, available from https://www.maxmind.com.