nifi-extracttext-processor

An Apache NiFi custom processor that extracts text from files using Apache Tika.

Primary language: Java. License: Apache License 2.0 (Apache-2.0).

See my article and example here:

https://community.hortonworks.com/articles/163776/parsing-any-document-with-apache-nifi-15-with-apac.html

Try this setup: https://community.hortonworks.com/storage/attachments/56409-tika.xml

https://community.hortonworks.com/articles/81694/extracttext-nifi-custom-processor-powered-by-apach.html

For the latest version see here:

https://community.hortonworks.com/articles/177370/extracting-html-from-pdf-excel-and-word-documents.html