text-mining

A Java-based library that extracts text from Microsoft Word for Windows binary documents, including Word 1.0/2.0/4.0/6.0/95/97/2000/XP/2003. It also extracts text from fast-saved files.

Primary language: Java. License: GNU Lesser General Public License v2.1 (LGPL-2.1).
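Because the library targets binary Word documents, it can be useful to check a file before attempting extraction. The following is a minimal sketch, not part of text-mining: the class `Ole2Sniffer` and its methods are hypothetical names introduced for illustration. It tests for the 8-byte signature of the OLE2 compound file container that Word 6.0 through 2003 documents are stored in. Very old formats such as Word 1.0/2.0 predate that container, and other Office binary files share the same signature, so treat the result as a hint rather than a definitive answer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper (not part of text-mining): detects the OLE2 container signature. */
final class Ole2Sniffer {

    // The 8-byte signature that opens every OLE2 compound file.
    private static final byte[] OLE2_SIGNATURE = {
        (byte) 0xD0, (byte) 0xCF, (byte) 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, (byte) 0x1A, (byte) 0xE1
    };

    private Ole2Sniffer() {
    }

    /** Returns true if the given bytes start with the OLE2 signature. */
    static boolean hasOle2Signature(byte[] data) {
        if (data == null || data.length < OLE2_SIGNATURE.length) {
            return false;
        }
        byte[] prefix = Arrays.copyOf(data, OLE2_SIGNATURE.length);
        return Arrays.equals(prefix, OLE2_SIGNATURE);
    }

    /** Reads the first 8 bytes of the file and checks them against the signature. */
    static boolean hasOle2Signature(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] header = new byte[OLE2_SIGNATURE.length];
            int read = 0;
            // A single read() may return fewer bytes than requested, so loop.
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) {
                    break; // end of file reached before 8 bytes were read
                }
                read += n;
            }
            return read == header.length && hasOle2Signature(header);
        }
    }
}
```

A caller could run `Ole2Sniffer.hasOle2Signature(path)` first and pass only matching files on to the extractor, falling back to a direct extraction attempt for files that might be in a pre-OLE2 Word format. The sketch sticks to Java 8 APIs to match the project's stated requirement.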

Initially imported from: https://code.google.com/archive/p/text-mining/source/default/source

This version has the following improvements over the legacy project:

  • compatible with Apache POI (version 3.17)
  • mavenized project
  • requires Java 8
  • use of generics

This version is provided AS IS and is NOT actively maintained by Jalios.