/apache-tika-file-diff

Uses Apache Tika parser libraries to extract text from a variety of file formats (PDF, Excel, Word, MHTML, images, TXT, CSV, etc.) and then uses java-diff-utils to generate a unified diff between two versions of a file. The unified diff can be fed to the diff2html library to show a side-by-side diff in the browser.
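The extract-then-diff pipeline described above might look roughly like the sketch below. It is not the repository's actual code: the class name `FileDiffSketch`, the helper names `extractText` and `unifiedDiff`, and the context size of 3 are all hypothetical, and it assumes a recent java-diff-utils 4.x (where `DiffUtils.diff` does not throw a checked exception) plus the Tika facade and parser modules on the classpath.

```java
import com.github.difflib.DiffUtils;
import com.github.difflib.UnifiedDiffUtils;
import com.github.difflib.patch.Patch;
import org.apache.tika.Tika;

import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names here are illustrative, not taken from the repository.
public class FileDiffSketch {

    // Let Tika detect the file type and return its plain-text content.
    static String extractText(Path file) throws Exception {
        Tika tika = new Tika();
        // The facade truncates output at 100,000 characters by default; -1 lifts the limit.
        tika.setMaxStringLength(-1);
        return tika.parseToString(file);
    }

    // Build unified-diff lines (---/+++ header, @@ hunks) from two extracted texts.
    static List<String> unifiedDiff(String oldName, String oldText,
                                    String newName, String newText) {
        List<String> oldLines = Arrays.asList(oldText.split("\\R"));
        List<String> newLines = Arrays.asList(newText.split("\\R"));
        Patch<String> patch = DiffUtils.diff(oldLines, newLines);
        // Assumption: 3 lines of context around each change, as in `diff -u`.
        return UnifiedDiffUtils.generateUnifiedDiff(oldName, newName, oldLines, patch, 3);
    }

    public static void main(String[] args) throws Exception {
        Path oldFile = Paths.get(args[0]);
        Path newFile = Paths.get(args[1]);
        List<String> diff = unifiedDiff(
                oldFile.toString(), extractText(oldFile),
                newFile.toString(), extractText(newFile));
        // The joined string is the text a diff2html front end would be given.
        System.out.println(String.join("\n", diff));
    }
}
```

Because the diff runs on the extracted text rather than the raw bytes, two versions of a binary format such as PDF or Word can be compared line by line. Layout-only changes that do not alter the extracted text will not appear in the output.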

Primary Language: Java
