"# Apache Tika + java-diff-utils + diff2html" Examples to follow
bitsevn/apache-tika-file-diff
Uses Apache Tika parser libraries to extract text from a variety of file formats (PDF, Excel, Word, MHTML, images, TXT, CSV, etc.) and then uses java-diff-utils to generate a unified diff between two versions of a file. This unified diff can be fed to the diff2html library to show a side-by-side diff in the browser.
Java
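
To make the middle step of the pipeline described above concrete, here is a minimal, JDK-only sketch of turning two lists of already-extracted lines into unified-diff text. It is not the repository's code and it does not call Apache Tika or java-diff-utils: the class `UnifiedDiffSketch` and the method `unifiedDiff` are hypothetical names, the diff is computed with a plain longest-common-subsequence table standing in for the library, and the whole file is emitted as a single hunk with full context rather than as several short hunks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: a stand-in for the diff step, written
// against the JDK alone. Text extraction is assumed to have happened already.
class UnifiedDiffSketch {

    /** Returns unified-diff lines for two line lists, or an empty list if they are equal. */
    public static List<String> unifiedDiff(String oldName, String newName,
                                           List<String> a, List<String> b) {
        List<String> out = new ArrayList<>();
        if (a.equals(b)) {
            return out; // identical inputs: nothing to report
        }
        int n = a.size();
        int m = b.size();

        // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = a.get(i).equals(b.get(j))
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        // File headers, then one hunk header that spans both files completely.
        out.add("--- " + oldName);
        out.add("+++ " + newName);
        out.add("@@ -" + range(n) + " +" + range(m) + " @@");

        // Walk the table: ' ' marks a shared line, '-' a removed one, '+' an added one.
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (a.get(i).equals(b.get(j))) {
                out.add(" " + a.get(i));
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add("-" + a.get(i++)); // on ties, emit the deletion first
            } else {
                out.add("+" + b.get(j++));
            }
        }
        while (i < n) {
            out.add("-" + a.get(i++));
        }
        while (j < m) {
            out.add("+" + b.get(j++));
        }
        return out;
    }

    // Unified-diff range notation: "0,0" for an empty side, otherwise "1,<line count>".
    private static String range(int lineCount) {
        return lineCount == 0 ? "0,0" : "1," + lineCount;
    }

    public static void main(String[] args) {
        List<String> oldLines = List.of("alpha", "beta", "gamma");
        List<String> newLines = List.of("alpha", "BETA", "gamma");
        unifiedDiff("old.txt", "new.txt", oldLines, newLines).forEach(System.out::println);
    }
}
```

The joined output (one line per list element) is the kind of unified-diff string that a renderer such as diff2html takes as input. In a real project the LCS routine would normally be replaced by the diff library, which also splits changes into hunks with limited context.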