AI apps/benchmark for legaltech
DEMO with huggingface, Jina and European Court of Justice judgments
This site contains 113 judgments of the European Court of Justice from 2019 and 2020 concerning tax issues.
All sentences from the judgments have been encoded via a BERT model (bert-base-uncased, provided by Huggingface's transformers library), an example of a very powerful NLP model that has conquered AI applications.
The search infrastructure is based on
Jina, a wonderful scalable library for building neural search engines
with recent deep learning techniques.
The entire concept, as well as Jina and Huggingface, has a great future in legal tech, because lawyers
work with large numbers of documents, and searching among them is highly challenging.
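As a rough illustration of the encoding step described above, the sketch below turns sentences into fixed-size vectors with bert-base-uncased through the transformers library. The mean pooling over token vectors and the helper names `mean_pool` and `encode_sentences` are assumptions made for this example; the pooling actually used by the demo is defined in the repository and may differ.

```python
from typing import List


def mean_pool(token_vectors: List[List[float]], mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1 (padding is skipped)."""
    kept = [vector for vector, keep in zip(token_vectors, mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    return [sum(column) / len(kept) for column in zip(*kept)]


def encode_sentences(sentences: List[str]) -> List[List[float]]:
    """Encode sentences with bert-base-uncased (hypothetical helper, not the demo's code)."""
    # Imported lazily so mean_pool stays usable without the heavy dependencies.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")
    model.eval()

    # Pad to the longest sentence so the batch forms one rectangular tensor.
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # shape: (batch, tokens, 768)

    masks = batch["attention_mask"].tolist()
    # Assumption: one vector per sentence is obtained by mean pooling.
    return [mean_pool(vectors, mask) for vectors, mask in zip(hidden.tolist(), masks)]
```

Calling `encode_sentences(["That request was rejected."])` downloads the model on first use and returns one 768-dimensional vector per sentence.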
How does it work?
- Write a phrase / sentence
- Press Enter
- You get the most similar sentences (the lower the score, the better)
Enjoy!...
... and be aware that this is a playground. Sometimes BERT doesn't give proper hits,
but sometimes analogies are pretty impressive, like:
QUERY: that complaint was rejected
RESULTS:
- That request was rejected.
- Its application was rejected, as was the objection that it subsequently lodged.
- That request was rejected.
- That is unfair and unlawful.
- That argument cannot be accepted.
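The ranking behind such a result list can be pictured as a nearest-neighbour lookup over sentence vectors, where a lower score means a closer match. The sketch below assumes cosine distance as the score; the metric the demo actually uses is not stated here, and the function names `cosine_distance` and `rank` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity, so 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cannot compare a zero vector")
    return 1.0 - dot / norm


def rank(
    query: Sequence[float],
    sentences: Sequence[str],
    vectors: Sequence[Sequence[float]],
    top_k: int = 5,
) -> List[Tuple[float, str]]:
    """Return the top_k (score, sentence) pairs, lowest score first."""
    scored = [
        (cosine_distance(query, vector), sentence)
        for sentence, vector in zip(sentences, vectors)
    ]
    return sorted(scored, key=lambda pair: pair[0])[:top_k]
```

With encoded sentences in `vectors` and an encoded query, `rank(query, sentences, vectors)` returns five hits ordered like the list above, best match first.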
I am aiming to test other approaches (such as other transformer architectures) and to fine-tune the models, in order to prepare an ultimate benchmark of AI solutions for legaltech. Stay tuned and follow me on LinkedIn, Twitter and my blog at inteliLex.
If you would like to test it on more documents and play with the code, please clone this git repository and contribute to it.
Feel free to contact me: artur.tanona@gmail.com.
The steps below show how to launch it on Ubuntu. If you are working on another system (for example Windows 10), you need a running Docker engine and can skip to the last part, "Run on Docker".
Upload documents in *.txt format to search_engine/data and frontendApp/src/assets.
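Copying the documents into both directories can be scripted. The following is a minimal sketch, assuming it is run with the repository root as `repo_root`; the function name `copy_documents` and the source folder are hypothetical, and only the two target paths come from the instructions above.

```python
import shutil
from pathlib import Path
from typing import List

# Target directories named in the instructions above.
TARGET_DIRS = ("search_engine/data", "frontendApp/src/assets")


def copy_documents(source_dir: str, repo_root: str = ".") -> List[Path]:
    """Copy every *.txt file from source_dir into both target directories."""
    copied = []
    for target in TARGET_DIRS:
        destination = Path(repo_root) / target
        destination.mkdir(parents=True, exist_ok=True)
        for document in sorted(Path(source_dir).glob("*.txt")):
            # copy2 keeps file metadata and returns the path of the new file.
            copied.append(Path(shutil.copy2(document, destination)))
    return copied
```

For example, `copy_documents("my_judgments")` copies every text file from a local `my_judgments` folder and returns the paths of the copies.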
export PARALLEL=1
export SHARDS=6
export CLIENT_PORT=80
export TMP_WORKSPACE=test_index
export JINA_PORT=56798
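For orientation, a Python process started from the same shell can pick these variables up from its environment. The sketch below is an assumption about how that could look, not the code in app.py: the function name `load_settings` is hypothetical, and the fallback values simply mirror the exports above.

```python
import os
from typing import Dict, Mapping, Optional, Union

# Fallbacks mirror the export lines above; they are not taken from app.py.
DEFAULTS = {
    "PARALLEL": "1",
    "SHARDS": "6",
    "CLIENT_PORT": "80",
    "TMP_WORKSPACE": "test_index",
    "JINA_PORT": "56798",
}
INTEGER_KEYS = ("PARALLEL", "SHARDS", "CLIENT_PORT", "JINA_PORT")


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Union[int, str]]:
    """Read the variables from env (os.environ by default) and convert numeric ones to int."""
    source = os.environ if env is None else env
    settings: Dict[str, Union[int, str]] = {}
    for key, default in DEFAULTS.items():
        raw = source.get(key, default)
        settings[key] = int(raw) if key in INTEGER_KEYS else raw
    return settings
```

Passing a plain dictionary, as in `load_settings({"SHARDS": "2"})`, makes it easy to try other values without touching the shell.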
In the search_engine directory:
python3.7 app.py -t index
gunicorn -w 1 --bind 0.0.0.0:6500 main:app
In the frontendApp directory:
npm install
ng serve
Then open the website at http://localhost:4200
Run on Docker
You can easily create Docker apps, but you need to set the proper variables in the Dockerfile of each app.