This sets up your own Parlameter instance with data from the ParlaMint corpus. Once the requirements below are installed, a single command sets everything up for you.
The script automatically downloads the data for the specified country from the CLARIN repository entry "Multilingual comparable corpora of parliamentary debates ParlaMint 2.1".
After it's done (actually, even before it's done) you'll be able to access the Parlameter website at http://localhost:3066/ and the API at http://localhost:8000/. The script also sets up a PostgreSQL database and a Solr instance for you.
All of the commands run inside containers, so they will not affect your
local setup.
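Because the website and the API come up while the bootstrap is still running, it can help to poll the two URLs above until they answer. The following is a minimal sketch, not part of `bootstrap.sh`: `wait_for` is a hypothetical helper, it assumes `curl` is installed, and the attempt count and delay are arbitrary defaults.

```shell
#!/bin/sh
# Poll a URL until a server answers or the attempts run out.
# wait_for is a hypothetical helper, not something bootstrap.sh provides.
wait_for() {
    url=$1
    attempts=${2:-30}   # how many times to try (assumed default)
    delay=${3:-5}       # seconds to wait between tries (assumed default)
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -s: silent, -o /dev/null: discard the body.
        # Any HTTP response counts as "reachable", whatever its status code.
        if curl -s -o /dev/null --max-time 5 "$url" 2>/dev/null; then
            echo "up: $url"
            return 0
        fi
        i=$((i + 1))
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    echo "not reachable: $url" >&2
    return 1
}

# Usage (uncomment once the bootstrap has been started):
# wait_for http://localhost:3066/ && wait_for http://localhost:8000/
```

The helper only checks that something is listening on the port. It does not verify that the data import has finished.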
- make sure you run `git submodule update --init --recursive` to get all the code
- Linux
- Docker
- docker-compose
- an internet connection
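The requirements above can be checked before starting the bootstrap. This is a hedged sketch rather than anything the project ships: `have` and `check_requirements` are hypothetical helper names, and the check only confirms that each tool is on `PATH`, not that its version is suitable.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Print one line per tool and return the number of missing tools
# (0 means everything was found).
check_requirements() {
    missing=0
    for tool in "$@"; do
        if have "$tool"; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Usage:
# check_requirements git docker docker-compose
```

To check the submodules as well, `git submodule status --recursive` prefixes every submodule that has not been initialized with a `-`.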
Just run `bootstrap.sh` with your country code of choice as the first argument
and follow the prompts. You'll have to enter your sudo password for Docker to
be able to run its commands.
Example bootstrapping of Slovenian data
$ ./bootstrap.sh SI
To use a different lemmatizer, just override `lemmatize_many` in `parladata/parlacards/scores/common.py`.
To change the Solr configuration, just edit `parlasearch/solr/parlasearch-conf/schema.xml` and
`parlasearch/solr/parlasearch-conf/solrconfig.xml`.
To see what the script does, peek into `bootstrap.sh`. You'll find each step surrounded by explanatory
`echo` calls.
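For readers who have not opened the script yet, the pattern of wrapping each step in `echo` calls looks roughly like the sketch below. This illustrates the general technique only: `step` is a hypothetical helper and the message format is made up, so the real `bootstrap.sh` will differ in detail.

```shell
#!/bin/sh
# Run one command with an explanatory message before and after it.
# "step" is a hypothetical helper used for illustration only.
step() {
    description=$1
    shift
    echo "==> $description"
    if "$@"; then
        echo "<== done: $description"
    else
        status=$?   # exit status of the command that just failed
        echo "<== failed ($status): $description" >&2
        return "$status"
    fi
}

step "Saying hello" echo hello
```

Wrapping steps this way makes it easy to tell from the terminal output which stage is running and where a failure occurred.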