Maven, Java 1.7+, PostGIS
Clone the master branch of this repo
git clone git@github.com:cheetah90/wikibrain.git
export JAVA_OPTS="-d64 -Xmx16000M -server"
Make sure the host has more than 16 GB of RAM. Set -Xmx higher if you have more RAM.
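As a rough sketch of the sizing advice above, the helper below derives an -Xmx value from the amount of RAM on the host. The function name java_opts_for_ram_mb and the rule of giving the JVM 3/4 of RAM are assumptions made for illustration; they are not part of WikiBrain. Adjust the ratio if PostgreSQL runs on the same host.

```shell
# Hypothetical helper: print a JAVA_OPTS value for a host with the given RAM (in MB).
# Assumption: give the JVM 3/4 of RAM, but never less than the 16000M used above.
java_opts_for_ram_mb() {
    total_mb=$1
    heap_mb=$(( total_mb * 3 / 4 ))
    if [ "$heap_mb" -lt 16000 ]; then
        heap_mb=16000
    fi
    printf '%s\n' "-d64 -Xmx${heap_mb}M -server"
}

# Example: a host with 64 GB (65536 MB) of RAM.
export JAVA_OPTS="$(java_opts_for_ram_mb 65536)"
```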
Edit wikibrain-core/src/main/resources/reference.conf
dao:dataSource:default: psql
dao:dataSource:psql: put in the username and password for postgres
spatial:dao:dataSource:default: postgis
spatial:dao:dataSource:postgis: put in the username and password for postgres
At the project root (/wikibrain) run mvn -f wikibrain-utils/pom.xml clean compile exec:java -Dexec.mainClass=org.wikibrain.utils.ResourceInstaller
Tune postgresql.conf following https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server, for example:
listen_addresses = '*'
max_connections = 500 # Must be at least 300
shared_buffers = 48GB # Should be 1/4 of system memory
effective_cache_size = 96GB # Should be 1/2 of system memory
fsync = off
synchronous_commit = off
checkpoint_segments = 256 # PostgreSQL 9.4 and earlier; replaced by max_wal_size in 9.5+
checkpoint_completion_target = 0.9
autovacuum = off
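The two memory settings above follow simple ratios (1/4 and 1/2 of system memory), so they can be recomputed for a different host. A minimal sketch, assuming RAM is given in whole gigabytes; the function name pg_memory_settings is hypothetical:

```shell
# Hypothetical helper: print the two memory settings for a host with the given RAM in GB.
# shared_buffers is 1/4 of system memory, effective_cache_size is 1/2 (ratios from the notes above).
pg_memory_settings() {
    ram_gb=$1
    printf 'shared_buffers = %sGB\n' "$(( ram_gb / 4 ))"
    printf 'effective_cache_size = %sGB\n' "$(( ram_gb / 2 ))"
}

# Example: the values listed above correspond to a host with 192 GB of RAM.
pg_memory_settings 192
```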
./wb-java.sh org.wikibrain.Loader -l en -s wikidata -s spatial
(Running only the command above gets Atlasify running, but with limited functionality. Good for a feasibility test.)
./wb-java.sh org.wikibrain.Loader -l en -s wikidata -s spatial -s sr
./wb-java.sh org.wikibrain.sr.SRBuilder -l simple -m ensemble -o both
If an SSL certificate error occurs, add the certificate from dumps.wikimedia.org to the Java keystore.
To download the cert:
echo -n | openssl s_client -connect dumps.wikimedia.org:443 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > ~/dumpswikimedia.cert
To add it to the Java cacerts, first locate the cacerts file and then, in that directory, add the downloaded cert to it:
keytool -keystore cacerts -importcert -alias dumpswikimedia -file [dumpswikimedia.cert]
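Locating cacerts depends on the Java layout: JDK 8 and earlier keep it under jre/lib/security, while Java 9 and later (and standalone JREs) keep it under lib/security. A minimal sketch that checks both places; the function name find_cacerts is hypothetical, and it assumes you pass it the Java home directory:

```shell
# Hypothetical helper: print the path of the cacerts file under a given Java home.
# Checks lib/security first (Java 9+ and standalone JREs), then jre/lib/security (JDK 8 and earlier).
find_cacerts() {
    java_home=$1
    for candidate in "$java_home/lib/security/cacerts" "$java_home/jre/lib/security/cacerts"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example usage (assumes JAVA_HOME is set), left commented out so nothing is modified:
# cd "$(dirname "$(find_cacerts "$JAVA_HOME")")"
```

From that directory, the keytool command above can be run as written.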
Edit atlasify/src/main/java/org/wikibrain/atlasify/AtlasifyLauncher.java. Set externalURL, portNo, and helloWorldUrl according to the host's information. These are the URL and port number for the Wikibrain backend. Wikibrain needs its own port, so make sure this port is open through the firewall.
Host https://github.com/cheetah90/Atlasify with your favorite HTTP server (e.g. Apache)
Change the baseURL and featureArticleURL in atlasify.js based on the host info. Minimally, you just need to change the server name.
Run ./wb-java.sh org.wikibrain.atlasify.AtlasifyLauncher
Open index.html to check that everything works. Note: run a query first, and the back end will then start loading. Wait until loading finishes before trying another query.
Option 1 (recommended): Local database
Ingest the Simple English edition of Wikipedia to your local database
Option 2: SSH tunneling plus local copy of the intermediary files
Open port 5432 on the server to receive all requests, and copy the wikibrain root folder to your local machine.
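For the SSH tunneling part of Option 2, one possible approach is to forward a local port to PostgreSQL on the server with ssh -L. The sketch below only prints the command rather than running it; the function name tunnel_command and the user@server placeholder are assumptions, and it presumes PostgreSQL listens on port 5432 on the server.

```shell
# Hypothetical helper: print (not run) an ssh command that forwards a local port
# to PostgreSQL (port 5432) on the server. -N opens the tunnel without a remote shell.
tunnel_command() {
    remote=$1
    local_port=${2:-5432}
    printf 'ssh -N -L %s:localhost:5432 %s\n' "$local_port" "$remote"
}

# Example: "user@server" is a placeholder for your own login and host.
tunnel_command user@server
```

While such a tunnel is open, a local database client can connect to localhost on the chosen local port as if the database were local.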
Issue 1: java.lang.NoClassDefFoundError
In IntelliJ, go to File->Project Structure->Modules->Dependencies. In the Dependencies tab, change the “scope” from “Provided” to “Compile”.
On the server, git clone the front end repo and change the js/atlasify.js file. Copy the repo to /var/www/html
On the server, git pull in the wikibrain directory.