ONSdigital/address-index-data

java.lang.OutOfMemoryError: Java heap space

Closed this issue · 2 comments

I'm running it from within IntelliJ IDEA, using real data.
The data is in CSV format and the file is larger than 8 GB.
I get java.lang.OutOfMemoryError: Java heap space.
How can I resolve it?

Likely your setup can't handle the load.
Try a smaller sample of test data.
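
One way to cut a smaller sample without reading the whole file into memory is sketched below in Python. It is only an illustration: `write_sample`, the file names and the row limit are hypothetical, and it assumes the first line of the CSV is a header and that the file is UTF-8 encoded. Rows are streamed one at a time, so memory use stays small regardless of file size.

```python
import csv
from itertools import islice


def write_sample(src_path, dst_path, max_rows=100_000):
    """Copy the header and at most max_rows data rows from src_path to dst_path.

    Returns the number of data rows written (the header is not counted).
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)

        # Assumption: the first row is a header that the job expects to see.
        header = next(reader, None)
        if header is None:
            return 0  # empty input, nothing to copy
        writer.writerow(header)

        written = 0
        # islice stops after max_rows rows; the rest of the file is never read.
        for row in islice(reader, max_rows):
            writer.writerow(row)
            written += 1
        return written
```

For example, `write_sample("full.csv", "sample.csv", max_rows=50_000)` would write the header plus the first 50,000 rows. The output is re-serialised by the `csv` module, so quoting and line endings may differ from the input even though the values are the same. Taking the first rows is not a random sample; if the file is sorted, the sample may not be representative.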

It may not be possible to run with the full ABP data on a developer machine. Our live Spark job runs on a large Cloudera cluster and it gobbles up 8 executors, each with 5 CPU cores and 8 GB of RAM. A cloud-based alternative would be something like Dataproc: https://cloud.google.com/dataproc
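
For a rough sense of scale, the executor figures quoted above can be added up as in the Python sketch below. The dictionary keys are standard Spark property names, but the actual job configuration is not shown in this thread, so treat the mapping as an assumption; the totals also leave out the driver and any per-executor memory overhead.

```python
# Executor figures as quoted in the comment above (not read from the real job config).
executor_settings = {
    "spark.executor.instances": 8,   # number of executors
    "spark.executor.cores": 5,       # CPU cores per executor
    "spark.executor.memory_gb": 8,   # hypothetical key: heap per executor, in GB
}


def cluster_totals(settings):
    """Return (total_cores, total_memory_gb) across all executors."""
    executors = settings["spark.executor.instances"]
    total_cores = executors * settings["spark.executor.cores"]
    total_memory_gb = executors * settings["spark.executor.memory_gb"]
    return total_cores, total_memory_gb


total_cores, total_memory_gb = cluster_totals(executor_settings)
print(f"{total_cores} cores, {total_memory_gb} GB executor memory")
```

That comes to 40 cores and 64 GB of executor memory in total, which is well beyond what a typical developer machine can give to a single local run.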