RMLio/rmlmapper-java

Memory usage comparison between 6.2.1 and 6.0.0

nicolastoira opened this issue · 2 comments

I'm using the rmlmapper to map JSON data to RDF turtle data. I do not use any specific option except for the mapping file, output file, and serialization format.

I recently moved from version 6.0.0 to version 6.2.1 and noticed that the latest version has much higher memory usage than the older one. I tried with a 1.2GB JSON file, and with the latest version the process fails quite quickly with a "Killed" message due to running out of memory.

As far as you know, are there any new features or code paths that consume more memory in the latest version than in the older one? Do you have any recommendations regarding maximum input data size? It seems that in version 6.2.1 the ingested data is loaded into memory multiple times. Is there anything I can manually disable for my specific use case of JSON to RDF conversion?

Let me know if you have any recommendations regarding this issue. Thank you.

We upgraded several libraries between these versions; that is the most significant change between them.

One path forward would be to analyze the memory usage with a profiler, using the data you have.
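As a rough starting point before reaching for a full profiler, peak heap usage can be compared between the two versions from inside the JVM. The sketch below uses only the standard `java.lang.management` API. The class name `HeapProbe` and the allocation workload in `main` are hypothetical placeholders, not part of rmlmapper; the actual mapping call would have to be substituted by whoever runs it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not part of rmlmapper): reports approximate peak heap usage.
final class HeapProbe {

    // Sums the peak usage of all heap memory pools, in bytes.
    // Pools may peak at different moments, so this is an upper-bound estimate.
    static long peakHeapBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage peak = pool.getPeakUsage(); // null if the pool is no longer valid
            if (peak != null) {
                total += peak.getUsed();
            }
        }
        return total;
    }

    // Resets the recorded peak of every pool to its current usage.
    static void resetPeaks() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            pool.resetPeakUsage();
        }
    }

    // Runs the workload and returns the approximate peak heap usage seen while it ran.
    static long measurePeak(Runnable workload) {
        resetPeaks();
        workload.run();
        return peakHeapBytes();
    }

    public static void main(String[] args) {
        // Placeholder workload: allocate 64 blocks of 1 MiB each.
        // Replace this with the mapping run you want to measure.
        long peak = measurePeak(() -> {
            byte[][] chunks = new byte[64][];
            for (int i = 0; i < chunks.length; i++) {
                chunks[i] = new byte[1024 * 1024];
            }
        });
        System.out.printf("approximate peak heap: %d MiB%n", peak / (1024 * 1024));
    }
}
```

Running the same input against both versions and comparing the reported figures would show whether the regression is in heap usage at all. Since the process is reported as "Killed" rather than failing with a Java `OutOfMemoryError`, it may also be worth capping the heap with the standard `-Xmx` JVM option and adding `-XX:+HeapDumpOnOutOfMemoryError`, so that the JVM fails with a heap dump that a profiler can open instead of being terminated by the operating system.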

Since there was no response in the last 9 months on this issue, I will close it.
Please re-open or comment on this issue if it needs to be re-opened.