pnorman/ogr2osm

Memory error

devemux86 opened this issue · 4 comments

Hi Paul,

I tried ogr2osm with a large shp file of a country's detailed road network, and Python stops at 'Parsing data' with a MemoryError.
The shp file is 180MB and the dbf is 1.4GB, with over 1.1 million records.
I use Python 2.7.3 32-bit on Win7 64-bit.

I only managed to produce the osm file by using the Java shp-to-osm (https://github.com/iandees/shp-to-osm), which produces many osm files and has an option for the maximum nodes per osm file.

At http://wiki.openstreetmap.org/wiki/Ogr2osm I read "In March 2010 Ivansanchez was reported to be working on a revamped version of ogr2osm that would be much slower (10x) but would hold all the data in a SQLite database instead of in memory. pnorman's ogr2osm will work on very large files, given enough ram."

Is there actually a version that works with very large files?
What are the file size limits of your version, and is there any way to process large files?

Thanks!

Does it give an error message with a line number?

ogr2osm will happily output files larger than the available physical memory without slowing down from swap. I regularly produce 25GB+ .osm files with it with 16GB of physical memory. If you're prepared to use swap there is no known limit to the size of files it will handle, but it will get slower.

How much physical memory do you have and what size is your Windows pagefile?

You can also try --no-memory-copy which will free up a bit of memory at the cost of less flexibility with translations.

Keep in mind what you're going to do with this file - if you generate a multi-gigabyte OSM file, you won't be able to use it with any of the normal tools.

The error message is:

Traceback (most recent call last):
  File "d:/Programs/ogr2osm/ogr2osm.py", line 653, in <module>
    parseData(data)
  File "d:/Programs/ogr2osm/ogr2osm.py", line 352, in parseData
    parseLayer(translations.filterLayer(layer))
  File "d:/Programs/ogr2osm/ogr2osm.py", line 413, in parseLayer
    parseFeature(translations.filterFeature(ogrfeature, fieldNames, reproject), fieldNames, reproject)
  File "d:/Programs/ogr2osm/ogr2osm.py", line 429, in parseFeature
    feature = Feature()
  File "d:/Programs/ogr2osm/ogr2osm.py", line 325, in __init__
    features.append(self)
MemoryError

I have 8GB of physical memory on Win7 64-bit and a 12GB pagefile.
I will try with Python 64 bit.

Did you try 64 bit?

The biggest reason for ogr2osm's memory usage is the need to combine duplicate nodes. This requires staging all the ways, nodes, and relations in memory, processing them, then serializing them to XML.
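The deduplication step described above can be sketched roughly like this. This is a minimal illustration of the technique, not ogr2osm's actual code; the function and variable names are made up for the example:

```python
# Sketch of coordinate-based node deduplication -- the step that forces
# all nodes to be held in memory at once. Illustrative only; names are
# not taken from ogr2osm itself.

def dedupe_nodes(coords, precision=7):
    """Map each (lon, lat) pair to a shared node id, merging duplicates.

    Rounding to a fixed precision makes nearly-identical coordinates
    compare equal, so shared endpoints collapse into one node.
    """
    node_ids = {}   # rounded (lon, lat) -> node id; this dict is the memory cost
    assigned = []   # node id assigned to each input coordinate, in order
    for lon, lat in coords:
        key = (round(lon, precision), round(lat, precision))
        if key not in node_ids:
            node_ids[key] = len(node_ids) + 1
        assigned.append(node_ids[key])
    return assigned, node_ids

coords = [(13.3888, 52.5170), (13.3888, 52.5170), (13.4050, 52.5200)]
assigned, table = dedupe_nodes(coords)
print(assigned)     # [1, 1, 2] -- the two identical coordinates share one id
print(len(table))   # 2 unique nodes kept in memory
```

The lookup table grows with the number of unique nodes, which is why a million-record shapefile can exhaust a 32-bit Python process.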

Given that ogr2osm can generate XML files 50% larger than physical memory, and these files are already larger than most software that would sensibly consume them can handle, I don't want to switch to a database-backed node store, since it would drastically increase the run time for large files.
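For comparison, the database-backed node store being decided against here could look roughly like the following. This is a hedged sketch using Python's standard-library sqlite3 module, not an actual ogr2osm feature; a UNIQUE constraint does the duplicate merging on disk instead of an in-memory dict, trading speed for bounded RAM:

```python
import sqlite3

# Sketch of a SQLite-backed node store. Duplicate coordinates are merged
# by the UNIQUE constraint rather than a Python dict, so memory stays
# bounded but every node costs a database round trip. Illustrative only.

def make_store(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE nodes ("
        " id INTEGER PRIMARY KEY,"
        " lon REAL, lat REAL,"
        " UNIQUE (lon, lat))"
    )
    return db

def add_node(db, lon, lat):
    """Insert a node if unseen; either way, return its id."""
    db.execute("INSERT OR IGNORE INTO nodes (lon, lat) VALUES (?, ?)",
               (lon, lat))
    (node_id,) = db.execute(
        "SELECT id FROM nodes WHERE lon = ? AND lat = ?", (lon, lat)
    ).fetchone()
    return node_id

db = make_store()
a = add_node(db, 13.3888, 52.5170)
b = add_node(db, 13.3888, 52.5170)  # duplicate: the same id comes back
c = add_node(db, 13.4050, 52.5200)
print(a == b, a != c)  # True True
```

With an on-disk path instead of ":memory:" the store scales past RAM, which is exactly the slower-but-bounded tradeoff described in the wiki quote above.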

Hi Paul,

Yes, I tried Python 64-bit; while it was better, it is still very slow and eventually stops with an out-of-memory error.
So for now, with large shp files (like country road networks), I use the Java shp-to-osm, which is very fast and seems to parse/load the shp file feature by feature with the GeoTools library.
For the record, I process the produced osm files with the osmosis tool in order to use them with the mapsforge Android library.

Thanks