gavinr/query-all-features

Writing large results to file

Closed this issue · 6 comments

Thank you for writing this! This is great. I've been using it for a few datasets and it works great.

However, I'm now trying to query all of a large dataset (tax parcels): https://map9.incog.org/arcgis9wa/rest/services/Parcels_TulsaCo/FeatureServer/1.

There are several hundred thousand features in this.

I'm wondering what might be a good way to handle this? Maybe the easiest would be an optional callback where a user could intercept the new page of results and incrementally write to a file on their own.

There are also streams, but I'm not as familiar with those.

Open to trying out some approaches but wanted to get your thoughts on this!
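The callback idea above could look roughly like the following. This is a minimal sketch and not part of query-all-features: `queryPages`, `writeFeatures`, and `saveLayerAsNdjson` are hypothetical names, and the sketch assumes the layer supports pagination through the ArcGIS REST `resultOffset` / `resultRecordCount` parameters and reports `exceededTransferLimit` when more pages remain. Each page is handed to the callback and then dropped, so memory use stays at roughly one page.

```javascript
const fs = require('fs');
const { once } = require('events');

// Hypothetical helper: pages through a feature layer and passes each page
// of features to `onPage` instead of accumulating them in one array.
// `fetchFn` is injectable so the paging logic can be exercised offline.
async function queryPages(layerUrl, onPage, { pageSize = 2000, fetchFn = globalThis.fetch } = {}) {
  let offset = 0;
  let total = 0;
  while (true) {
    const params = new URLSearchParams({
      where: '1=1',
      outFields: '*',
      f: 'json',
      resultOffset: String(offset),
      resultRecordCount: String(pageSize),
    });
    const response = await fetchFn(`${layerUrl}/query?${params}`);
    const page = await response.json();
    // ArcGIS servers can report errors inside a 200 response body.
    if (page.error) throw new Error(page.error.message || 'query failed');
    const features = page.features || [];
    if (features.length > 0) await onPage(features);
    total += features.length;
    offset += features.length;
    // Stop when the server says nothing is left, or a page comes back empty.
    if (!page.exceededTransferLimit || features.length === 0) break;
  }
  return total;
}

// Writes one feature per line (newline-delimited JSON), waiting for 'drain'
// whenever the stream's internal buffer is full.
async function writeFeatures(stream, features) {
  for (const feature of features) {
    if (!stream.write(JSON.stringify(feature) + '\n')) {
      await once(stream, 'drain');
    }
  }
}

// Ties the two together: every page goes straight to disk.
async function saveLayerAsNdjson(layerUrl, filePath) {
  const out = fs.createWriteStream(filePath);
  const total = await queryPages(layerUrl, (features) => writeFeatures(out, features));
  out.end();
  await once(out, 'finish');
  return total;
}
```

Something like `saveLayerAsNdjson(layerUrl, './parcels.ndjson')` would then write the layer one page at a time. The default `fetchFn` relies on the global `fetch` that ships with Node 18 and later; on older versions a fetch implementation would have to be passed in.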

Thanks for the comment/question! For the use case of using this large data set, a few questions:

  1. Are you in a browser or using NodeJS? I assume NodeJS since you mention "write to a file"
  2. What happens when you try to use the library with this large service? Does the browser or NodeJS just crash (out of memory)?

Hi!

  1. yup, node!
  2. yup, it crashes (out of memory). Sorry, I should've mentioned the actual problem!

Matt

Okay maybe it's a little simpler. This seems to work for my purposes:

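const fs = require('fs');
const JsonStreamStringify = require('json-stream-stringify');

// `data` is assumed to be the query result already held in memory.
// Stream it to disk piece by piece instead of calling JSON.stringify on it.
const writeStream = fs.createWriteStream('./object.json', { flags: 'w' });
const jsonStream = new JsonStreamStringify(data);

jsonStream.pipe(writeStream);
jsonStream.on('end', () => console.log('done'));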

Not sure whether it warrants a new feature but could be nice..

https://stackoverflow.com/questions/65385002/create-big-json-object-js
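For comparison, the same "serialize a big array without building one giant string" idea can be sketched with only Node's built-in stream modules. This is not what the snippet above does internally; `jsonArrayChunks` and `writeJsonArray` are hypothetical names, and the sketch assumes the data is a plain array whose individual items are small enough to stringify one at a time.

```javascript
const fs = require('fs');
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');

// Yields a JSON array piece by piece: "[", each item (comma-separated), "]".
// Only one item is stringified at a time.
function* jsonArrayChunks(items) {
  yield '[';
  let first = true;
  for (const item of items) {
    yield (first ? '' : ',') + JSON.stringify(item);
    first = false;
  }
  yield ']';
}

// Streams the chunks to a file; pipeline() handles backpressure and
// rejects if either stream errors.
async function writeJsonArray(filePath, items) {
  await pipeline(Readable.from(jsonArrayChunks(items)), fs.createWriteStream(filePath));
}
```

Like the snippet above, this only helps with the serialization step: the array still has to fit in memory before it is written.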

I'm having better luck using ogr2ogr:

ogr2ogr filename.geojson -f GeoJSON "https://map9.incog.org/arcgis9wa/rest/services/Parcels_TulsaCo/FeatureServer/1/query?where=objectid>0&outfields=*&f=json&resultRecordCount=2000" ESRIJSON -s_srs EPSG:3857 -t_srs EPSG:4326

As far as I can tell, it includes a pagination feature: https://gdal.org/drivers/vector/esrijson.html#open-options

Of course, this requires installing the enormous gdal dep...

Will keep tinkering and seeing what I find!

I just ran https://map9.incog.org/arcgis9wa/rest/services/Parcels_TulsaCo/FeatureServer/1 using the script here: https://github.com/gavinr/query-all-features/blob/master/debug/node-test.js and it was successful. Can you please tell me more about the replication case where it's erroring for you?
