OOM for large polygon
https://github.com/nci/gsky/blob/master/processor/tile_grpc.go#L27 sets the output channel to buffer up to 100 rasters without blocking. In most WMS settings the requested polygon is small, so there is virtually no chance that the gdal workers return anywhere near 100 rasters. In WCS settings, however, users often want to study a large region of interest. For example, a WCS request sent by @juan-guerschman covers Australia as a whole at full resolution, which is a fairly common scenario:
http://gsky ip/ows?SERVICE=WCS&service=WCS&crs=EPSG:4326&format=GeoTIFF&request=GetCoverage&height=7451&width=9580&version=1.0.0&bbox=110,-45,155,-10&coverage=global:c6:monthly_frac_cover&time=2018-03-01T00:00:00.000Z
In this scenario, Gsky fills the channel regardless of how much physical memory is available. If the server does not have enough memory, the ows process gets killed by the OOM killer.
A quick fix would be as follows: the requested width and height are known inputs, so we can size the channel as a function of total free memory divided by (width x height). If memory is too low, we reject the request.
The above fix is a greedy heuristic: it will not find the optimum in terms of concurrent processing and memory allocation, but it might be good enough in practice. This problem is essentially a packing problem (https://en.wikipedia.org/wiki/Bin_packing_problem), which is NP-hard in general. Finding a good long-term solution is left to future work.