developmentseed/geolambda

ogr2ogr and other binaries not in path


I'm trying to use the public GeoLambda layer in a Node JS lambda function:
arn:aws:lambda:us-west-2:552188055668:layer:geolambda:1

In my lambda function code, I'm trying to call the ogr2ogr command line tool.

import { promisify } from 'util'
import { exec as execCallback } from 'child_process'

// Promise-based wrapper around child_process.exec
const exec = promisify(execCallback)

export const handler = async (event, _context) => {
  const ogr2ogrCmd = await exec('ogr2ogr')
  console.log('stdout:', ogr2ogrCmd.stdout)
  console.log('stderr:', ogr2ogrCmd.stderr)

  return {
    statusCode: 200,
    body: JSON.stringify(
      {
        message: ogr2ogrCmd.stdout,
        input: event,
      },
      null,
      2
    ),
  }
}

When I run the function it fails saying that ogr2ogr is not found.

Command failed: ogr2ogr
/bin/sh: ogr2ogr: command not found

It also fails when attempting other command line tools like gdalinfo, ogrinfo, etc.

Are these binaries not available by default in GeoLambda, or do I just need to specify specific paths? I've dug through the /opt/ dir and although I see many GDAL-related files, I don't see the binaries.

I'd love to help document how to do this if so.

Hi @tsemerad. For some reason I originally thought this would work, but then I realized that of course the binaries are not in the Lambda layer. Nor should they be: they would take up space, and the main use case is using the libraries within Lambdas, not the binaries.

However, this is a perfect use case for creating your own Lambda layer.

Clone this repo, then edit https://github.com/developmentseed/geolambda/blob/master/bin/package.sh so that it also copies the binaries you need into the deploy directory.

Now you have two choices: you can either deploy this as a Lambda layer on your AWS account and reference it from your Lambda, or you can put the resulting package files into your own Lambda deploy package and skip creating a layer.

If you want to create a layer, follow the instructions under 'Develop' in the README. You will just need to change the names to match your own layer.

If you want to use a single Lambda instead, you will just need to add your Node handler to the zip file that is created from the deploy directory when you package it.

Good luck, post back here if you run into any problems or have additional questions.

Thanks so much @matthewhanson. I made some pretty good progress, but I still have some problems. I added the following lines to package.sh:

# copy gdal binaries over
rsync -ax $PREFIX/bin/gdal* $DEPLOY_DIR/bin/
rsync -ax $PREFIX/bin/ogr* $DEPLOY_DIR/bin/

My Lambda function now sees ogr2ogr, but it's missing some required libraries. It fails with:

ogr2ogr: error while loading shared libraries: libsqlite3.so.0: cannot open shared object file: No such file or directory

So I added the following to package.sh:

# needed by ogr2ogr
rsync -ax /usr/lib64/libsqlite3* $DEPLOY_DIR/lib/

That fixes the libsqlite3.so.0 error, but then it fails similarly with libexpat.so.1. So I added the following line:

rsync -ax /lib64/libexpat.so.* $DEPLOY_DIR/lib/

Which results in it failing similarly with libtiff.so.5.
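
One way to see every unresolved library in a single pass, instead of one failure per run, is to ask ldd which entries it reports as "not found". The sketch below assumes it is run in an environment that mirrors the Lambda runtime, with the deploy directory's lib folder on the loader path; `missing_libs` is a hypothetical helper name, not something provided by GeoLambda.

```shell
# missing_libs is a hypothetical helper, not part of GeoLambda.
# Prints the name of every shared library that ldd cannot resolve
# for the given binary.
missing_libs() {
  # Unresolved entries look like: "libtiff.so.5 => not found"
  ldd "$1" | awk '$2 == "=>" && $3 == "not" {print $1}'
}

# Example (binary path taken from later in this thread):
# missing_libs /usr/local/bin/ogr2ogr
```

An empty result means the loader can resolve everything the binary links against in that environment.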

I might continue down this rabbit hole, but it feels a little brittle, since rather than pulling from the $PREFIX dir I'm pulling from a variety of system dirs.

Do you know of a better way to copy over all libs needed by the GDAL binaries?

I found a way to do it using the ldd command:

# copy all libs needed by ogr2ogr binary
ldd /usr/local/bin/ogr2ogr | grep "=> /" | awk '{print $3}' | xargs -I '{}' cp -v '{}' $DEPLOY_DIR/lib/

I'll go with this method unless you know of a better one.
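
The same idea can be extended to several binaries at once. This is only a sketch: `copy_deps` is a hypothetical helper, and the example call assumes the `$PREFIX` and `$DEPLOY_DIR` variables used in the package.sh lines above.

```shell
# copy_deps is a hypothetical helper, not part of GeoLambda's package.sh.
# Usage: copy_deps DEST_DIR BINARY [BINARY...]
# Copies every shared library the binaries link against into DEST_DIR.
copy_deps() {
  local dest="$1"
  shift
  mkdir -p "$dest"
  local bin
  for bin in "$@"; do
    # Resolved entries look like "libfoo.so.1 => /usr/lib64/libfoo.so.1 (0x...)".
    # stderr is discarded so scripts (not dynamic executables) are skipped quietly.
    ldd "$bin" 2>/dev/null | awk '$2 == "=>" && $3 ~ /^\// {print $3}'
  done | sort -u | while read -r lib; do
    # -L copies the file a symlink points to, stored under the name ldd reported.
    cp -Lv "$lib" "$dest/"
  done
}

# Example call (variables assumed from package.sh):
# copy_deps "$DEPLOY_DIR/lib" "$PREFIX"/bin/gdal* "$PREFIX"/bin/ogr*
```

Deduplicating with `sort -u` avoids copying a library once per binary that uses it.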

Great, glad you got it working.

Using ldd is a good way to do it. I ended up doing it by trial and error because I didn't want to copy over system libraries that are already in the Lambda runtime container, only what was missing. But if you copy all of those over and are still within the deploy size limits, that's fine too. There is some overhead with cold-starting a Lambda that goes up with the package size, but I think most of the extra libraries you may be copying (unless you pruned out the common ones in some way) are pretty small.
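
A quick size check on the deploy directory can confirm that the extra libraries still fit. The sketch below assumes the 250 MB unzipped limit that AWS documents for a function together with its layers; `check_deploy_size` is a hypothetical helper name.

```shell
# check_deploy_size is a hypothetical helper, not part of GeoLambda.
# Usage: check_deploy_size DIR [LIMIT_MB]
# Succeeds if DIR is at or under LIMIT_MB (default 250, the assumed
# unzipped limit for a Lambda function plus its layers).
check_deploy_size() {
  local dir="$1"
  local limit_mb="${2:-250}"
  local size_mb
  # du -sm reports the total size of the directory in megabytes.
  size_mb=$(du -sm "$dir" | awk '{print $1}')
  echo "deploy dir: ${size_mb} MB (limit ${limit_mb} MB)"
  [ "$size_mb" -le "$limit_mb" ]
}

# Example call (variable assumed from package.sh):
# check_deploy_size "$DEPLOY_DIR" || echo "package too large"
```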

With regard to libtiff, I thought that would already be included, as it's needed by libgdal. But perhaps libgdal is using a static version of the library. I'll have to take a look and have it use the dynamic library to further optimize for size.
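
Whether libgdal links libtiff dynamically can be checked by looking for it in ldd's output. This is a rough sketch: `links_dynamically` is a hypothetical helper, and the libgdal path in the example is an assumption about where the build installs it.

```shell
# links_dynamically is a hypothetical helper, not part of GeoLambda.
# Usage: links_dynamically FILE NAME
# Succeeds if FILE (a binary or shared object) lists a dynamic
# dependency whose line contains NAME.
links_dynamically() {
  ldd "$1" 2>/dev/null | grep -qF -- "$2"
}

# Example (library path assumed; adjust to the actual install location):
# if links_dynamically "$PREFIX/lib/libgdal.so" libtiff; then
#   echo "libtiff is a dynamic dependency"
# else
#   echo "libtiff not listed; it may be linked statically"
# fi
```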

@tsemerad Anything else to add here that might help someone else if they find this thread? If not we can close it.