Run LibreOffice in AWS Lambda to create PDFs & convert documents

Serverless LibreOffice

👉🏻 Read the blog post on Medium: How to Run LibreOffice in AWS Lambda for Dirty-Cheap PDFs at Scale 👈🏻

Show Me the Code

This repo contains code used to run the online demo.

├── compile.sh  <-- commands used to compile LibreOffice for Lambda
├── infra       <-- terraform config to deploy example Lambda
│   ├── iam.tf
│   ├── lambda.tf
│   ├── main.tf
│   ├── s3.tf
│   └── vars.tf
└── src         <-- example Lambda function in Node.js used for the website demo
    ├── handler.js
    ├── libreoffice.js
    ├── logic.js
    ├── package.json <-- put lo.tar.gz in this folder to deploy. Download it below
    └── s3.js

A compiled, ready-to-use archive can be downloaded from the Releases section. Also check out the NPM package with bundled LibreOffice for Lambda (85 MB).

✨ Check out a new Lambda Layer with LibreOffice!

How to compile by yourself

Check out a comprehensive step-by-step tutorial from 0 to deployed function.

To run this, you will need Docker and docker-compose installed.

  1. Install and configure Docker and docker-compose locally or on a c5.2xlarge spot instance with ~8 GB (the default) of storage attached.
  2. In a terminal, run docker-compose run --rm libreoffice. It compiles LibreOffice and then copies layers.zip to your local drive.

Help

Related Projects

How To Help

Reduce Cold Start Time

Currently, the Lambda function unpacks a 109 MB .tar.gz into the /tmp folder, which takes ~1-2 seconds on a cold start.

It would be nice to create a single compressed executable to save unpack time and increase portability. I tried the Ermine packager and it works! Unfortunately, it is commercial software. Statifier, a similar open-source tool, produces broken binaries.

Maybe someone has another idea for how to create a single executable from a folder full of shared objects.

UPD: TODO: Check out node-packer and libsquash (no FUSE required!)

Further Size Reduction

I am not a Linux or C++ expert, so I have surely missed some easy "hacks" to reduce the size of the compiled LibreOffice.

Mostly, I just excluded as much unrelated stuff from compilation as possible and stripped symbols from the shared objects.

Here are lists of the RPM packages and the libraries available in the AWS Lambda environment, which can be helpful.

You can also combine several compression levels: compress the binaries with upx, compress the archive with brotli, and then decompress the brotli layer at runtime.
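As a rough illustration of the brotli step, here is a sketch that decompresses a brotli-compressed file using Node's built-in zlib module. The function name inflateBrotli and the file paths are hypothetical. The synchronous call holds the whole file in memory, which is an assumption that may not suit a ~100 MB archive.

```javascript
const fs = require('fs');
const zlib = require('zlib');

// Reads a brotli-compressed file and writes the decompressed bytes to outputPath.
// Loads the entire file into memory, so it is only a sketch for modest sizes.
function inflateBrotli(inputPath, outputPath) {
  const compressed = fs.readFileSync(inputPath);
  const raw = zlib.brotliDecompressSync(compressed);
  fs.writeFileSync(outputPath, raw);
  return outputPath;
}

module.exports = { inflateBrotli };
```

For large archives, piping fs.createReadStream through zlib.createBrotliDecompress() into a write stream avoids buffering everything at once.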

Testing

Update the repo for testing: for example, return before the S3 step, hardcode or generate the files to convert, and set up the variables. Then simply run:

docker run \
 -v "$PWD":/var/task \
 lambci/lambda:nodejs12.x src/handler.handler

After a successful execution, copy the resulting files out of the container to check the PDFs.

docker ps -a

Find the exact container ID.

Then execute:

docker cp containerId:/tmp/filename.pdf ./filename.pdf

Then check your results locally.
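One quick sanity check for a copied file, sketched under the assumption that a valid conversion produces a file starting with the standard PDF signature %PDF-. The helper name looksLikePdf is hypothetical and not part of this repo. It only inspects the header, so it does not prove the document renders correctly.

```javascript
const fs = require('fs');

// Returns true if the file starts with the PDF signature "%PDF-".
function looksLikePdf(filePath) {
  const fd = fs.openSync(filePath, 'r');
  try {
    const header = Buffer.alloc(5);
    const bytesRead = fs.readSync(fd, header, 0, 5, 0);
    return bytesRead === 5 && header.toString('latin1') === '%PDF-';
  } finally {
    fs.closeSync(fd);
  }
}

module.exports = { looksLikePdf };
```

For example, looksLikePdf('./filename.pdf') can be run on the file copied out of the container before opening it in a viewer.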

License

MIT © Vlad Holubiev