abhishek-ram/django-pyas2

Disabling local file storage

lakemist opened this issue · 10 comments

Just started using pyas2. Thanks for all the work.

I was wondering if there is a way to disable all local storage. I am using django-storages to store FileFields to S3. It seems the calls to "store_file" are storing copies of data that are already stored in the models. In my case, the server's file system is ephemeral.

I have chosen to save the files using store_file for a reason.

There are two places where files are stored when pyAS2 sends or receives files. The first is "messages/__store", where the payload and headers are saved. I have used a FileField here, so Django uses the configured storage backend to save the files. This store is meant to be a backup of the files sent/received.

The second place is the DATA_DIR, where the payload is placed/picked up. Here I have assumed that the application that interacts with pyAS2 is present locally, and hence store_file writes these to the local folders.

I could make the store_file use the file storage API but I am curious to know how you are planning to integrate pyAS2 with the rest of your application stack?

My application stack is hosted on Kubernetes. AS2 is one of many microservices that can be configured based on the client integration choices. The services are monitored by Kubernetes and the underlying pods may be replaced without notice when an underlying issue is detected.

In this context, creating persistent storage on each pod is not practical. A local filesystem dependency makes it hard to scale the service across more than one pod, for example, which may be needed to avoid a single point of failure.

That makes sense @lakemist, but how are you planning on integrating pyAS2 with the rest of your applications? I know that if I update store_file to use a storage class you can then pick up the files from S3/GCS or whatever, but what is the plan for sending files? What did you have in mind?

I had a similar use case, where I needed 2 instances of AS2 for redundancy. The workaround was to use a network file system mounted on the DATA_DIR and messages paths. This did not need a change in the code.

I also use a persistent volume share with Kubernetes.

@abhishek-ram My setup: the pyAS2 app is integrated into a small Django project that provides internal REST views to query, send and receive the files processed by pyAS2. The project also has the ability to send messages when AS2 events occur (for future use).
So, my use case doesn't include much use of the admin commands. We do use the admin send action for testing.

If the app is changed to use either Django's DEFAULT_FILE_STORAGE or a new PYAS2_FILE_STORAGE setting, for example, I think we would get the best of both worlds. There would be minor changes to make to the send commands. Perhaps have a send_from_local flavor for those who need to send interactively from the local filesystem. The bulk send can still send files from the outbox, but would use listdir on the outbox instead of glob.glob(outbox_folder + '/*'), etc.
https://docs.djangoproject.com/en/3.0/ref/files/storage/#django.core.files.storage.Storage.listdir
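
To make the glob-to-listdir idea concrete, here is a rough sketch, not pyAS2's actual code. It relies only on two methods of Django's Storage API: listdir(path), which returns a (directories, files) tuple, and open(name, mode). The LocalStorageStandIn class and the iter_outbox_payloads helper are hypothetical names made up for this example; the stand-in exists only so the snippet runs without a configured Django project, and a real storage backend (FileSystemStorage, or an S3 backend from django-storages) would be passed in its place.

```python
import os
import tempfile


class LocalStorageStandIn:
    """Hypothetical stand-in exposing the two Storage methods used below."""

    def __init__(self, location):
        self.location = location

    def listdir(self, path):
        # Same return shape as Django's Storage.listdir: (directories, files).
        full = os.path.join(self.location, path)
        dirs, files = [], []
        for entry in sorted(os.listdir(full)):
            target = dirs if os.path.isdir(os.path.join(full, entry)) else files
            target.append(entry)
        return dirs, files

    def open(self, name, mode="rb"):
        return open(os.path.join(self.location, name), mode)


def iter_outbox_payloads(storage, outbox_folder):
    """Yield (name, content) for each file in the outbox, skipping subfolders."""
    _dirs, files = storage.listdir(outbox_folder)
    for filename in files:
        # Storage names use forward slashes regardless of the backend.
        name = f"{outbox_folder}/{filename}"
        with storage.open(name, "rb") as handle:
            yield name, handle.read()


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "outbox"))
    with open(os.path.join(root, "outbox", "order.edi"), "wb") as fh:
        fh.write(b"payload")
    for name, content in iter_outbox_payloads(LocalStorageStandIn(root), "outbox"):
        print(name, len(content))
```

Because the helper only talks to the storage object, the same loop would work whether the outbox lives on local disk or in a bucket, which is the point of the proposal.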

The goal is to be 100% backward compatible. So, setting the default value:
setting.FILE_STORAGE=django.core.files.storage.FileSystemStorage
should produce the exact same file structure on disk.
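
A possible shape for the backward-compatible default, as a sketch only. The setting name PYAS2_FILE_STORAGE is the hypothetical one floated above, and resolve_storage_path and load_class are made-up helper names. The loader uses plain importlib so the snippet runs on its own; inside a Django project, django.utils.module_loading.import_string does the same job.

```python
from importlib import import_module

# Default from the proposal above: plain local filesystem storage.
DEFAULT_FILE_STORAGE_PATH = "django.core.files.storage.FileSystemStorage"


def resolve_storage_path(settings_obj, name="PYAS2_FILE_STORAGE"):
    """Return the configured dotted path, or the local-disk default if unset."""
    return getattr(settings_obj, name, DEFAULT_FILE_STORAGE_PATH)


def load_class(dotted_path):
    """Import 'package.module.ClassName' and return the class object."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ImportError(f"{dotted_path!r} is not a dotted path")
    return getattr(import_module(module_path), class_name)
```

With the setting left out, the resolved path is FileSystemStorage, so existing installs would keep writing to the same folders; pointing the setting at an S3 backend would switch storage without touching the send/receive code.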

Maybe you can see other hurdles that I can't?

@lakemist I like your idea. I will try implementing this over the weekend

Very nice. Thanks. I am willing to contribute. Just lmk.

I have released a new version that supports cloud storage. Check it out.

Looks good!