vtsykun/packeton

How to use an S3 bucket / persistent storage for storing repository files

mlalwani6191 opened this issue · 5 comments

Hi,
When we submit a package in Packeton, it gets stored at /data/composer/cache/repo inside the container. Is there a way to use an S3 bucket or any other persistent store to avoid data loss?

Thanks

Hi,
S3 is not supported. Packeton uses composer/composer, which relies on a local filesystem cache.

But you can mount an EBS volume into the instance filesystem (if it runs on AWS) for persistent storage and use it with Docker volumes. /data is defined in the Dockerfile as a data volume.

services:
    packagist:
        ...
        volumes:
            - /mnt/storage/data:/data
xvilo commented

We are investigating whether Packeton is a good alternative to the (most likely) now-dead Repman project. However, we intend to run Packeton on Kubernetes, and in that case (also taking scaling into account) it would be very beneficial to have S3 storage support, or some sort of replication between Kubernetes pods and a central storage place.

ReadWriteMany volumes are definitely not available by default on many storage implementations, and I could not find a Helm chart for Packeton. Any ideas on how best to run it, or whether we can implement S3/object storage?
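For context, what we have in mind is roughly the following single-replica setup with a PVC mounted at /data (a sketch only; the image name, storage size, and labels are assumptions, not from the Packeton docs):

```yaml
# Hypothetical sketch: Packeton on Kubernetes with a PVC for /data.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: packeton-data
spec:
  accessModes: ["ReadWriteOnce"]   # ReadWriteMany would be needed for >1 replica
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: packeton
spec:
  replicas: 1                      # scaling out needs shared storage (or S3 support)
  selector:
    matchLabels: { app: packeton }
  template:
    metadata:
      labels: { app: packeton }
    spec:
      containers:
        - name: packeton
          image: packeton/packeton:latest
          volumeMounts:
            - name: data
              mountPath: /data     # the data volume defined in the Dockerfile
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: packeton-data
```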

In this case S3 is not supported due to a limitation of the composer library. Let me explain a few details.
By default, /data/composer is the Composer HOME inside Docker (you can change it with an environment variable).
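For illustration, overriding the Composer home in docker-compose might look like this (COMPOSER_HOME is the standard Composer environment variable; the path and service name are just examples):

```yaml
services:
    packagist:
        environment:
            # COMPOSER_HOME moves the Composer home, and with it cache/repo and cache/vcs
            - COMPOSER_HOME=/data/composer
        volumes:
            - /mnt/storage/data:/data
```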

/data/composer/cache/repo - this directory is created by Composer and contains the composer.json cache for each sha1 reference. It speeds up updating package metadata when someone pushes a tag/commit and reduces OAuth API calls. If you delete this cache, updates become slower, but everything still works.

/data/composer/cache/vcs - this directory is created by Composer and is used when a VCS repo is cloned directly with an SSH key. We cannot move it to abstract storage, because Composer spawns local processes to execute commands like git show-ref --tags --dereference.

/data/zipball/ - this directory is used to cache the zipped packages for fast downloads. I think we could move it to S3 by replacing the Symfony filesystem with an abstract adapter like league/flysystem. As far as I know, Repman always creates archives for each release, but Packeton may lazily build an archive on the fly when a user requests it (it also cleans up the unused archive cache via cron). In some cases, downloading the archive directly from the GitHub/GitLab API using OAuth, or via git archive --format=zip, will be faster than fetching it through an S3 adapter, so S3 is optional here too.

xvilo commented

So in short, scaling horizontally is not possible unless you set up network storage for those paths?

I think scaling horizontally is possible. You only need to share the lock and Redis via LOCK_DSN and REDIS_URL. It may not be a clean setup, since the Composer filesystem cache may be inconsistent between nodes, but that won't affect anything. The /packages.json Composer metadata is served from the database and Redis at the user's request, without the filesystem cache.
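As a rough sketch, such a horizontally scaled setup could look like the following docker-compose fragment (the service names, image tags, and the Redis DSN format are assumptions; only the LOCK_DSN and REDIS_URL variable names come from the discussion above):

```yaml
services:
    packagist:
        image: packeton/packeton:latest
        deploy:
            replicas: 2                       # multiple near-stateless web nodes
        environment:
            - REDIS_URL=redis://redis:6379    # shared metadata store
            - LOCK_DSN=redis://redis:6379     # shared locks between nodes
    redis:
        image: redis:7
```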