ohwgiles/laminar

Storing list of artifacts in database

KAction opened this issue · 5 comments

Currently, Laminar lists the $ARCHIVE directory of a particular run every time it displays the artifact links at the top of the page:

    KJ_IF_MAYBE(dir, fsHome->tryOpenSubdir("archive"/runArchive)) {
        for(kj::StringPtr file : (*dir)->listNames()) {
            kj::FsNode::Metadata meta = (*dir)->lstat(kj::Path{file});

which means I can't move files in $ARCHIVE elsewhere without losing these links. And I want to move these files to cheaper storage.

So I suggest that Laminar saves the list of artifacts right after the job finishes, and it is up to the reverse proxy to figure out where to find
laminar.example.com/archive/foo-job/10/debug.txt. What do you think?

The original idea supports moving archived artefacts to cheaper storage, but assumed this would be achieved by mounting or symlinking the archive directory appropriately. It's simpler to just iterate the folder dynamically, but iterating is more expensive, especially on slow storage or if there are many artefacts. I'm not opposed to your suggestion; I just wanted to check why mounting/symlinking would not work for you, since this proposal adds some (admittedly low) complexity versus the current situation.

If I want to archive artifacts on S3, mounting them so Laminar finds them means FUSE -- already extra complexity. Furthermore,
scanning S3 to render a job page (list of artifacts on the top) is both slow and costly.

Technically, I can keep empty files in /laminar/archive to inform Laminar about what artifacts are associated with the job, yet configure the reverse proxy to go to S3 instead, but that means using the filesystem as a database. Huge pain to back up. readdir(3) won't be happy.

Fair enough

@mitya57 are you still offering a PR for this? I'm happy with the justification

Sorry for the late response.

@mitya57 ended up with a Postgres-only fork: https://github.com/mitya57/laminar/tree/wip/postgres

That was necessary to speed things up by taking advantage of Postgres materialized views and other nice features. The patch to keep the artifact list in the database ended up tightly coupled to other changes, so I guess we can close the issue.