Lex-2008/backup3

Rework backup.sh pipelines

Closed this issue · 3 comments

Currently, the output of comm gets separated into "new" and "deleted" files, which are fed into four "{operate,sql} on {old,new} files" loops. Instead, it should work like this:

The output of comm is cleaned [*] and then goes into two pipes: one for files and one for sql

  • "for sql" pipe gets processed by sed and piped into sqlite
    • sed builds sql commands for both new and old files
    • sqlite can use a transaction here
  • "for files" pipe:
    • either split into two: "new files" and "old files" and operate on them as before
    • or use sed expressions to build shell commands that operate on files (note that this will likely require the $BACKUP_LIST file to contain file sizes, too, since we add them to the database. But maybe that's a good thing?)
  • [*] cleaning the output of comm: remove filenames containing ", remove inode numbers, and maybe replace tabs with a simpler way of distinguishing new/deleted files (N/D, as before).
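
A minimal sketch of the "cleaning" step and the "for sql" pipe described above. Everything named here is an assumption for illustration: the `files` table and its `name`, `created`, `deleted` columns are hypothetical and not the schema backup.sh uses, the input is assumed to be `comm -3 old new` output without inode numbers, and the "N "/"D " prefixes are just one way to mark new and deleted entries.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual backup.sh code.

# Clean `comm -3 old new` output: drop names containing a double quote,
# then mark lines unique to the new list (leading tab) with "N " and
# lines unique to the old list (no leading tab) with "D ".
mark_changes() {
    tab=$(printf '\t')
    # `t` skips the "D " substitution when the "N " one already matched.
    sed -e '/"/d' -e "s/^$tab/N /" -e t -e 's/^/D /'
}

# Build one SQL script for both kinds of entries, wrapped in a single
# transaction. $1 is a timestamp; it must not contain "/" or "&",
# because it is pasted into the sed replacement text.
build_sql() {
    now=$1
    echo 'BEGIN TRANSACTION;'
    # Double any single quotes first so the names are safe SQL literals.
    sed -e "s/'/''/g" \
        -e "s/^N \(.*\)/INSERT INTO files (name, created) VALUES ('\1', '$now');/" \
        -e "s/^D \(.*\)/UPDATE files SET deleted = '$now' WHERE name = '\1' AND deleted IS NULL;/"
    echo 'COMMIT;'
}

# Possible usage (file and database names are placeholders):
#   comm -3 old.list new.list | mark_changes | build_sql "$(date +%F)" | sqlite3 backup.db
```

With this shape, sqlite reads a single stream of statements, so one transaction covers both the inserts and the updates.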

Sorting output of comm has pros and cons:

  • good for grouping operations in one directory together instead of running around
  • bad since we need to wait for comm to finish

but it seems to have little impact in real life anyway

Original issue being addressed in del-date-in-filename branch:

Instead of a read; case loop, use a sequence of sed (and maybe xargs) commands.

Hopefully it will increase speed, but it will also fix handling of filenames that end with a space, like "name ": read trims the trailing space away in this case.
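
The trailing-space problem can be reproduced with a small sketch. The helper names below are made up for the demonstration. The `IFS= read` variant is shown only for comparison; it is a common workaround, not what this issue proposes.

```shell
#!/bin/sh
# Hypothetical helpers; each prints its input line wrapped in brackets.

# Default IFS: read strips leading and trailing blanks from the line.
trimmed() { printf '%s\n' "$1" | { read -r line; printf '[%s]\n' "$line"; }; }

# Empty IFS for this one read: the line is kept intact.
kept() { printf '%s\n' "$1" | { IFS= read -r line; printf '[%s]\n' "$line"; }; }

# sed never splits the line into fields, so the trailing space survives.
via_sed() { printf '%s\n' "$1" | sed -e 's/^/[/' -e 's/$/]/'; }

trimmed 'name '   # prints [name]
kept 'name '      # prints [name ]
via_sed 'name '   # prints [name ]
```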

Updating the issue for one sqlite pipeline instead of two parallel ones

Will probably get rid of file sizes and make #8 obsolete

Implemented