Rework backup.sh pipelines

Question

Closed this issue 5 years ago · 3 comments

Lex-2008 commented 6 years ago

Currently, the output of comm is separated into "new" and "deleted" files, which are fed into four "{operate,sql} on {old,new} files" loops. Instead, it should work like this:

The output of comm is cleaned [*] and goes into two pipes: one for files and one for sql.

"for sql" pipe gets processed by sed and piped into sqlite:
- sed builds sql commands for both new and old files
- sqlite can use a transaction here

"for files" pipe:
- either split it into two, "new files" and "old files", and operate on them as before
- or use sed expressions to build shell commands that operate on the files (note that this will likely require the $BACKUP_LIST file to contain file sizes, too, since we add them to the database. But maybe that's a good thing?)

[*] Cleaning the output of comm: remove filenames containing ", remove inode numbers, and maybe replace tabs with a simpler way of distinguishing new/deleted files (N /D as before).
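
A rough sketch of the cleaning step and the "for sql" pipe described above. It is not the actual backup.sh code. The list file names, the `files` table with its `name` column, and the helpers `clean_comm` and `build_sql` are all hypothetical, and the sample lists hold plain file names without inode numbers or sizes.

```shell
#!/bin/sh
# Sketch only: list names, the "files" table and both helpers are hypothetical.
tmp=$(mktemp -d)
TAB=$(printf '\t')

# Two sorted file lists standing in for the previous and current runs.
printf '%s\n' 'a.txt' 'b.txt' 'old.txt' | LC_ALL=C sort > "$tmp/old.list"
printf '%s\n' 'a.txt' 'b.txt' 'new "quoted".txt' 'new.txt' | LC_ALL=C sort > "$tmp/new.list"

# comm -3 prints lines unique to the first file without a prefix and
# lines unique to the second file prefixed with a tab.
clean_comm() {
    LC_ALL=C comm -3 "$1" "$2" |
        grep -v '"' |
        sed -e "s/^\([^$TAB]\)/D \1/" -e "s/^$TAB/N /"
}

# Turn "N name" / "D name" lines into SQL wrapped in a single transaction.
# Single quotes inside names are doubled so the SQL string literal stays valid.
build_sql() {
    echo 'BEGIN TRANSACTION;'
    sed -e "s/'/''/g" \
        -e "s/^N \(.*\)/INSERT INTO files (name) VALUES ('\1');/" \
        -e "s/^D \(.*\)/DELETE FROM files WHERE name = '\1';/"
    echo 'COMMIT;'
}

# Keep the cleaned stream in a file so a "for files" step could reuse it.
clean_comm "$tmp/old.list" "$tmp/new.list" > "$tmp/diff.list"
sql=$(build_sql < "$tmp/diff.list")
printf '%s\n' "$sql"

rm -rf "$tmp"
```

In real use the generated statements would be piped into the sqlite3 command-line shell instead of being printed. Wrapping them in one transaction means sqlite commits once for the whole run instead of once per statement.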

Sorting the output of comm has pros and cons:

- good for grouping operations in one directory together instead of running around
- bad since we need to wait for comm to finish

But it seems to have only a small impact in real life anyway.

Answer 1 · 2019-09-17T16:10:52.000Z

Original issue being addressed in del-date-in-filename branch:

Instead of read; case, use a sequence of sed (and maybe xargs) commands.

Hopefully it will increase speed, but it will also fix handling of filenames ending with a space, like "name ": read trims the trailing space away in this case.
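
The trimming behaviour can be reproduced in isolation. This is a small sketch with illustrative variable names, not code taken from backup.sh:

```shell
#!/bin/sh
name='name '   # a file name ending with a space

# Plain `read` splits on IFS and strips leading and trailing whitespace.
plain=$(printf '%s\n' "$name" | { read line; printf '%s' "[$line]"; })

# Emptying IFS for the read (plus -r for backslashes) keeps the line intact.
kept=$(printf '%s\n' "$name" | { IFS= read -r line; printf '%s' "[$line]"; })

# A sed-only pipeline never goes through `read`, so nothing is trimmed.
viased=$(printf '%s\n' "$name" | sed 's/.*/[&]/')

printf '%s\n' "plain read: $plain" "IFS= read:  $kept" "sed:        $viased"
```

The brackets make the trailing space visible: only the plain `read` variant loses it.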

Updating the issue for one sqlite pipeline instead of two parallel ones

Answer 2 · 2019-09-17T22:21:14.000Z

Will probably get rid of file sizes and make #8 obsolete

Answer 3 · 2019-09-18T16:56:22.000Z

Implemented