internalfx/rethinkdb-regrid

Purging deleted files

Closed this issue · 5 comments

Is there a way to remove a file, so that the space used for it in the database is reclaimed?

Also, is there a way to control how many revisions should be kept?

Not at the moment...

I think the idea is to eventually provide a CLI tool along with regrid that would enable some of these management tasks.

I'm going to keep this open for discussion.

ping @bchavez, @danielmewes in case they have input.

I only have knowledge of the C# implementation.

The C# implementation has a GridUtility class with methods that provide low-level ReGrid functions. A few of them are EnumerateFileEntries, EnumerateChunks, and GetChunk. These methods give you a way to enumerate soft-deleted files and their associated chunks in the underlying implementation.

You can use the information provided by GridUtility to physically delete those soft-deleted chunks and reclaim space. At the time of this comment, RethinkDB doesn't support transactions across multiple documents, so you'll need a dedicated thread (or an async process) that can recover from an interrupted hard delete. AFAIR that doesn't require anything fancy: delete all the Chunks first, then delete the FileInfo entry. If the process dies partway through, the FileInfo entry is still there, so the next pass can pick the file up again and finish. This async garbage collector would just sit alongside the database, periodically reclaiming space.
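The ordering described above can be sketched with in-memory stand-ins for the two tables. This is only an illustration, not the C# GridUtility or regrid API: the `files` and `chunks` maps, the `status` values, and the `addFile`, `purgeFile`, and `collectGarbage` names are all hypothetical. It shows why deleting chunks before the file entry keeps an interrupted purge recoverable.

```javascript
// In-memory stand-ins for the file and chunk tables (hypothetical shapes).
const files = new Map();   // fileId -> { status: 'Completed' | 'Deleted' }
const chunks = new Map();  // `${fileId}:${num}` -> { fileId, num }

function addFile(fileId, chunkCount, status) {
  files.set(fileId, { status });
  for (let num = 0; num < chunkCount; num++) {
    chunks.set(`${fileId}:${num}`, { fileId, num });
  }
}

// Hard-delete one file: chunks first, file entry last.
// `failAfter` simulates a crash after that many chunk deletions.
function purgeFile(fileId, failAfter = Infinity) {
  let deleted = 0;
  for (const [key, chunk] of [...chunks]) {
    if (chunk.fileId !== fileId) continue;
    if (deleted >= failAfter) throw new Error('simulated crash');
    chunks.delete(key);
    deleted++;
  }
  // Only reached once every chunk is gone.
  files.delete(fileId);
}

// One garbage-collection pass: purge every soft-deleted file,
// including files left half-purged by an earlier crash.
function collectGarbage() {
  for (const [fileId, file] of [...files]) {
    if (file.status === 'Deleted') purgeFile(fileId);
  }
}
```

If `purgeFile` fails midway, the file entry remains with its `Deleted` status, so the next `collectGarbage` pass finds it and removes the leftover chunks. Deleting the file entry first would instead leave orphaned chunks with nothing pointing at them.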

You can find more information about deleting here in the C# documentation at the very end of the page:
https://github.com/bchavez/RethinkDb.Driver/wiki/ReGrid-File-Storage#delete

I need to delete files in my project. Here is basically what I'm doing to add "hard delete" functionality without forking regrid.

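// Assumes `r` is a driver instance whose run() needs no explicit
// connection (e.g. rethinkdbdash) and `bucket` is a regrid bucket.
const co = require('co');

// Note: always call delete before purge.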
bucket.purge = co.wrap(function* purgeId(fileId) {
  // Delete the file chunks.
  yield r.table(bucket.conf.chunkTable)
    .between([fileId, r.minval], [fileId, r.maxval], { index: 'chunk_ix' })
    .delete()
    .run();

  // Delete the file document.
  yield r.table(bucket.conf.fileTable).get(fileId).delete().run();
});

I prefer the C# implementation; it would be nice if we could call bucket.delete(fileId, { mode: 'hard' }).

@internalfx if I make a pull request to add the mode option to delete(), would you review it and potentially merge it in?
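To make the proposal concrete, here is a rough sketch of how a `mode` option on `delete()` could behave, using an in-memory stand-in for the bucket. None of this is regrid's actual API: `createBucket`, `put`, the `status` field, and the choice of `'soft'` as the default are assumptions made for illustration.

```javascript
// Hypothetical in-memory bucket; not regrid's real implementation.
function createBucket() {
  const files = new Map();   // fileId -> { status }
  const chunks = new Map();  // `${fileId}:${num}` -> { fileId, data }

  return {
    files,
    chunks,

    // Store a file as a list of chunk payloads.
    put(fileId, parts) {
      files.set(fileId, { status: 'Completed' });
      parts.forEach((data, num) => {
        chunks.set(`${fileId}:${num}`, { fileId, data });
      });
    },

    // Soft delete only marks the file; hard delete also reclaims space.
    delete(fileId, { mode = 'soft' } = {}) {
      if (mode !== 'soft' && mode !== 'hard') {
        throw new Error(`unknown delete mode: ${mode}`);
      }
      const file = files.get(fileId);
      if (!file) throw new Error(`no such file: ${fileId}`);

      file.status = 'Deleted';
      if (mode === 'soft') return;

      // Hard mode: remove the chunks first, then the file entry.
      for (const [key, chunk] of [...chunks]) {
        if (chunk.fileId === fileId) chunks.delete(key);
      }
      files.delete(fileId);
    },
  };
}
```

With this shape, `bucket.delete(fileId)` keeps the existing soft-delete behavior, and `bucket.delete(fileId, { mode: 'hard' })` marks the file as deleted and then purges it, so callers no longer need to remember to call delete before purge.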

I like it.

Send it in 👍