jeff-zucker/solid-file-client

Add traverse function

Otto-AA opened this issue · 3 comments

Description

I think we will need to iterate recursively over all items of a folder in several methods, so a common traverse method seems appropriate. It would be called on a folder and execute a callback for each item in it. Depending on the options, it would use a different callback order (parent before contents, or contents before parent) and a different maximum depth.

Use cases

  • Recursive Deletion
  • Recursive copy
  • Recursive zipping
  • Recursive search/listing

Features / options

I think the hardest part of the implementation is deciding what should be supported. In my opinion, the important features are:

  • Traverse in Pre-order (callback with parent before callback with contents; useful for copy)
  • Traverse in Post-order (callback with contents before callback with parent; useful for delete)
  • max-depth option (don't go deeper than n folders)
  • manual stepping / possibility to cancel traversal (e.g. for stopping the search when the first element is found)
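
One way to cover manual stepping and cancellation with a single mechanism is an async generator: the caller pulls one folder at a time and cancels simply by leaving the loop. The following is only a sketch under assumptions. `walk`, `readFolder`, `visitedFolders`, the option names `order` and `maxDepth`, and the in-memory `tree` are hypothetical stand-ins for illustration, not part of solid-file-client.

```javascript
// Hypothetical in-memory folder listing standing in for network reads.
const tree = {
  '/root/': { folders: ['/root/a/', '/root/b/'], files: ['/root/x.txt'] },
  '/root/a/': { folders: ['/root/a/deep/'], files: ['/root/a/y.txt'] },
  '/root/a/deep/': { folders: [], files: ['/root/a/deep/z.txt'] },
  '/root/b/': { folders: [], files: [] }
}

// Stand-in for a function that fetches and parses a folder (assumption).
async function readFolder (url) {
  return tree[url]
}

// Yields { url, items, depth } once per folder:
// parent first for order 'pre', contents first for order 'post'.
async function * walk (folderUrl, { order = 'pre', maxDepth = Infinity } = {}, depth = 0) {
  const items = await readFolder(folderUrl)
  const entry = { url: folderUrl, items, depth }
  if (order === 'pre') yield entry
  if (depth < maxDepth) {
    for (const sub of items.folders) {
      yield * walk(sub, { order, maxDepth }, depth + 1)
    }
  }
  if (order === 'post') yield entry
}

// Collects folder urls in visiting order and stops once `stopAt` has been seen.
async function visitedFolders (rootUrl, options, stopAt) {
  const urls = []
  for await (const { url } of walk(rootUrl, options)) {
    urls.push(url)
    if (url === stopAt) break // cancelling is just leaving the loop
  }
  return urls
}
```

Note that this sketch reads subfolders one after another rather than in parallel, which trades some speed for a predictable stepping order.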

Implementation

Here is a simplistic version of how it could be implemented and used (in pseudo-javascript):

// Assumes getFolderItems(url) resolves to { folders, files }, each entry having a url.
// Further options (withLinks, ...) are passed through unchanged.
async function traversePreOrder (folderUrl, folderCallback, { depth = Infinity, ...options } = {}) {
  const items = await getFolderItems(folderUrl)
  try {
    await folderCallback(folderUrl, items)
  } catch (e) {
    // A CancellationException stops descending into this folder; anything else is rethrown
    if (e instanceof CancellationException) return
    throw e
  }
  if (depth > 0) {
    await Promise.all(items.folders.map(folder =>
      traversePreOrder(folder.url, folderCallback, { depth: depth - 1, ...options })))
  }
}

async function traversePostOrder (folderUrl, folderCallback, { depth = Infinity, ...options } = {}) {
  const items = await getFolderItems(folderUrl)
  if (depth > 0) {
    await Promise.all(items.folders.map(folder =>
      traversePostOrder(folder.url, folderCallback, { depth: depth - 1, ...options })))
  }
  return folderCallback(folderUrl, items)
}

This could be used like this:

// Recursively delete folder at url
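// (deleteFile is a placeholder name; items.files is assumed to list the files with their urls)
await traversePostOrder(url, (folderUrl, items) =>
  Promise.all(items.files.map(file => deleteFile(file.url))))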

// Recursively copy folder from src to dest
await createFolder(dest)
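await traversePreOrder(src,
  async (folderUrl, items) => {
    // Map each source url to the corresponding url below dest
    // (assumes src and dest both end with a slash and items.files lists the files)
    const toDest = url => dest + url.slice(src.length)
    await createFolder(toDest(folderUrl))
    await Promise.all(items.files.map(file => copyFile(file.url, toDest(file.url))))
  })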

// Recursively list folder contents
const contents = []
await traversePreOrder(url, (folderUrl, items) => contents.push(items))

// Recursively zip folder
let zip = new ZIP()
await traversePreOrder(url,
  async (folderUrl, items) => {
    zip.addFolder(folderUrl)
    // items.files is assumed to list the files of the folder
    for (const file of items.files) {
      zip.addFile(await getContents(file.url))
    }
  })

// Search for file which starts with prefix
let res = null
await traversePreOrder(url,
  (folderUrl, items) => {
    // items.files is assumed to list the files of the folder
    for (const file of items.files) {
      if (file.name.startsWith(prefix)) {
        res = file
      }
    }
    if (res !== null) throw new CancellationException('Found item')
  })

The main reason for this suggestion is that #140 and similar features would be easier to add once solid-file-client supports a common traversal method.

@CxRes - have you seen this? How do you think it might relate to your unified functions PR?

CxRes commented

In principle, this is a nice idea. In practice, it is a little more tricky. As I have noted elsewhere, one of my design goals is to minimize the number of times we hit the network, even if that means a little more code.

For the unified copy function, I start with a single GET call. If I get a file, I just write that data to the destination; if I get a folder, I parse it and kick off the recursion. With a traverse function, I would need to either waste my GET data when I get a folder instead of a file, or place a HEAD call first to check, followed by a GET on a file or a traverse on a folder. That is an extra network call either way! There might be a similar issue for recursive zipping, I think...
(One way to get around this would be to have a private traverse function that is kicked off from a GET response, while the public version takes a url. I am open to better ideas.)
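
To illustrate the private/public split suggested in the parenthesis above, here is a minimal sketch in which every resource is fetched exactly once. Everything in it is an assumption made for illustration: `server`, `get`, `traverseFromResponse`, `traverse`, `copy` and the shape of a response (`isFolder`, `children`, `body`) are hypothetical and do not reflect the actual unified copy code.

```javascript
// Hypothetical stand-in for the server: url -> resource.
const server = {
  '/docs/': { isFolder: true, children: ['/docs/a.txt', '/docs/sub/'] },
  '/docs/a.txt': { isFolder: false, body: 'A' },
  '/docs/sub/': { isFolder: true, children: ['/docs/sub/b.txt'] },
  '/docs/sub/b.txt': { isFolder: false, body: 'B' }
}
let getCount = 0 // counts simulated network calls

// Stand-in for a single GET request (assumption: the response reveals whether it is a folder).
async function get (url) {
  getCount++
  return { url, ...server[url] }
}

// Private variant: starts from a response that has already been fetched.
async function traverseFromResponse (response, visit) {
  await visit(response)
  if (!response.isFolder) return
  for (const childUrl of response.children) {
    await traverseFromResponse(await get(childUrl), visit)
  }
}

// Public variant: takes a url and performs the first GET itself.
async function traverse (url, visit) {
  return traverseFromResponse(await get(url), visit)
}

// Copy built on top of traverse; records what would be written to the destination.
async function copy (src, dest) {
  const written = {}
  await traverse(src, res => {
    const target = dest + res.url.slice(src.length)
    written[target] = res.isFolder ? '<folder>' : res.body
  })
  return written
}
```

A caller that already holds a GET response can call `traverseFromResponse` directly and so avoids both the wasted GET and the extra HEAD.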

For remove this is not a problem, as I have the overhead of a HEAD call anyway to determine whether the resource is a Resource or a Container. In this case a traverse is useful.

Another thing for the design: there should be one function which has both a pre and a post callback. I cannot think of a scenario right now, but it seems to be more general!
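
A single function with both callbacks could look roughly like the sketch below. It is an illustration under assumptions, not a proposed API: `traverse`, the `pre` and `post` option names, `getSubfolders`, `trace` and the in-memory `folders` map are all hypothetical.

```javascript
// Hypothetical in-memory folder listing standing in for network reads.
const folders = {
  '/r/': ['/r/a/', '/r/b/'],
  '/r/a/': ['/r/a/c/'],
  '/r/a/c/': [],
  '/r/b/': []
}

// Stand-in for fetching the subfolders of a folder (assumption).
async function getSubfolders (url) {
  return folders[url]
}

// Calls `pre` before descending into a folder and `post` after all of its
// subfolders have been processed. Sibling folders are processed in parallel.
async function traverse (folderUrl, { pre = async () => {}, post = async () => {} } = {}) {
  await pre(folderUrl)
  const subfolders = await getSubfolders(folderUrl)
  await Promise.all(subfolders.map(sub => traverse(sub, { pre, post })))
  await post(folderUrl)
}

// Records the order of callback invocations, for illustration only.
async function trace (rootUrl) {
  const events = []
  await traverse(rootUrl, {
    pre: async url => { events.push('enter ' + url) },
    post: async url => { events.push('leave ' + url) }
  })
  return events
}
```

Passing only `pre` gives the pre-order behaviour described for copy, and passing only `post` gives the post-order behaviour described for delete, so the two separate functions become special cases of one.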

More generally, I was thinking about requesting a file walker, and a walker needs traversal!