Functions to handle directories
Hello!
After working with script for a while (nice work btw!) I miss some basic functions that I eventually implemented (in a quick and dirty way) for myself.
Most of them have to do with directory manipulation, such as "list of files in a directory" and "list of directories in a directory", as well as "by date" variants, which are super useful when we need, for instance, to find the N newest files in a directory.
Would there be interest in PRs with these?
Hi @marcopaganini! Thanks for opening the issue. Could you show an example program here which you wrote using one or more of your proposed functions? It'll be very useful to be able to see them in context.
Hello!
Yes, sure.
Right now, it's not integrated with the script way of doing things at all; naturally, a PR would do it better. This was a quick hack that I had to do for a simple tool that looks at the last file in a list of files and then looks for a string inside that file (it basically checks whether my backups succeeded).
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"

	"github.com/bitfield/script"
)

func backupStatus() error {
	var err error

	if len(os.Args) != 1 {
		return errors.New("use: backup-status")
	}
	logroot := "/var/log/netbackup"

	// The directory structure for logs is:
	// /var/log/netbackup / backup-names... / backup-files [.gz]
	logdirs, err := listDirs(logroot)
	if err != nil {
		return err
	}

	for _, logdir := range logdirs {
		logfiles, err := listFilesByDate(filepath.Join(logroot, logdir.Name()))
		if err != nil {
			return err
		}
		if len(logfiles) == 0 {
			continue
		}
		latestFile := logfiles[len(logfiles)-1].Name()
		latestPath := filepath.Join(logroot, logdir.Name(), latestFile)

		var out string

		// Gzip decompress on the fly if needed.
		if strings.HasSuffix(latestFile, ".gz") {
			out, err = script.File(latestPath).Filter(gunzip).Last(10).String()
		} else {
			out, err = script.File(latestPath).Last(10).String()
		}
		if err != nil {
			return err
		}

		// Results
		if strings.Contains(out, "Backup Result: Success") {
			fmt.Println("✅", latestFile)
		} else {
			fmt.Println("❌", latestFile)
			fmt.Println(out)
		}
	}
	return nil
}

Great, thanks! Would you now like to try rewriting this program using your proposed script functions, and show how they would work, and how much shorter and clearer it makes the calling code? That'll help us to nail down what specifically is proposed, and also make the case for why it's needed in script.
@marcopaganini just checking if you're still interested in this.
Yes! Sorry just stuck with work. Will get back to this as soon as possible.
OK, found a few minutes with my head out of the water, so I can comment. :)
I've been thinking about which approach would be the most flexible and "pipe compatible". What I did in my original code was to create an ad-hoc function called listFilesByDate, but I don't think that's very much in the spirit of the library in general.
To list files in a directory (without recursing) the library already provides ListFiles, but like I said, I (at least) commonly need to list files by date (ls -tr) or even by size (ls -rS). I think for those uses, a function that reads the Pipe and runs a stat() on every single file and sorts by mtime (ascending or descending) could be useful. To fetch the newest file in a directory, we'd do something like:
// SortByTime is the proposed method; f is treated here as a sorted list of paths.
f := script.ListFiles("/foo/bar").SortByTime()
fmt.Printf("Oldest file = %s\n", f[0])
fmt.Printf("Newest file = %s\n", f[len(f)-1])

Pros:
- This is compatible with ListFiles.
- It would be trivial to implement ListDirs, which would also be compatible with it.
- It should be compatible with FindFiles.
- Most of the code could be used for something like SortBySize, which is also useful.
Cons:
- It's somewhat expensive, as it needs to run stat on every single file or directory in the list.
Please let me know what you think.
Regards
This would be a neat thing to be able to do, but to do it properly we really need some concept of structured records rather than plain strings. For example, like Nushell.
If we had that, then you could generate tables of all kinds of data, not just files, and sort and query it by all sorts of attributes, not just time. This is something I've been thinking about for a while and it's such a major API change that it would really have to be a script/v2, or just a differently-named package. But I think very much worth doing.
I didn't know about Nushell. That's actually quite interesting, thanks.
For script, one option is to assume that the input to something like SortBySize consists of file paths. Of course, it's up to the programmer to make sure that's the case (just like in the shell, BTW).
Regarding changes to support structured data, as long as people don't peek inside Pipe, what prevents it from happening now? A v2 shouldn't be needed, because it won't change the interface with the user. Basically, ListFiles would populate a structure (or even store a JSON string that can be parsed in multiple ways) inside the structure. Things like SortBySize would be smart enough to read the right metadata and produce the correct output, but "pre-structured" tools could still work as they work today (it would even be possible to fill in the metadata in addition to the existing unstructured data).
A JSON string is a neat idea, since we already have JQ.
One problem with sorting in general is that it doesn't work very well with asynchronous pipelines. In order to sort by anything, a pipe stage has to read its entire input—and that might take arbitrarily long, depending on what is upstream. Not an insuperable obstacle, but just worth noting—we don't even have Sort for strings at the moment.
Why don't you try prototyping this with a custom FilterLine function and see how it cleans up your example program?
Hi, I was looking into making a contribution to script and saw that this issue is still open -- is there any way I can help with this?
Sure @omgoswami, you're most welcome to contribute. I don't know if this is the best issue to start with, since it was vague from the outset and even the proposer has apparently lost interest in refining it. But there are plenty of open issues for you to look at, and if you can progress them, or suggest a new one, please do!