bluelabsio/records-mover

Should data files in a records directory have a prefix?

Closed this issue · 2 comments

Right now, the spec just says that the names of data files in a records directory can't start with "_" or ".". That's useful, but since most tools that load data from, say, S3 support loading from a prefix, it might be helpful for us to put the data files under a consistent prefix (for example, data_ or data/). Records mover is hopefully doing the safest thing when it loads a records directory on its own (loading via the manifest, or by specifying each file listed in the manifest, for example), but a consistent prefix could still help when working with records directories OUTSIDE of records mover.
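
To make the difference concrete, here is a rough Python sketch of the two selection rules. The file names, the helper names, and the data_ prefix are all made up for illustration; none of this is records mover's API or part of the spec.

```python
from fnmatch import fnmatchcase


def data_files_current_rule(names):
    """Current rule: every file whose name does not start with '_' or '.'."""
    return sorted(n for n in names if not n.startswith(("_", ".")))


def data_files_with_prefix(names, prefix="data_"):
    """Proposed rule (hypothetical): only files matching a simple prefix glob."""
    return sorted(n for n in names if fnmatchcase(n, prefix + "*"))


# Hypothetical listing of a records directory.
listing = [
    "_manifest",
    "_schema.json",
    ".DS_Store",
    "data_0001.csv.gz",
    "data_0002.csv.gz",
    "notes.txt",  # stray file someone dropped into the directory
]

current = data_files_current_rule(listing)
prefixed = data_files_with_prefix(listing)
```

Under the current rule, the stray notes.txt counts as a data file, and an outside tool has to express "everything except _ and ." itself. With a prefix, the same tool could just be pointed at data_ (or data/) and would pick up only the two data files.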

@Brunope what do you think about this? I can imagine this being possible with a simple glob. Is this something people have mentioned wanting in your time spent troubleshooting with Data Science?

I have no idea, this isn't something I've heard mentioned.