quiltdata/quilt

Verify package name to ensure that it can be imported

mjmikulski opened this issue · 2 comments

Situation

When creating a package:

import quilt3
quilt3.config(default_remote_registry='s3://your-bucket')
p = quilt3.Package()
p.push("username/packagename")

The package name can be any string. In particular it may be e.g. fashion-mnist.

Why is it wrong?

I would like to install the package and import it as suggested:

from quilt3.data.username import packagename

but here the packagename must be a correct python identifier. For example, it cannot contain -.

Solutions

Verify the package name during package creation and warn (or raise an exception) if it is not a valid python identifier. One can use the python built-in string method isidentifier:

>>> 'a-b'.isidentifier()
False
>>> 'a_b'.isidentifier()
True

I could create a pull request with such a change. So please share your thoughts about it.

Good catch. Our users have definitely expressed that Python identifiers are too limiting for package names. I would suggest that we deprecate this feature, as there are other bugs in import.py at the moment -- and it will never work for certain Python names (i.e. the Python import syntax will never support arbitrary strings)--so we are better off avoiding the inconsistency of "some packages import some don't." .browse is the programmatic equivalent and does not have the limitations of import. Thoughts?

Sure that it can be limiting. But if it just a warning, it should be fine. Anyway, if loading data by import is going to be deprecated, then it may be indeed pointless to warn.