Link.resolve has issues
Closed this issue · 6 comments
The functionality of Link.resolve
duplicates to some extent functionality that could be provided by urllib.parse.urljoin
in the standard library. (cf. cldf/cookbook#11).
The important addition is the handling of Path
objects, but
Line 111 in c3a3b23
is an explicit instance check and the
pathlib.Path
is likely to be imported from pathlib2
, which leads to huge confusion when using a pathlib.Path
object instead. Some duck typing solution might be more useful and less confusing.
AttributeError: 'PosixPath' object has no attribute 'endswith'
>>> import pdb
>>> pdb.pm()
> /home/gereon/devel/csvw/src/csvw/metadata.py(102)resolve()
-> if not base.endswith('/'):
(Pdb) l
97 def resolve(self, base):
98 if not base:
99 return self.string
100 if isinstance(base, pathlib.Path):
101 return base / self.string
102 -> if not base.endswith('/'):
103 base += '/'
104 return base + self.string
105
106
107 def link_property():
(Pdb) base
PosixPath('/home/gereon/databases/lexirumah-data/cldf')
(Pdb) isinstance(base, pathlib.Path)
False
(Pdb) isinstance(base, pathlib.PosixPath)
False
(Pdb) type(base)
<class 'pathlib.PosixPath'>
is just very confusing.
Hm. Do you have any suggestions how to best detect a Path
? Maybe we should just handle the isinstance(base, str)
case first and for the remainder just assume it's a Path
?
Duck typing? Asking for forgiveness? I chose .joinpath()
instead of /
because it might be more transparent in case something really strange accidentally becomes base
.
try:
return base.joinpath(self.string)
except AttributeError:
urllib.parse.urljoin(base, self.string)
Ok, joinpath
should be innocent enough to discriminate between strings and paths.
In Python 3.6 there is fspath and os.PathLike
:
Libraries wishing to support path objects and a version of Python prior to Python 3.6 and the existence of os.fspath() can use the idiom of path.fspath() if hasattr(path, "fspath") else path.
On PY2 though only pathlib2
and not pathlib
seems to have __fspath__
Whatever happened to my suggestion to use clean existing URL logic in the form of urllib.parse.urljoin(base, self.string)
? Did that get explicitly rejected for good reason, or did it just get neglected?
Just neglected, I guess; or rather shadowed by the issue of how to detect Path
objects.