fsspec/universal_pathlib

Moving files via `UPath.rename` to absolute path locations

falexwolf opened this issue · 5 comments

Because of this:

def rename(
self, target, *, recursive=False, maxdepth=None, **kwargs
): # fixme: non-standard
if not isinstance(target, UPath):
target = self.parent.joinpath(target).resolve()
self.fs.mv(
self.path,
target.path,
recursive=recursive,
maxdepth=maxdepth,
**kwargs,
)
return target

the following happens:

from upath import UPath

path = UPath("s3://lamin-us-west-2/wXDsTYYd/.lamindb/S78k5961PekVzaLrQkGq")
path.rename("s3://lamin-us-west-2/wXDsTYYd/.lamindb/S78k5961PekVzaLrQkGq.csv")
# > S3Path('s3://lamin-us-west-2/wXDsTYYd/.lamindb/s3://lamin-us-west-2/wXDsTYYd/.lamindb/S78k5961PekVzaLrQkGq.csv')

To allow moving files across directories, .rename() should behave as it does in pathlib:
image

An easy fix would be to replace

def rename(
    self, target, *, recursive=False, maxdepth=None, **kwargs
):  # fixme: non-standard
    if not isinstance(target, UPath):
        target = self.parent.joinpath(target).resolve()

with

def rename(
    self, target, *, recursive=False, maxdepth=None, **kwargs
):  # fixme: non-standard
    
    if not isinstance(target, UPath):
        target_upath = UPath(target)
        if target_upath.is_absolute():
            target = target_path
        else:
            target = self.parent.joinpath(target).resolve()

This still clashes with pathlib because pathlib resolves relative to the CWD and not relative to self, but that's another discussion.

Let me know if you accept a PR for this!

Hi @falexwolf

Thank you for reporting the issue. And thanks for offering to contribute! PRs are very welcome.

As you've pointed out, we won't easily manage to be in line with pathlib here, due to the fact that we don't have a concept of cwd for remote filesystems.

Your suggested fix is good, and I would add a check to make sure that the provided target uses the same protocol (this is seemingly broken in the current version too):

def rename(
    self,
    target,
    *,
    recursive=False,
    maxdepth=None,
    **kwargs,
):  # fixme: non-standard

    target_protocol = get_upath_protocol(target)
    if target_protocol and target_protocol != self.protocol:
        raise ValueError(f"expected protocol {self.protocol!r}, got: {target_protocol!r}")
  
    if not isinstance(target, UPath):
        target = UPath(target)

    if not target.is_absolute():
        target = self.parent.joinpath(target).resolve()
    
    ...

If you could add a test for a filesystem that has an empty root_marker as for example "s3" and one for one that has "/" as a root_marker (for example "file") that would be great.

Cheers,
Andreas 😃

Thanks, Andreas! Will give it a shot!

Hi, @ap-- , tried implementing this and observed

`>>> from upath import UPath 
>>> path = UPath("s3://lamin-us-west-2/wXDsTYYd/.lamindb/S78k5961PekVzaLrQkGq") 
>>> path.is_absolute() 
Traceback (most recent call last): File "<stdin>", line 1, in <module> File "C:\Users\sergei.rybakov\Apps\Miniconda3\envs\nbproject\lib\pathlib.py", line 1006, in is_absolute return not self._flavour.has_drv or bool(self._drv) AttributeError: 'WrappedFileSystemFlavour' object has no attribute 'has_drv'`

Would you recommend adding custom is_absolute to CloudPath or trying to add has_drv to WrappedFileSystemFlavour?

Apologies, @ap--, that it took us so long to return to here. Also, we decided that @Koncopd would make the PR as he has more in-depth experience with the library than I.

ap-- commented

No worries ☺️

I got back from holiday yesterday and am going through my backlog in the evenings now. I'll provide more feedback later tonight or tomorrow.

And thank you so much for contributing!