skshetry/webdav4

Option to enable trailing slashes

bitzl opened this issue · 8 comments

bitzl commented

The WebDAV server I work with expectets trailing / at the end of directory urls and I can't figure out how to do this with this library (it's a 3rd-party service, so I can't change the server behaviour).

In any case of

client = Client("https://theuser.your-storagebox.de", auth=("theuser", "thepassword"))
client.exists("abc")
client.exists("/abc")
client.exists("abc/")
client.exists("/abc/")

the URL is normalized to https://theuser.your-storagebox.de/abc, which then yields a HTTPError because of 301 redirect to https://theuser.your-storagebox.de/abc/.

Is there a way to manually set an URL with trailing slashes or to enable trailing slashes as default? I'm happy to help if needed.

It looks like client.exists(path) calls client.propfind(path) with default arguments:

self.propfind(path)

And client.propfind(path) has argument follow_redirects with default value False:

follow_redirects: bool = False,

Maybe True should be passed in client.exists?

lighttpd webdav module also needs trailing slashes for directories. For instance client.ls("my/directory/") does not work because it requests <base_url>/my/directory. But ls cannot know in advance if the target is a file or a directory. I think the best thing it can do is keeping the trailing slash if there is one in the argument. Maybe an option like keep_trailing_slash=True would be nice.

Maybe this client should always follow redirects to handle both servers which require trailing slash and which require to trim it.

Hello, I got the same issue.

At the moment I'm forced to dynamically patch (monkey patch) the wrap_fn to force follow_redirects=True.

I'm considering to propose a pull request based on the following.

I think that a valid possibility would be to remove the follow_redirects parameter in the wrap_fn call in order to rely on client options passed to the constructor in (WebdavFilesystem or Client classes). Maybe such technique would also allow to remove all follow_redirects=True in the client.py file

    def propfind(
        self,
        path: str,
        data: str = None,
        headers: "HeaderTypes" = None,
        # follow_redirects: bool = False,   # Remove this line, but API break
    ) -> "MultiStatusResponse":
        """Returns properties of the specific resource by propfind request."""
        call = wrap_fn(
            self._request,
            HTTPMethod.PROPFIND,
            path,
            content=data,
            headers=headers,
            # follow_redirects=follow_redirects
        )
        http_resp = self.with_retry(call)
        return parse_multistatus_response(http_resp)

If we want to not break the propfind API we can also add the follow_redirects in the wrap_fn function call only if it is true.

    def propfind(
        self,
        path: str,
        data: str = None,
        headers: "HeaderTypes" = None,
        follow_redirects: bool = False,
    ) -> "MultiStatusResponse":
        """Returns properties of the specific resource by propfind request."""
        client_opts = {"follow_redirects": True} if follow_redirects else {}
        call = wrap_fn(
            self._request,
            HTTPMethod.PROPFIND,
            path,
            content=data,
            headers=headers,
            **client_opts
        )
        http_resp = self.with_retry(call)
        return parse_multistatus_response(http_resp)

With such change, the WebdavFilesystem (or Client) should be instantiated with the follow_redirects=True in the client_opts like in the following example:

from webdav4.fsspec import WebdavFileSystem
fs2 = WebdavFileSystem(
    "https://my-webdav-repo.com",
    auth=(user, pass),
    follow_redirects=True
)

By doing so, we should also add an option to the CLI in order to add a --follow-redirects flag in order to instantiate properly the Webdav4 Client.

I'm new with webdav4 so maybe I missed something.

Hi, sorry for the late response. follow_redirects is set to False given httpx default. This is done for security reasons of course, and I wanted to keep it.

The issue here is that the standard WebDAV servers expect the PROPFIND call for collections/dirs to be made to URL with a trailing slash, i.e. /dir/ since it's a collection.
And for non-collection resources, the URI is without the slash, i.e. /file. But for APIs like exists(), we cannot be sure whether or not we should add a trailing slash as we don't know if it's a collection or not. A lot of servers these days do use redirection or return identical responses on both. So that saves us from checking both responses.

Path handling definitely needs to improve (#3) in webdav4. I'm okay with setting follow_redirects=True by default for the timebeing.

In my opinion, a clean solution would be to remove the parameter follow_redirects: bool = False in the propfind method as I presented previously. By doing so, the user is responsible for enabling or not the redirection when the WebDavFileSystem is instantiated.

The problem still exists though, if follow_redirects=False, as URI for a collection requires a trailing slash. At the moment, we just strip trailing-slashes. That could be fixed, but we don't always know if the resource is a collection or not (eg: in exists).

The temporary fix here is to get rid of follow_redirects=False from propfind and enable it in client by default.

That could be fixed, but we don't always know if the resource is a collection or not (eg: in exists).

exists returns True for both file and directory, the resource type does not matter. Someone should use isdir/isfile instead