WordPress/openverse-api

Wrong swagger specification for the oembed endpoint

chrisbartoloburlo opened this issue ยท 9 comments

Description

The swagger specification describes the endpoint /images/oembed/ as retrieving "embedded content from a specified image URL" whereas in reality it takes image UUID not the url.

Reproduction

  1. One gets a server error when visiting: http://localhost:50280/v1/images/oembed/?format=json&url=https://cdn.stocksnap.io/img-thumbs/960w/91KVXXZQWQ.jpg
  2. Instead, to get the embedded content, one must visit: http://localhost:50280/v1/images/oembed/?format=json&url=5f9ed3ba-b762-459e-9165-094470dc48053
  3. Note that in step 2 we are also referring to the UUID as url instead of identifier.

Resolution

  • ๐Ÿ™‹ I would be interested in resolving this bug.

Which is the correct description? Should the embedded info be retrieved using the UUID or the URL?

Great question @chrisbartoloburlo! It looks like both a URL and an identifier are allowed, but the URL is assumed to be an Openverse URL without a trailing slash. See:

(both links give the same result)

It does seem like the documentation and perhaps the actual implementation around this should be modified though, since http://localhost:50280/v1/images/oembed/?url=http://localhost:50280/v1/images/2f3f8563-a52c-473e-ae0b-aac8c9d81c05/ does not work even though the URL with the trailing slash is what's provided in the detail_url field of the API response. Perhaps we should first rstrip("/") before we rsplit("/") here:

identifier = url.rsplit("/", 1)[1]

Additionally, we should update the documentation to describe that this could be either a URL or an identifier. We could also use better error handling around parsing the URL parameter in general, this seems related to #927. CC @WordPress/openverse-api

without a trailing slash

This will endlessly frustrate me with how many run-arounds we've had enforcing trailing slashes everywhere, which deviates from the behaviour I see on the other APIs I interact with on a regular basis. There are countless PRs in the API and the frontend to do this, without any real reason as far as I can tell and again, deviating from what so far as I can see is the norm. It'd be nice if we could at least be flexible here, if we're already making changes, and accept it either way.

This endpoint is kind of odd anyway in that it doesn't follow any kind of expected RESTful pattern. I wonder why it's not, for example, in this pattern: /v1/images/:uuid/oembed?

We can't get rid of the endpoint as it exists today and we can and should make it more flexible. But I'd vote overall to just make the existing one work as it is today and then move towards the more RESTful and familiar pattern I suggested above (unless there's a specific reason why it needs to be the way it is now) for the "canonical" version of the endpoint that we'd bring forward into any future API versions that iterate on this feature.

I agree with @sarayourfriend, the pattern /v1/images/:uuid/oembed would be more suitable for such a thing. Perhaps, to avoid compatibility issues, we could add the endpoint to /v1/images/:uuid/oembed and also leave the other one there, before removing it completely.

We would not be able to remove it ever from the /v1 prefix, see the discussion in #483 for details about that. If it at least minimally works the way it is now then we can just leave it be but update the documentation on it to recommend using the new and improved endpoint.

Just wanted to clarify that we couldn't ever remove it without breaking our API version contract which shouldn't be done lightly ๐Ÿ™‚

The oEmbed endpoint was never meant to support usage with just identifiers. It was always supposed to take full URLs. But I'm sure whoever wrote it thought it couldn't hurt to additionally support identifiers if the same approach works.

Nonetheless, the documentation for the endpoint clearly says that it accepts URLs, and the associated examples support this. So only that behaviour can reasonably be guaranteed and everything else (functional or otherwise) should be consider undefined behaviour that may break in backwards compatible way.

@sarayourfriend it is my understanding that the oEmbed spec requires that the oEmbed endpoint take a URL as a query param instead so changing the path to /image/:uuid/oembed would be illegal.

@sarayourfriend it is my understanding that the oEmbed spec requires that the oEmbed endpoint take a URL as a query param instead so changing the path to /image/:uuid/oembed would be illegal.

Ah! Had no idea there was a spec/standard we were following here, thanks for pointing that out ๐Ÿ™‚

@dhruvkb was this closed by #934 or was that only related?

This can be closed with the following note:

The oEmbed endpoint takes the URL

  • for an image on Openverse
  • with the image identifier UUID in it
  • ending with a trailing slash

Eg:

https://api.openverse.engineering/v1/images/oembed/?url=https://api.openverse.engineering/v1/images/0aff3595-8168-440b-83ff-7a80b65dea42/

Since this is implemented flexibly, it will also work with some other inputs but those are not standard as per the oEmbed spec, not officially supported and not guaranteed to work.

  • https://api.openverse.engineering/v1/images/oembed/?url=0aff3595-8168-440b-83ff-7a80b65dea42
  • https://api.openverse.engineering/v1/images/oembed/?url=https://any.path/works/here/0aff3595-8168-440b-83ff-7a80b65dea42/