gabledata/recap

Replace CatalogPath with URL

Closed this issue · 5 comments

I've been thinking about replacing CatalogPath with a URL string. CatalogPaths are:

  1. Too verbose
  2. Not really intuitive to users

Everyone knows what a URL is.

This would be a pretty significant change that would affect the catalog, browser, server, and analyzer APIs.

    def analyze(self, url: str) -> BaseMetadataModel | None: ...

    def children(self, url: str) -> list[str] | None: ...

    def read(
        self,
        url: str,
        time: datetime | None = None,
    ) -> dict[str, Any] | None: ...
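To illustrate what an implementation might do with the `url` argument, here is a minimal sketch that splits a URL into scheme, host, and path segments using the standard library. The helper name `split_catalog_url` is hypothetical and not part of recap; how recap actually interprets each segment is not settled in this issue.

```python
from urllib.parse import urlsplit


def split_catalog_url(url: str) -> tuple[str, str, list[str]]:
    """Hypothetical helper: break a catalog URL into (scheme, netloc, path segments)."""
    parts = urlsplit(url)
    # Drop empty segments produced by leading or trailing slashes.
    segments = [segment for segment in parts.path.split("/") if segment]
    return parts.scheme, parts.netloc, segments


scheme, netloc, segments = split_catalog_url("postgresql://localhost/some_db/some_table")
```

Here `scheme` would select the kind of system (for example `postgresql` or `gs`), `netloc` the host or bucket, and `segments` the path within it.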

For the server, I would probably change the paths to look like this:

    /catalog/directory?url=postgresql://localhost/some_db
    /catalog/metadata?url=postgresql://localhost/some_db/some_table
    /catalog/metadata/schema?url=gs://some-bucket/some/file.json
    /catalog/metadata/indexes?url=postgresql://localhost@localhost/some_db/some_table
    /catalog/search?query=foo
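One thing to keep in mind with this layout is that the catalog URL is nested inside another URL's query string, so clients would likely want to percent-encode it. A rough sketch of building and decoding such a request path with the standard library follows; the helper name `catalog_endpoint` is made up for illustration, and the server may well accept the unencoded form shown above too.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def catalog_endpoint(path: str, url: str) -> str:
    """Hypothetical helper: append a percent-encoded `url` query parameter to a server path."""
    return f"{path}?{urlencode({'url': url})}"


original = "postgresql://localhost/some_db/some_table"
request = catalog_endpoint("/catalog/metadata", original)

# The server side can recover the original URL from the query string.
decoded = parse_qs(urlsplit(request).query)["url"][0]
```

Encoding avoids ambiguity if the catalog URL ever contains its own `?` or `&` characters.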

@nahumsa @nehiljain What do you think?

The change would affect the CLI as well.

    recap catalog read /databases/postgresql/instances/localhost/schemas/public/tables/requests

would become:

    recap catalog read postgresql://localhost:1234/some_db/some_table

I completely agree that CatalogPath is too verbose, and I think switching to a URL is the best way to go, since it's far more intuitive. I think this will be a major improvement, even though it means a big change to many parts of recap.

K, I'm working on this now. The PR is going to be a monster; it fundamentally changes the APIs. That said, I think it's for the best.

Awesome, I will be happy to review the PR when it's ready.

Late to the party here, but I agree that URLs are more intuitive than CatalogPath.