There are currently 3 sources of truth for connector metadata:
- actual connector source code in the GitHub repo
  - https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors
  - extracted with `git2json.js`, which generates `git2json.json`, which we paste over to this project
- the manually maintained definitions files
- for cloud connectors, the cloud catalog
Our job is to unify them and serve them quickly from a simple RESTful API that is easy to edit, for consumption from docs, UI, and whatever else.
- For most people, this is a read-only endeavor, which means we can make use of simple infrastructure.
- For connector maintainers, we need a simple and fast way to update connector metadata for consumption, despite not having one source of truth.
This means we need a data pipeline from:
1. 3 sources of truth (external)
2. manually maintained extra info (maintained here)
3. combine 1 and 2 into a `results.json` and serve it from this app/site
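
Since the pipeline is not built yet, here is only a rough sketch of what the combine step might look like. The function name `buildResults`, the record fields, and the rule that later inputs override earlier ones are all assumptions for illustration, not a decided design.

```javascript
// Hypothetical sketch of the combine step. Field names and merge
// precedence are assumptions; the real pipeline is not implemented yet.
// Every input is an array of records keyed by connector `name`.
function buildResults(externalSources, manualInfo) {
  const byName = new Map();
  // Later inputs override earlier ones field by field (assumed precedence),
  // so the manually maintained info wins over the external sources.
  for (const records of [...externalSources, manualInfo]) {
    for (const record of records) {
      const existing = byName.get(record.name) || {};
      byName.set(record.name, { ...existing, ...record });
    }
  }
  // Sort by name so diffs of results.json stay stable between runs.
  return [...byName.values()].sort((a, b) => a.name.localeCompare(b.name));
}

// Made-up records standing in for the three external sources and the extra info.
const fromRepo = [{ name: "source-example", dockerImageTag: "0.1.0" }];
const fromDefinitions = [{ name: "source-example", releaseStage: "alpha" }];
const fromCloudCatalog = [];
const manual = [{ name: "source-example", description: "Example connector" }];

const results = buildResults([fromRepo, fromDefinitions, fromCloudCatalog], manual);
// This string is what would be written out as results.json.
const resultsJson = JSON.stringify(results, null, 2);
```

Keeping the merge as a pure function over plain arrays means it can be tested without touching the file system; reading the inputs and writing `results.json` would be a thin wrapper around it.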
We have NOT implemented this pipeline yet - we are hoping to agree on the desired output with this MVP and get an owner for this before working backwards on the pipeline.
Endpoints we want to offer:
- GET `/connectors`
  - `/connectors` - the smallest payload of connectors, minus internal connectors
    - listing `['name', 'dateAdded', 'displayName', 'description', 'websiteUrl', 'documentationUrl', 'iconUrl', 'releaseStage', 'connectorType']`
  - `/connectors?internal` - small payload of all connectors including internal connectors
  - `/connectors?full` - just dumping the full `results.json`, which includes extra stuff like docker name, version, sourceDefinitionId, etc
- GET `/connector/{connector_name}`
  - return full info for a single connector
  - 404 if unrecognized name
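
To make the intended behavior of these endpoints concrete, here is a minimal framework-free sketch of the routing logic. The names `handleGet` and `toListing`, the boolean `internal` flag used to mark internal connectors, and the shape of the error bodies are assumptions for illustration; only the paths, query flags, listing fields, and the 404 rule come from the list above.

```javascript
// Fields in the smallest payload, as listed for /connectors.
const LISTING_FIELDS = ['name', 'dateAdded', 'displayName', 'description',
  'websiteUrl', 'documentationUrl', 'iconUrl', 'releaseStage', 'connectorType'];

// Keep only the listing fields of a full connector record.
function toListing(connector) {
  const out = {};
  for (const field of LISTING_FIELDS) {
    if (field in connector) out[field] = connector[field];
  }
  return out;
}

// Route a GET request. `results` is the parsed results.json array.
// ASSUMPTION: internal connectors carry a boolean `internal` flag.
function handleGet(results, pathname, searchParams) {
  if (pathname === "/connectors") {
    // ?full dumps results.json as-is.
    if (searchParams.has("full")) return { status: 200, body: results };
    // ?internal keeps internal connectors; the default hides them.
    const visible = searchParams.has("internal")
      ? results
      : results.filter((c) => !c.internal);
    return { status: 200, body: visible.map(toListing) };
  }
  const match = pathname.match(/^\/connector\/([^/]+)$/);
  if (match) {
    const name = decodeURIComponent(match[1]);
    const connector = results.find((c) => c.name === name);
    return connector
      ? { status: 200, body: connector }
      : { status: 404, body: { error: `unknown connector: ${name}` } };
  }
  return { status: 404, body: { error: "not found" } };
}

// Made-up sample data and one example request.
const sample = [
  { name: "source-a", displayName: "A", dockerRepository: "example/source-a" },
  { name: "source-b", displayName: "B", internal: true },
];
const url = new URL("/connectors?internal", "http://localhost");
const response = handleGet(sample, url.pathname, url.searchParams);
```

Because the handler only takes a path and a `URLSearchParams` object, it could be wired into whichever server or serverless runtime ends up hosting the API.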
In the future we may want to offer:
- search
- update (for now we just update by editing the `results.json` file, possibly with a Google spreadsheet in the loop)