AirbyteConnectorMissingCatalogError while trying to read source-azure-blob-storage
Closed this issue · 2 comments
IronJayx commented
Hi,
I am getting a catalog error while trying to connect to and read from Azure Blob Storage.
Any idea why?
Any help would be much appreciated, thanks!
Here is my code:
import os

import airbyte as ab
from dotenv import load_dotenv

load_dotenv()

# Configure and read from the source
read_result = ab.get_source(
    "source-azure-blob-storage",
    config={
        "authentification": ["airbytehq/pyAirbyte"],
        "credentials": {
            "auth_type": "storage_account_key",
            "azure_blob_storage_account_key": os.environ.get('AZURE_STORAGE_ACCOUNT_KEY')
        },
        "azure_blob_storage_account_name": os.environ.get('AZURE_STORAGE_ACCOUNT_NAME'),
        "azure_blob_storage_container_name": os.environ.get('AZURE_STORAGE_CONTAINER_NAME')
    },
).read()
print(read_result)
Here is the error:
Traceback (most recent call last):
  File "/root/airvector/examples/airbyte_test.py", line 20, in <module>
    ).read()
  File "/root/.cache/pypoetry/virtualenvs/airvector-JnIzwmfs-py3.10/lib/python3.10/site-packages/airbyte/sources/base.py", line 708, in read
    available_streams=self.get_available_streams(),
  File "/root/.cache/pypoetry/virtualenvs/airvector-JnIzwmfs-py3.10/lib/python3.10/site-packages/airbyte/sources/base.py", line 222, in get_available_streams
    return [s.name for s in self.discovered_catalog.streams]
  File "/root/.cache/pypoetry/virtualenvs/airvector-JnIzwmfs-py3.10/lib/python3.10/site-packages/airbyte/sources/base.py", line 320, in discovered_catalog
    self._discovered_catalog = self._discover()
  File "/root/.cache/pypoetry/virtualenvs/airvector-JnIzwmfs-py3.10/lib/python3.10/site-packages/airbyte/sources/base.py", line 186, in _discover
    raise exc.AirbyteConnectorMissingCatalogError(
airbyte.exceptions.AirbyteConnectorMissingCatalogError: AirbyteConnectorMissingCatalogError: Connector did not return a catalog.
Log output:
Error starting the sync. This could be due to an invalid configuration or catalog. Please contact Support for assistance.
aaronsteers commented
@IronJayx - At first I thought this was due to the connector being written in Java (it's not). Looking more closely, I think the connector is not able to discover any streams. I believe you also need to specify a streams collection in the config.
https://docs.airbyte.com/integrations/sources/azure-blob-storage#reference
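A minimal sketch of a pre-flight check along these lines, assuming the config is a plain dict like the one in the original post. The helper name find_stream_problems is hypothetical and not part of PyAirbyte, and the per-stream keys it looks for (name, format) are assumptions for illustration; the reference linked above is the authority on which fields the connector actually requires.

```python
from typing import Any


def find_stream_problems(config: dict[str, Any]) -> list[str]:
    """Return human-readable problems with the config's 'streams' entry.

    Hypothetical helper, not part of PyAirbyte. An empty list means no
    problems were found by this (deliberately shallow) check.
    """
    streams = config.get("streams")
    if not isinstance(streams, list) or not streams:
        return ["config has no non-empty 'streams' list"]

    problems = []
    for index, stream in enumerate(streams):
        if not isinstance(stream, dict):
            problems.append(f"streams[{index}] is not a mapping")
            continue
        # Assumed per-stream keys; confirm against the connector reference.
        for key in ("name", "format"):
            if key not in stream:
                problems.append(f"streams[{index}] is missing '{key}'")
    return problems


# A config shaped like the one in the original post has no 'streams' key.
config_without_streams = {"azure_blob_storage_container_name": "example-container"}
print(find_stream_problems(config_without_streams))
```

Running a check like this before calling ab.get_source(...) would surface the missing streams entry locally instead of as a connector-side catalog error.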
IronJayx commented
Thanks @aaronsteers, that was correct! The .check() works now.
Now I want to disable the parser so I can process unstructured files (images/videos), but I guess that is for another issue.
import os

import airbyte as ab
from dotenv import load_dotenv

load_dotenv()

# Configure and read from the source
source = ab.get_source(
    "source-azure-blob-storage",
    install_if_missing=True,
    config={
        # "authentification": ["airbytehq/pyAirbyte"],
        "credentials": {
            "auth_type": "storage_account_key",
            "azure_blob_storage_account_key": os.environ.get('AZURE_STORAGE_ACCOUNT_KEY')
        },
        "azure_blob_storage_account_name": os.environ.get('AZURE_STORAGE_ACCOUNT_NAME'),
        "azure_blob_storage_container_name": os.environ.get('AZURE_STORAGE_CONTAINER_NAME'),
        "streams": [{
            "name": "all",
            "format": {
                "filetype": "unstructured",
                "skip_unprocessable_files": False,
            },
            "globs": ["**"]
        }]
    },
)
source.check()
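For anyone landing here later, a hedged sketch of going from a passing check to an actual read. The helper check_then_read is hypothetical; it only calls check(), get_available_streams(), select_all_streams() and read() on the source object. Whether streams must be selected explicitly before read() depends on the PyAirbyte version, so treat that step as an assumption.

```python
def check_then_read(source):
    """Validate the config, confirm streams were discovered, then read them.

    Hypothetical helper: `source` is expected to be the object returned by
    ab.get_source(...), but any object with the same four methods works.
    """
    source.check()

    stream_names = source.get_available_streams()
    if not stream_names:
        # Fail early with a hint instead of reading nothing.
        raise RuntimeError(
            "connector discovered no streams; check the 'streams' config"
        )

    # Assumption: explicit selection may be required before read(),
    # depending on the PyAirbyte version.
    source.select_all_streams()
    return source.read()


# Usage, with `source` built as in the snippet above:
# read_result = check_then_read(source)
```

Keeping the discovery step separate from read() makes it easier to tell a configuration problem (no streams discovered) apart from a failure during the sync itself.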