A microservice with API to store, manage and search through meta data of data sets.
The latest documentation can always be found at https://amsterdam.github.io/dcatd/.
The API spec can be found at /openapi on a running instance with default settings. The API of the current running instance at Amsterdam can be browsed using this Swagger UI.
Requires Python 3.6.1 or above
Default configuration uses a PostgreSQL database which can be spun up in a container: Requires Docker and a free port 5433 (this is deliberately another port than PG's default one, to preempt a collision)
This means you can also change the port to 5432 (and other connection parameters) in /examples/running/config.yml and use a locally running instance of PostgreSQL
Create a virtual environment and install all the dependencies:
make alldeps
docker-compose up -d database
make example
See: http://localhost:8000/openapi and http://localhost:8000/datasets
You could even go one further and also spin up the Amsterdam Swagger UI:
docker-compose up -d database swaggerui
make example
Apart from the urls mentioned above you also can access http://localhost:8686/swagger-ui/?url=http://localhost:8000/openapi
docker-compose up -d database
make test
or
docker-compose up -d database
make cov
docker-compose up -d
That's it.
See: http://localhost:8001/openapi , http://localhost:8001/datasets and http://localhost:8686/swagger-ui/?url=http://localhost:8001/openapi
(Example server in docker is accessable through port 8001, while the locally running example runs on port 8000)
You can import CKAN data into the DCAT-API to bootstrap your install with data
PUT-ting and DELETE-ing data via the API require authorisation.
In the context of Amsterdam Data en Informatie can obtiain a JWT from the swagger-ui (http://localhost:8686/swagger-ui/?url=http://localhost:8000/openapi); export it to JWT: (For more information see: https://hub.docker.com/r/amsterdam/oauth2swaggerui/ )
export JWT='<JWT>'
Define your local API, and the source CKAN (point to the root of the API of CKAN):
export DCATD='http://localhost:8000/' # or :8001 , see above
export CKAN='https://demo.ckan.org/api' # or for instance https://api.data.amsterdam.nl/catalogus/api
Then use the scripts in the utils directory to import data
Remark: this is highly localized for the Amsterdam CKAN instance and will fail beyond the first step when using CKAN demo data.
Currently this also will fail on the resources2distributions step, but you will end up with at least a somewhat filled database
cd utils
python dumpckan.py "${CKAN}"
python ckan2dcat.py "${DCATD}"
python resources2distributions.py "${DCATD}files" "${JWT}"
for d in dcatdata/*.json; do
b=`basename "${d}" '.json'`
echo -n "${b}..."
STATUS=$(
curl --header "Authorization: Bearer ${JWT}" \
--header "If-None-Match: *" --upload-file "${d}" \
--silent --output /dev/stderr --write-out "%{http_code}" \
"${DCATD}datasets/${b}"
)
[ "$STATUS" -eq 201 ] && echo "OK" && rm "${d}" || echo "FAILED: $STATUS"
done
If you need to load acceptation data in development you can import this with :
docker-compose exec database update-db.sh dcatd
Requires Sphinx plus extras:
pip install -e .[docs]
Run the following command to push latest version to github:
make -C sphinx gh-pages
python get_invalid_links.py --make_unavailable=yes
With the get_invalid_links.py script is is possible to check if the URLs uses in dataset resource are valid links or whether they do not exist (anymore).
With the parameter --make_unavailable=yes datasets that contain resources with invalid links are set to 'Niet beschikbaar'