Databus Client Python

Quickstart Example

Commands to download the DBpedia Knowledge Graphs generated by Live Fusion. DBpedia Live Fusion publishes two different kinds of KGs:

  1. Open Core Knowledge Graphs under the CC-BY-SA license: open with copyleft/share-alike, no registration needed.
  2. Industry Knowledge Graphs under the BUSL 1.1 license: unrestricted for research and experimentation, commercial license required for production use, free registration needed.

Registration (Access Token)

  1. If you do not have a DBpedia account yet (Forum/Databus), register at https://account.dbpedia.org
  2. Log in at https://account.dbpedia.org and create your token.
  3. Save the token to a file named vault-token.dat.

Docker vs. Python

The databus-python-client is available as a Docker image or as a Python package; both are invoked with the patterns below. $DOWNLOADTARGET can be any Databus URI (including collections) or a SPARQL query, and several targets can be given at once. Details are documented below.

# Docker
docker run --rm -v $(pwd):/data dbpedia/databus-python-client download $DOWNLOADTARGET --token vault-token.dat
# Python
python3 -m pip install databusclient
databusclient download $DOWNLOADTARGET --token vault-token.dat
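
When downloads are started from a script, the two patterns above can be assembled as argument lists. The following is a minimal sketch: the helper names `python_command` and `docker_command` are hypothetical and not part of databusclient, and the code only builds the command without running it.

```python
import os
from typing import List, Optional


def python_command(target: str, token: Optional[str] = None) -> List[str]:
    """Build the `databusclient download` call for a pip installation."""
    cmd = ["databusclient", "download", target]
    if token:
        cmd += ["--token", token]
    return cmd


def docker_command(target: str, token: Optional[str] = None,
                   workdir: Optional[str] = None) -> List[str]:
    """Build the equivalent `docker run` call, mounting workdir at /data."""
    workdir = workdir or os.getcwd()  # plays the role of $(pwd) in the shell pattern
    prefix = ["docker", "run", "--rm", "-v", f"{workdir}:/data",
              "dbpedia/databus-python-client"]
    # The Docker pattern passes `download ...` directly after the image name,
    # so the leading "databusclient" is dropped.
    return prefix + python_command(target, token)[1:]
```

Either list can then be handed to `subprocess.run(cmd, check=True)`; passing a list instead of a shell string avoids quoting problems when the target is a SPARQL query.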

Download Live Fusion KG Snapshot (BUSL 1.1, registration needed)

TODO One slogan sentence. More information

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-kg-snapshot --token vault-token.dat

Download Enriched Knowledge Graphs (BUSL 1.1, registration needed)

DBpedia Wikipedia Extraction Enriched. TODO One slogan sentence and link. Currently EN DBpedia only.

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/dbpedia-wikipedia-kg-enriched-snapshot --token vault-token.dat

DBpedia Wikidata Extraction Enriched. TODO One slogan sentence and link.

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/dbpedia-wikidata-kg-enriched-snapshot --token vault-token.dat

Download DBpedia Wikipedia Knowledge Graphs (CC-BY-SA, no registration needed)

TODO One slogan sentence and link

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia/dbpedia-wikipedia-kg-snapshot 

Download DBpedia Wikidata Knowledge Graphs (CC-BY-SA, no registration needed)

TODO One slogan sentence and link

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia/dbpedia-wikidata-kg-snapshot 

Docker Image Usage

A docker image is available at dbpedia/databus-python-client. See download section for details.

CLI Usage

Installation

python3 -m pip install databusclient

Running

databusclient --help
Usage: databusclient [OPTIONS] COMMAND [ARGS]...

Options:
  --install-completion [bash|zsh|fish|powershell|pwsh]
                                  Install completion for the specified shell.
  --show-completion [bash|zsh|fish|powershell|pwsh]
                                  Show completion for the specified shell, to
                                  copy it or customize the installation.
  --help                          Show this message and exit.

Commands:
  deploy
  download

Download command

databusclient download --help
Usage: databusclient download [OPTIONS] DATABUSURIS...

Arguments:
  DATABUSURIS...  databus uris to download from https://databus.dbpedia.org,
                  or a query statement that returns databus uris from https://databus.dbpedia.org/sparql
                  to be downloaded [required]

  Download datasets from databus, optionally using vault access if vault
  options are provided.

Options:
  --localdir TEXT  Local databus folder (if not given, databus folder
                   structure is created in current working directory)
  --databus TEXT   Databus URL (if not given, inferred from databusuri, e.g.
                   https://databus.dbpedia.org/sparql)
  --token TEXT     Path to Vault refresh token file
  --authurl TEXT   Keycloak token endpoint URL  [default:
                   https://auth.dbpedia.org/realms/dbpedia/protocol/openid-
                   connect/token]
  --clientid TEXT  Client ID for token exchange  [default: vault-token-
                   exchange]
  --help           Show this message and exit.

Examples of using download command

File: download of a single file

databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01/mappingbased-literals_lang=az.ttl.bz2

Version: download of all files of a specific version

databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01

Artifact: download of all files of the latest version of an artifact

databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals

Group: download of all files of the latest version of every artifact in a group

databusclient download https://databus.dbpedia.org/dbpedia/mappings

If no --localdir is provided, the current working directory is used as the base directory. Downloaded files are stored there in a folder structure that mirrors the Databus structure, i.e. ./$ACCOUNT/$GROUP/$ARTIFACT/$VERSION/.
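
To find downloaded files afterwards, the folder layout described above can be derived from a version or file URI. This is a minimal sketch with a hypothetical helper name; it only illustrates the ./$ACCOUNT/$GROUP/$ARTIFACT/$VERSION/ layout and is not the client's own code. Artifact and group URIs are rejected because their version is only resolved at download time.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def expected_version_dir(databus_uri: str) -> PurePosixPath:
    """Return ACCOUNT/GROUP/ARTIFACT/VERSION, relative to the working directory."""
    segments = [s for s in urlparse(databus_uri).path.split("/") if s]
    if len(segments) < 4:
        raise ValueError("URI must contain account/group/artifact/version")
    account, group, artifact, version = segments[:4]
    return PurePosixPath(account) / group / artifact / version
```

For the single-file example above this yields dbpedia/mappings/mappingbased-literals/2022.12.01.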

Collection: download of all files within a collection

databusclient download https://databus.dbpedia.org/dbpedia/collections/dbpedia-snapshot-2022-12

Query: download of all files returned by a query (the SPARQL endpoint must be provided with --databus)

databusclient download 'PREFIX dcat: <http://www.w3.org/ns/dcat#> SELECT ?x WHERE { ?sub dcat:downloadURL ?x . } LIMIT 10' --databus https://databus.dbpedia.org/sparql
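
The kinds of download target shown in the examples above differ only in the depth of the URI path, or in not being a URI at all. The sketch below turns that observation into a small classifier. It is a heuristic derived from these examples, with a hypothetical function name, and the client's own dispatch logic may differ.

```python
from urllib.parse import urlparse


def target_kind(target: str) -> str:
    """Guess the kind of download target from the shape of the string."""
    if not target.lower().startswith(("http://", "https://")):
        return "query"  # assumption: anything that is not a URI is SPARQL text
    segments = [s for s in urlparse(target).path.split("/") if s]
    if len(segments) == 3 and segments[1] == "collections":
        return "collection"  # /$ACCOUNT/collections/$NAME
    kinds = {2: "group", 3: "artifact", 4: "version", 5: "file"}
    return kinds.get(len(segments), "unknown")
```

Such a check can be useful before a batch run, for example to warn when a group URI would pull in more data than intended.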

Deploy command

databusclient deploy --help
Usage: databusclient deploy [OPTIONS] DISTRIBUTIONS...

Arguments:
  DISTRIBUTIONS...  distributions in the form of List[URL|CV|fileext|compression|sha256sum:contentlength] where URL is the
                    download URL and CV the key=value pairs (_ separted)
                    content variants of a distribution, fileExt and Compression can be set, if not they are inferred from the path  [required]

Options:
  --version-id TEXT   Target databus version/dataset identifier of the form <h
                      ttps://databus.dbpedia.org/$ACCOUNT/$GROUP/$ARTIFACT/$VE
                      RSION>  [required]
  --title TEXT        Dataset title  [required]
  --abstract TEXT     Dataset abstract max 200 chars  [required]
  --description TEXT  Dataset description  [required]
  --license TEXT      License (see dalicc.net)  [required]
  --apikey TEXT       API key  [required]
  --help              Show this message and exit.
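
Two of the constraints in the help text above can be checked locally before calling deploy: the shape of --version-id and the 200-character limit of --abstract. This is a minimal sketch with a hypothetical helper name; the Databus itself may apply further checks that are not covered here.

```python
from typing import List
from urllib.parse import urlparse


def check_deploy_args(version_id: str, abstract: str) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    parsed = urlparse(version_id)
    segments = [s for s in parsed.path.split("/") if s]
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("version-id must be an absolute http(s) URI")
    if len(segments) != 4:
        problems.append("version-id must end in $ACCOUNT/$GROUP/$ARTIFACT/$VERSION")
    if len(abstract) > 200:
        problems.append("abstract is longer than 200 characters")
    return problems
```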

Examples of using deploy command

databusclient deploy --version-id https://databus.dbpedia.org/user1/group1/artifact1/2022-05-18 --title title1 --abstract abstract1 --description description1 --license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 --apikey MYSTERIOUS 'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'  
databusclient deploy --version-id https://dev.databus.dbpedia.org/denis/group1/artifact1/2022-05-18 --title "Client Testing" --abstract "Testing the client...." --description "Testing the client...." --license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 --apikey MYSTERIOUS 'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'  

A few more notes for CLI usage:

  • The content variants can be left out ONLY IF there is just one distribution
    • To have everything inferred, pass just the URL, e.g. https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml
    • If you set other parameters but no content variants, leave the content-variant field empty, e.g. https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml||yml|7a751b6dd5eb8d73d97793c3c564c71ab7b565fa4ba619e4a8fd05a6f80ff653:367116
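
When distribution arguments are generated by a script, the string format can be assembled from its parts. The sketch below reproduces the examples above: content variants joined with `_`, an empty content-variant field when only other parameters are set, and the bare URL when nothing is set. The function name is hypothetical, and how the CLI interprets combinations not shown in these examples is not covered.

```python
from typing import Dict, Optional, Tuple


def distribution_arg(url: str,
                     cvs: Optional[Dict[str, str]] = None,
                     file_ext: Optional[str] = None,
                     compression: Optional[str] = None,
                     sha256_length: Optional[Tuple[str, int]] = None) -> str:
    """Build a DISTRIBUTIONS argument for `databusclient deploy`."""
    if compression and not file_ext:
        # In this positional layout a lone compression would read as a file extension.
        raise ValueError("compression requires a file extension")
    cv_field = "_".join(f"{key}={value}" for key, value in (cvs or {}).items())
    extras = [part for part in (file_ext, compression) if part]
    if sha256_length:
        extras.append(f"{sha256_length[0]}:{sha256_length[1]}")
    if not cv_field and not extras:
        return url  # nothing set: everything is inferred from the URL
    return "|".join([url, cv_field] + extras)
```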

Authentication with vault

To download files from the vault, you need to provide a vault token. See the Registration (Access Token) section above for details. You can come back here once you have a vault-token.dat file. To use it, provide the path to the file with --token /path/to/vault-token.dat.

Example:

databusclient download https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-snapshots/fusion/2025-08-23 --token vault-token.dat

If vault authentication is required for downloading a file, the client will use the token. If no vault authentication is required, the token will not be used.
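
In scripted downloads it can help to verify the token file before starting a long run. The following sketch is a hypothetical pre-flight helper, not part of databusclient; it only checks that the file exists and is not empty and says nothing about whether the token is still valid.

```python
from pathlib import Path
from typing import List


def token_option(token_path: str) -> List[str]:
    """Return ["--token", path] after basic checks on the token file."""
    path = Path(token_path)
    if not path.is_file():
        raise FileNotFoundError(f"token file not found: {path}")
    if path.stat().st_size == 0:
        raise ValueError(f"token file is empty: {path}")
    return ["--token", str(path)]
```

The returned list can be appended to the argument list of a `databusclient download` call that is started with `subprocess.run`.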

Usage of docker image

A docker image is available at dbpedia/databus-python-client. You can use it like this:

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01

If using vault authentication, make sure the token file is available in the container, e.g. by placing it in the current working directory.

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-snapshots/fusion/2025-08-23/fusion_props=all_subjectns=commons-wikimedia-org_vocab=all.ttl.gz --token vault-token.dat

Module Usage

Step 1: Create lists of distributions for the dataset

from databusclient import create_distribution

# create a list
distributions = []

# minimal requirements
# compression and filetype will be inferred from the path
# this will trigger the download of the file to evaluate the shasum and content length
distributions.append(
    create_distribution(url="https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml", cvs={"type": "swagger"})
)

# full parameters
# will just place parameters correctly, nothing will be downloaded or inferred
distributions.append(
    create_distribution(
        url="https://example.org/some/random/file.csv.bz2", 
        cvs={"type": "example", "realfile": "false"}, 
        file_format="csv", 
        compression="bz2", 
        sha256_length_tuple=("7a751b6dd5eb8d73d97793c3c564c71ab7b565fa4ba619e4a8fd05a6f80ff653", 367116)
    )
)

A few notes:

  • The dict for content variants can be empty ONLY IF there is just one distribution
  • There can be no compression if there is no file format
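
If the file is already available locally, the value for `sha256_length_tuple` can be computed up front. This is a minimal sketch using only the standard library. The helper name is hypothetical, and it assumes the local file is byte-identical to what is served at the distribution URL.

```python
import hashlib
from typing import Tuple


def sha256_and_length(path: str, chunk_size: int = 1 << 20) -> Tuple[str, int]:
    """Return (hex SHA-256 digest, size in bytes) of a local file."""
    digest = hashlib.sha256()
    length = 0
    with open(path, "rb") as handle:
        # Read in chunks so large dumps do not have to fit into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            length += len(chunk)
    return digest.hexdigest(), length
```

The result can be passed directly, e.g. `create_distribution(url=..., cvs=..., file_format="csv", compression="bz2", sha256_length_tuple=sha256_and_length("file.csv.bz2"))`.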

Step 2: Create dataset

from databusclient import create_dataset

# minimal way
dataset = create_dataset(
  version_id="https://dev.databus.dbpedia.org/denis/group1/artifact1/2022-05-18",
  title="Client Testing",
  abstract="Testing the client....",
  description="Testing the client....",
  license_url="http://dalicc.net/licenselibrary/AdaptivePublicLicense10",
  distributions=distributions,
)

# with group metadata
dataset = create_dataset(
  version_id="https://dev.databus.dbpedia.org/denis/group1/artifact1/2022-05-18",
  title="Client Testing",
  abstract="Testing the client....",
  description="Testing the client....",
  license_url="http://dalicc.net/licenselibrary/AdaptivePublicLicense10",
  distributions=distributions,
  group_title="Title of group1",
  group_abstract="Abstract of group1",
  group_description="Description of group1"
)

NOTE: Group metadata is only used if all three group parameters are set; otherwise it is ignored.
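
Because incomplete group metadata is dropped, a script may prefer to fail loudly instead. The sketch below is a hypothetical guard, not part of databusclient. It returns keyword arguments only when all three values are present and raises when just some of them are.

```python
from typing import Dict, Optional


def group_metadata(title: Optional[str] = None,
                   abstract: Optional[str] = None,
                   description: Optional[str] = None) -> Dict[str, str]:
    """Return the group_* keyword arguments, all three or none at all."""
    values = {
        "group_title": title,
        "group_abstract": abstract,
        "group_description": description,
    }
    given = [key for key, value in values.items() if value]
    if not given:
        return {}  # no group metadata requested
    if len(given) < len(values):
        missing = sorted(set(values) - set(given))
        raise ValueError("incomplete group metadata, missing: " + ", ".join(missing))
    return values
```

It can be combined with the call above as `create_dataset(..., distributions=distributions, **group_metadata(...))`.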

Step 3: Deploy to databus

from databusclient import deploy

# to deploy, you just need the dataset from the previous step and an API key
# API key can be found (or generated) at https://$$DATABUS_BASE$$/$$USER$$#settings
deploy(dataset, "mysterious api key")