opensdmx/rsdmx

Worldbank API SDMX format version issue

brunocrinon opened this issue · 7 comments

Could you please help with the following issue?
I get "Error in .subset(x, j) : invalid subscript type 'list'" when I run the following code:

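library(rsdmx)

# Download the SDMX-ML document, then parse the local copy
URL <- "http://api.worldbank.org/v2/data/WDI/US.TT_PRI_MRCH_XD_WD?startPeriod=2000&endPeriod=2010"
destfile <- "C:/Temp/My_data.xml"
download.file(URL, destfile)
my_data <- readSDMX(file = destfile, isURL = FALSE)
df <- as.data.frame(my_data)  # fails here: invalid subscript type 'list'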

Thanks in advance for your help!

I've tested and checked carefully: this is not a bug in rsdmx, but rather a validity issue in the SDMX document itself.

The document http://api.worldbank.org/v2/data/WDI/US.TT_PRI_MRCH_XD_WD?startPeriod=2000&endPeriod=2010 specifies that it uses the SDMX 2.1 format (namespace = http://www.SDMX.org/resources/SDMXML/schemas/v2_1/generic), but the XML content actually uses the SDMX 2.0 format, e.g.:

  • tag generic:Value with attribute concept (from v 2.0) vs. attribute id (from v 2.1)
  • tag generic:Time (from v 2.0) vs. generic:ObsDimension (from v 2.1)

rsdmx inherits the schema version from the namespace. Here the namespace says v2_1, so rsdmx applies v2.1 schema parsing rather than the v2.0 parsing the content requires, and therefore fails.
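The mismatch can be checked by hand before parsing. The following is a minimal sketch in base R, not rsdmx's own detection logic: the two helper names are hypothetical, and the check assumes the document uses the "generic:" prefix shown in the examples above.

```r
# Hypothetical helpers (not part of rsdmx): compare the version declared in
# the namespace with the markup style actually used in the document.

# Extract the version declared in the first SDMX-ML schema namespace found.
declared_sdmx_version <- function(xml_text) {
  m <- regmatches(xml_text, regexpr("SDMXML/schemas/v[0-9]+_[0-9]+", xml_text))
  if (length(m) == 0) return(NA_character_)
  # "SDMXML/schemas/v2_1" -> "2_1" -> "2.1"
  sub("_", ".", sub(".*/v", "", m[1]), fixed = TRUE)
}

# TRUE if the document contains the v2.0-style markers listed above:
# a generic:Time element, or a generic:Value tag with a "concept" attribute.
uses_v2_0_markup <- function(xml_text) {
  grepl("<generic:Time>", xml_text, fixed = TRUE) ||
    grepl("<generic:Value[^>]*[[:space:]]concept=", xml_text)
}

# Usage on a downloaded file (path taken from the report above):
# xml_text <- paste(readLines("C:/Temp/My_data.xml", warn = FALSE), collapse = "\n")
# declared_sdmx_version(xml_text)
# uses_v2_0_markup(xml_text)
```

If the declared version is 2.1 while the second check returns TRUE, the document has the inconsistency described here.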

The service maintainer should fix the namespaces specified in their SDMX-ML documents:

  • http://www.SDMX.org/resources/SDMXML/schemas/v2_0/generic instead of http://www.SDMX.org/resources/SDMXML/schemas/v2_1/generic
  • http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message instead of http://www.SDMX.org/resources/SDMXML/schemas/v2_1/message

To check this, and as a temporary workaround, you can edit the XML file you have downloaded and replace v2_1 with v2_0; you will see that it then works.
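The manual edit can also be scripted. This is a minimal sketch under stated assumptions: patch_sdmx_namespace is a hypothetical helper, not an rsdmx function; it assumes the file is UTF-8; and it only rewrites the "/schemas/v2_1/" part of the namespace URIs rather than every occurrence of v2_1.

```r
# Hypothetical helper (not part of rsdmx): rewrite the declared SDMX-ML
# namespace version from v2_1 to v2_0 in a downloaded file.
patch_sdmx_namespace <- function(infile, outfile = infile) {
  xml_lines <- readLines(infile, warn = FALSE, encoding = "UTF-8")
  # Literal (non-regex) replacement, limited to the schema namespace path
  patched <- gsub("/schemas/v2_1/", "/schemas/v2_0/", xml_lines, fixed = TRUE)
  writeLines(patched, outfile, useBytes = TRUE)
  invisible(outfile)
}

# Usage with the file from the report (requires rsdmx):
# library(rsdmx)
# patch_sdmx_namespace("C:/Temp/My_data.xml")
# my_data <- readSDMX(file = "C:/Temp/My_data.xml", isURL = FALSE)
# df <- as.data.frame(my_data)
```

By default the file is overwritten in place; pass a different outfile to keep the original download untouched.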

Let me know

FYI, I've contacted the WorldBank API help desk and sent them the below message:

Subject: SDMX API - SDMX-ML format version / namespace issue

Dear Support team,

I'm the maintainer of the R rsdmx library, which allows users to query SDMX APIs and read SDMX-ML data through R. A user reported an issue (https://github.com/opensdmx/rsdmx/issues/153) when reading SDMX-ML data from your SDMX endpoint.

After investigation, the conclusion was that the SDMX-ML file retrieved from your API specifies SDMX namespaces pointing to the SDMX-ML version 2.1 format, while the XML format actually used is v2.0.
Some of the format discrepancies are highlighted in the ticket mentioned above.
Could you fix the namespace references to point to the v2.0 schemas, or modify the actual format used to be in line with v2.1?

Looking forward to your feedback,

Best regards,
Emmanuel Blondel

I got feedback from the WorldBank SDMX team. Apparently the endpoint used above is not the current one; the current one is http://api.worldbank.org/v2/sdmx/rest. I've added the latter as an embedded service endpoint, see #156.