openspending/spendb

Split out metadata taxonomies

Opened this issue · 6 comments

pudo commented

At the moment, spendb contains descriptors for metadata, such as a list of all countries, currencies, types of budgets, etc. This should be moved to its own package so that it can be re-used across spendb and OpenSpending Next.

Refs #8 and "How can we structure the datasets?"

Proposed names:

  • fiscalmodel
  • spendmodel

(cc @pwalsh)

Excellent. We are planning to have much of this stuff in data package format (https://github.com/openspending/next/issues/10) and/or to build an API around such data. Are you thinking in terms of exposing this as a Python lib, à la https://pypi.python.org/pypi/pycountry and https://github.com/tryggvib/economics?

Also, we want to have a structure specifically for declaring governing bodies (as per the note here), but I wouldn't want to force data packages/spend data to be government-specific. Not sure how best to go about that right now.

pudo commented

Yep, I think a Python package would be perfect. The economics package has been really messy to test in OpenSpending, so my preference would be to bundle the data in the package and have a Makefile that pulls it in from where it originally lives (example).
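To make the idea concrete, here is a rough sketch in Python rather than a Makefile. It is not the layout of any existing package: the function names `refresh_taxonomy` and `load_taxonomy`, the `code,label` CSV layout, and the sample currency rows are all assumptions for illustration. The refresh step copies a taxonomy from its upstream source into the package's data directory, and lookups then read only the bundled copy, so tests never need network access.

```python
import csv
import tempfile
from pathlib import Path
from typing import Callable, Dict

# Hypothetical sample standing in for an upstream currency list.
# The "code,label" column layout is an assumption for this sketch.
SAMPLE_CURRENCIES_CSV = "code,label\nEUR,Euro\nUSD,US Dollar\n"


def refresh_taxonomy(name: str, fetch: Callable[[], str], data_dir: Path) -> Path:
    """Pull a taxonomy from where it lives originally and store it in data_dir.

    `fetch` is any callable returning the CSV text; in a real workflow it
    might download a file, here it is injected so the step stays testable.
    """
    data_dir.mkdir(parents=True, exist_ok=True)
    target = data_dir / f"{name}.csv"
    target.write_text(fetch(), encoding="utf-8")
    return target


def load_taxonomy(name: str, data_dir: Path) -> Dict[str, str]:
    """Read a bundled taxonomy file and return a code -> label mapping."""
    with (data_dir / f"{name}.csv").open(newline="", encoding="utf-8") as handle:
        return {row["code"]: row["label"] for row in csv.DictReader(handle)}


def demo() -> Dict[str, str]:
    """Refresh a taxonomy into a temporary data directory, then load it."""
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp) / "data"
        refresh_taxonomy("currencies", lambda: SAMPLE_CURRENCIES_CSV, data_dir)
        return load_taxonomy("currencies", data_dir)


if __name__ == "__main__":
    print(demo())
```

Keeping the fetch step separate from the lookup code is the point of the sketch: the data files get committed alongside the package, and only the refresh step ever touches the original sources.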

As for government bodies, I wonder if there's something that can be re-used from the IATI Organisation Identifier side of the world. Perhaps @markbrough can tell us how useful that has been in the past.

How about https://github.com/opencivicdata/ocd-division-ids?

I was looking into it for use on previous projects.

BTW, that Makefile idea is good, I could use that in other scenarios too.

@pudo, re IATI org IDs for public bodies: these are currently mostly limited to organisations on an OECD DAC code list. IATI also maintains its own code list of organisations, but entries on it are mostly just created on demand so that a new publisher has some way to refer to itself if it is not on the OECD DAC list. Private sector / NGOs are handled differently.

I think a decent way forward would probably be to use the administrative classifications countries use in their own budgets (which at least contain a code for each ministry, sometimes departments within that ministry, and sometimes parastatals). Need to flesh that idea out a bit though...

pudo commented

Ok, I'm starting a repo at https://github.com/pudo/fiscalmodel - will probably move over the contents of spendb/reference/ to get started, then see if there can be a more coherent workflow around data generation.

@pwalsh thanks for the link to opencivicdata - that looks very useful indeed! The Makefile approach comes from d3's Mike Bostock.

@markbrough Good idea, would love to hear more :) I can also imagine mining GADM or NaturalEarth for a rough cut list of second- and third-level administrative units, which might just be good enough for our ends?