climate-strike/license

Feature request: Disambiguation of names?

a3nm opened this issue · 2 comments

a3nm commented

Hi,

I'm a bit puzzled to see companies listed only by name. Wouldn't that leave some ambiguity about which company is meant? Shouldn't the list be augmented with precise identifiers, e.g., tax registration identifiers in some countries, or a Wikidata identifier?

This is a great point. We are still sorting out how best to identify companies. For now we are using each company's listed name on major stock exchanges, though of course this has shortcomings for companies not listed on those exchanges. In your experience, how good is the Wikidata coverage? I feel that something like a tax registration identifier would be preferable, since it aligns with public records, but the variation across countries might make it too complicated.

a3nm commented

Hi, you can expect Wikidata to include all companies that are notable enough to have a Wikipedia page. Entities could be created for the other companies, and information such as tax registration identifiers could be added to the Wikidata entities. To avoid doing the mapping by hand, it may be possible to align your list using something like OpenRefine (this is about refining data, not oil ;)) and its Wikidata plugin. There is detailed documentation and a video tutorial here.

The reason I think this disambiguation point matters is that it would make it possible to use your list in other applications that deal with open data about companies with a harmful effect on climate.
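For anyone who would rather script the alignment than use OpenRefine, here is a minimal sketch of the same idea in Python. It is only an illustration, not how this repository does anything: it assumes the public Wikidata `wbsearchentities` search API, and the function names, the `User-Agent` string, and the three status labels are made up for the example. A name with exactly one search hit is treated as matched, several hits as ambiguous (to be resolved by hand), and no hits as unmatched.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(name, limit=5):
    """Build a wbsearchentities query URL for a company name."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": "en",
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urllib.parse.urlencode(params)


def search_wikidata(name):
    """Query Wikidata over the network and return the decoded JSON response."""
    request = urllib.request.Request(
        build_search_url(name),
        # Hypothetical User-Agent; replace with one identifying your project.
        headers={"User-Agent": "company-disambiguation-sketch/0.1"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def candidates_from_response(payload):
    """Extract (id, label, description) tuples from a search response."""
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in payload.get("search", [])
    ]


def classify(name, candidates):
    """Label a name as matched, ambiguous, or unmatched (labels are illustrative)."""
    if not candidates:
        status = "unmatched"
    elif len(candidates) == 1:
        status = "matched"
    else:
        status = "ambiguous"
    return {"name": name, "status": status, "candidates": candidates}
```

A call such as `classify(name, candidates_from_response(search_wikidata(name)))` for each name in the list would then show which entries can be given a Wikidata identifier automatically and which need a human decision. Search hits are matched on labels only, so even a single hit should be checked against its description before being trusted.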