/data

Open Data Sources

MIT License

  • Availability and access: the data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the internet. The data must also be available in a convenient and modifiable form.
  • Reuse and redistribution: the data must be provided under terms that permit reuse and redistribution including the intermixing with other datasets. The data must be machine-readable.
  • Universal participation: everyone must be able to use, reuse and redistribute — there should be no discrimination against fields of endeavour or against persons or groups. For example, ‘non-commercial’ restrictions that would prevent ‘commercial’ use, or restrictions of use for certain purposes (e.g. only in education), are not allowed.

-- Definition by the Open Knowledge Foundation

Lists of Data Sets

Open Data

  • List of Public Datasets - user-curated
  • DBpedia - utilizing a large multi-domain ontology
  • Public Data Sets on AWS - common web crawl corpus, NASA satellite imagery, Human Genome, Google Book NGrams, Wikipedia Traffic, Million Song Dataset, Federal Reserve Economic Data, PubChem, and more

Private Opened Data

  • New York Times - vocabulary as linked open data; linked vocabulary of people, places, companies, etc.

Governmental Data

Compendium of Governmental Open Data Sources

Non-Governmental Org Data

Academic Data

Inter-university Consortium for Political and Social Research Data Portal

Truly Random Data

Open Data Resources

^ The license is not truly open; some limitations apply.