/trumpest

We know datas. We have the best datas.

Primary LanguageJavaScriptMIT LicenseMIT

Trumpest

The goals of this project is to provide technology to accelerate investigations into Trump corruption. The world needs more tools such as those mentioned in 4 approaches to building collaborative data infrastructures for journalism.

The core novelty of this project is to interconnect data sources on the web, rather than collect them together. The main benefit is the interconnection of public data. A corollary of this approach is that private data can be interconnected with public data.

Action items include:

  • collect best practices for data authors for harmonizing multiple repositories of data related to Trump
  • use CSVW to make the harmonization explicit to software
  • write software which can be used by data authors and researchers to make them more productive

The core of the idea is to use CSVW to create web structures which allows separately developed data repositories (“databases”) to be interlinked over the Web. This essentially creates a web-tech-based distributed “database” of Trump related data.

For example, if two repositories of data have CSVs with columns containing individuals names, then CSVW can be used to map between those two repos so that “joins” can be performed to find individuals on which there is data within both repositories.

The term “database” as used herein explicitly does not mean RDBMS/SQL tech. This is only about how data is expressed on the web, which may very well have a RDBMS behind it - or be based in a Google Doc spreadsheet - but such details are irrelevant.