
Geared towards keeping up with elusive pricing data in the healthcare industry and developing a data format standard

MedHack Hospital Price Spider

Making public hospital pricing data actually machine-readable and uniform, because hospitals use all kinds of tricks to complicate, evade, and mislead on the topic of pricing.

To run this repo in Node.js, see README.md in the ./nodejsModule folder.

Hospitals do not release their pricing data in a standardized format (although in the US they are now at least required by law to release it), so this repo seeks to provide universal conversion functions, as well as raw copies of the hospital pricing spreadsheets, Word docs, and other files that contain medical pricing records. The repo keeps the raw data because published pricing data changes over time: URL locations, content, format, and availability.

This starts with US data; however, we plan to incorporate pricing for all countries as soon as possible. America just happens to have one of the worst systems, so we're starting there.

If we do a good job, it will be easier to hold the medical industry accountable and introduce interesting new tools like swaps for consumers that could VASTLY lower healthcare costs. This is the first step.

The format we are using at MedHack for procedures, medications, and devices looks like this:

	{
		"itemName": "Whole Body MRI Scan",
		"hospitalId": 2,
		"currency": "USD",
		"price": 8229.00,
		"avgPrice": 8229.00,
		"type": "procedure",
		"medianPrice": 9000,
		"outpatientAvgPrice": 9200.00,
		"latestPriceDate": "2019-01-31",
		"firstPriceDate": "2019-01-01",
		"changeSinceLastUpdate": 0.23,
		"description": "...",
		"relatedItemsFromOthers": [10, 15],
		"relatedItemsFromThisLocation": [3, 4],
		"itemsRequiredForThis": [45, 72],
		"keywords": ["mri", "scan", "niobium"]
	}


Required fields for all items are itemName, hospitalId, currency, and price.
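The required-field rule above could be checked with a small helper before a record is accepted. This is only a minimal sketch: `REQUIRED_ITEM_FIELDS` and `missingRequiredFields` are hypothetical names used for illustration, not functions from this repo.

```javascript
// Hypothetical helper, not part of this repo's code.
// The list mirrors the required fields named in this README.
const REQUIRED_ITEM_FIELDS = ["itemName", "hospitalId", "currency", "price"];

// Returns the names of required fields that are missing or empty.
// A price of 0 counts as present; only undefined, null, and "" count as missing.
function missingRequiredFields(item) {
  return REQUIRED_ITEM_FIELDS.filter((field) => {
    const value = item[field];
    return value === undefined || value === null || value === "";
  });
}

// Example record that lacks a currency.
const item = {
  itemName: "Whole Body MRI Scan",
  hospitalId: 2,
  price: 8229.0,
};

console.log(missingRequiredFields(item)); // [ 'currency' ]
```

Returning the list of missing names, rather than a plain true/false, makes it easier to report exactly what a given hospital file failed to provide.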

For hospitals and other health care institutions, an example record:

	{
		"rId": 2,
		"hospitalName": "Massachusetts General Hospital",
		"ownedBy": "HCABC Example Corp",
		"managedBy": "HCABC Example Corp",
		"keyShareholdersAndPeople": [{"name": "John Doe", "title": "CEO"}],
		"grossRevenueFiscal": 93000039300,
		"annualReportDocs": ["url1", "url2"],
		"currentPricingUrl": "https://msgexampleweb.org/somelocation/xyz.xsl",
		"itemColumnName": "Description",
		"avgPriceColumnName": "Avg Price",
		"priceSampleSizeColumnName": "Sample Size",
		"medianPricingColumnName": "Median Price",
		"outPatientPriceColumnName": "Outpatient Pricing",
		"inpatientPriceColumnName": "Inpatient Pricing",
		"longitude": -70.3323,
		"latitude": 45.0003,
		"founded": 1930
	}


The other fields are "nice-to-haves" that will make applications built on top of this much easier and will expose inconsistencies that make the medical industry what it is. The insurance companies have this data; however, they will never release it, so it's up to us. Also, this README should probably be rewritten to sound less angry at the medical industry... which leads us to contributions.
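The column-name fields in the hospital record above (itemColumnName, avgPriceColumnName, and so on) suggest how one parsed spreadsheet row could be converted into the item format. The sketch below is an illustration under stated assumptions, not the converter in ./nodejsModule: `parsePrice` and `rowToItem` are hypothetical names, the row is assumed to be already parsed into an object keyed by column header, hospitalId is assumed to correspond to the hospital's rId, and price is assumed to come from the average price column.

```javascript
// Hypothetical sketch, not this repo's converter.

// Turns a raw cell such as "$8,229.00" into a number, or null if nothing numeric is left.
function parsePrice(raw) {
  if (raw === undefined || raw === null) return null;
  // Drop currency symbols, thousands separators, and whitespace.
  const cleaned = String(raw).replace(/[^0-9.-]/g, "");
  if (cleaned === "") return null;
  const value = Number(cleaned);
  return Number.isFinite(value) ? value : null;
}

// Maps one parsed row to the item format using the hospital's column names.
// Assumptions: hospitalId = hospital.rId, and price = the average price column.
function rowToItem(row, hospital, currency) {
  const avgPrice = parsePrice(row[hospital.avgPriceColumnName]);
  return {
    itemName: String(row[hospital.itemColumnName] ?? "").trim(),
    hospitalId: hospital.rId,
    currency,
    price: avgPrice,
    avgPrice,
    medianPrice: parsePrice(row[hospital.medianPricingColumnName]),
    outpatientAvgPrice: parsePrice(row[hospital.outPatientPriceColumnName]),
  };
}

// Made-up sample data for illustration.
const hospital = {
  rId: 2,
  itemColumnName: "Description",
  avgPriceColumnName: "Avg Price",
  medianPricingColumnName: "Median Price",
  outPatientPriceColumnName: "Outpatient Pricing",
};

const row = {
  Description: " Whole Body MRI Scan ",
  "Avg Price": "$8,229.00",
  "Median Price": "9000",
  "Outpatient Pricing": "9,200.00",
};

console.log(rowToItem(row, hospital, "USD"));
```

Keeping the column names in the hospital record, rather than hard-coding them in the converter, means a hospital that renames a column only needs a data change.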

Folder structure

./browserJS contains testing stuff

./nodejsModule contains Node.js app(s) that convert file formats and output that data via an API endpoint for others to consume; see the README.md in that folder

./proposals contains project proposals, in readme.md

./rawCSVs contains .csv files to process

./rawXlsxs contains .xlsx (spreadsheet) files to process

./SQLs contains .sql files to process

Testing endpoints during development

Refer to README.md in ./nodejsModule or to the wiki documentation.


We welcome pull requests, issues, and other contributions. This README could use a lot of work, as could the converter code, on the way to our shared goal of making the world a better place. Please send pull requests to the develop branch.

