
Liquor data (bar codes, descriptions, volume, pricing) from public records, with scripts to parse and clean it up.


Public Liquor Data

To build tools that work with bar codes on liquor bottles, I've tried to collect and parse publicly available data. These data dumps were either received through public records requests or published by the liquor control bodies of various US states.

All the scripts are licensed under the MIT license, and the data is made available under the public record laws of the relevant state.

Data sources

Oregon: these are records received from OLCC.PublicRecords@oregon.gov.

The "GTIN List With Price" is in a slightly awkward Excel format, but it can be parsed with xlrd and some logic; see oregon_gtin_list_to_json.py.
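As a rough illustration of the kind of logic involved, the sketch below reads the first sheet with xlrd and keeps only the rows that carry a usable GTIN. It is not the code in oregon_gtin_list_to_json.py: the column positions, the helper names (`normalize_gtin`, `rows_to_records`, `load_rows`) and the sample row are all assumptions made for the example. xlrd returns numeric cells as floats, which is why the GTIN is converted back to a digit string.

```python
import json


def normalize_gtin(cell):
    """Turn a spreadsheet cell into a digit-only GTIN string, or None."""
    # xlrd hands back numeric cells as floats, e.g. 12345678905.0.
    if isinstance(cell, float):
        if not cell.is_integer():
            return None
        cell = int(cell)
    digits = str(cell).strip()
    return digits if digits.isdigit() else None


def rows_to_records(rows, gtin_col=0, desc_col=1, price_col=2):
    """Keep rows with a usable GTIN. The column positions are assumptions."""
    records = []
    for row in rows:
        if len(row) <= max(gtin_col, desc_col, price_col):
            continue  # row too short to hold all three fields
        gtin = normalize_gtin(row[gtin_col])
        if gtin is None:
            continue  # header, blank, or section-title row
        records.append({
            "gtin": gtin,
            "description": str(row[desc_col]).strip(),
            "price": row[price_col],
        })
    return records


def load_rows(path):
    """Read every row of the first sheet as a list of cell values."""
    import xlrd  # third-party; imported here so the helpers above work without it

    sheet = xlrd.open_workbook(path).sheet_by_index(0)
    return [sheet.row_values(i) for i in range(sheet.nrows)]


if __name__ == "__main__":
    # Made-up rows standing in for load_rows("some_file.xls").
    sample = [["GTIN", "Description", "Price"],
              [12345678905.0, " Example Vodka 750ml ", 19.95]]
    print(json.dumps(rows_to_records(sample), indent=2))
```

Note that xlrd 2.0 and later only read the legacy .xls format, so a .xlsx file would need a different reader.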

Utah: these are public records published at the following URLs:

The "Product List" Excel spreadsheet is excellent and easily parseable (with pandas in this case, see utah_product_list_to_csv.py), but does not contain any bar code information.

The "Numeric Price List", on the other hand, maps Utah's "CSC" product codes to bar codes, but is only available as a PDF. The PDF is somewhat painful to parse, but camelot helps a lot.
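A minimal sketch of how such a PDF might be turned into a CSC-to-bar-code mapping follows. It is an illustration under assumptions, not the approach used in this repository: the column positions and the helper names (`read_pdf_rows`, `csc_to_barcode`, `gtin_check_digit_ok`) are hypothetical. Since PDF table extraction tends to produce header rows and mangled cells, the sketch keeps only rows whose bar code passes the standard GS1 check-digit test.

```python
def gtin_check_digit_ok(code):
    """Validate the GS1 check digit of a GTIN-8/12/13/14 digit string."""
    if not code.isdigit() or len(code) not in (8, 12, 13, 14):
        return False
    digits = [int(c) for c in code]
    # Weights alternate 3, 1, 3, ... starting next to the check digit.
    total = sum(d * (3 if i % 2 == 0 else 1)
                for i, d in enumerate(reversed(digits[:-1])))
    return (10 - total % 10) % 10 == digits[-1]


def csc_to_barcode(rows, csc_col=0, barcode_col=1):
    """Build {csc: barcode} from table rows. Column positions are assumptions."""
    mapping = {}
    for row in rows:
        if len(row) <= max(csc_col, barcode_col):
            continue  # row too short to hold both fields
        csc = str(row[csc_col]).strip()
        barcode = str(row[barcode_col]).replace(" ", "").strip()
        # Header rows and garbled cells fail the check-digit test.
        if csc and gtin_check_digit_ok(barcode):
            mapping[csc] = barcode
    return mapping


def read_pdf_rows(pdf_path):
    """Collect the rows of every table camelot finds in the PDF."""
    import camelot  # third-party; imported here so the helpers above work without it

    rows = []
    for table in camelot.read_pdf(pdf_path, pages="all"):
        rows.extend(table.df.values.tolist())  # table.df is a pandas DataFrame
    return rows
```

With a PDF on disk, `csc_to_barcode(read_pdf_rows(path))` would give the mapping. Rows dropped by the check-digit filter are worth logging in practice, since they usually point at a table that was split or merged incorrectly.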

Washington: this is a dump received from publicrecords@lcb.wa.gov, containing the latest data from before the state stopped running its own liquor stores. It is unclear whether this will be useful, so it has not been parsed yet.