
Ever wondered what it would look like if Commonwealth legislation were on GitHub?

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

gitlaw-au

Ever wondered what it would look like if Australian legislation were available in git / GitHub?

gitlaw-au is my 2015 #govhack project.

I didn't quite make it in time for GovHack... oh well!

Browse current acts in Markdown (as of 5 July 2015)

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Status

Text is extracted, but the output still contains some odd formatting and leftover style information, and much of the document structure is missing (no table conversion is attempted).

  • Get a list of all current acts and their ComLawID (acts_current.txt)
  • Get a list of all the RTF/DOC/DOCx versions and volumes of those acts (details_current.json)
  • Download all the relevant RTF/DOC/DOCx files (Amazon S3)
  • Extract structure of documents and convert to Markdown (in progress)
  • Read DOCx format and extract indent and font sizes
  • Convert these to Markdown indents and heading sizes
  • Extract table structures
  • Write to Markdown using historical git commits dated to when the legislation came into force
  • Access the historical series of each act to build its history
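
The "historical git commit" step in the list above could be sketched roughly as follows. This is not code from this repo: `commit_env` and `commit_act` are hypothetical helper names, and midnight UTC is an arbitrary assumption, since only a date is known. It relies on git reading the `GIT_AUTHOR_DATE` and `GIT_COMMITTER_DATE` environment variables when creating a commit.

```python
import os
import subprocess
from datetime import date


def commit_env(in_force: date) -> dict:
    """Return a copy of the environment that backdates the next git commit."""
    # Assumption: only the date is known, so the time is pinned to 00:00 UTC.
    stamp = f"{in_force.isoformat()} 00:00:00 +0000"
    env = dict(os.environ)
    env["GIT_AUTHOR_DATE"] = stamp      # date shown by `git log`
    env["GIT_COMMITTER_DATE"] = stamp   # date used when ordering history
    return env


def commit_act(repo_dir: str, path: str, title: str, in_force: date) -> None:
    """Stage one Markdown file and commit it, dated to when it came into force."""
    subprocess.run(["git", "add", path], cwd=repo_dir, check=True)
    subprocess.run(
        ["git", "commit", "-m", f"{title} (in force {in_force.isoformat()})"],
        cwd=repo_dir,
        check=True,
        env=commit_env(in_force),
    )
```

Committing each version of an act in chronological order this way would let `git log` and `git diff` show how the act changed over time. Very old acts may need extra care, because git's handling of dates before 1970 is limited.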

Files

  • spider.py: Crawl legislation by year and get the ComLawID
  • download.py: Get the legislation detail from the ComLawID
  • convert.py: The actual conversion to Markdown (messy!)
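
To illustrate the idea behind the conversion (font sizes become heading levels, indents become nesting), here is a minimal sketch. It is not the logic in convert.py: the `Para` type, the thresholds, and the indent step are all assumptions for illustration, and it starts from font size and indent values that have already been extracted rather than reading a DOCx file.

```python
from dataclasses import dataclass


@dataclass
class Para:
    text: str
    font_pt: float    # font size in points (assumed already extracted)
    indent_pt: float  # left indent in points (assumed already extracted)


# Hypothetical thresholds: minimum font size -> Markdown heading level.
HEADING_SIZES = [(16.0, 1), (14.0, 2), (12.0, 3)]
# Hypothetical indent step: every 36pt of indent is one nesting level.
INDENT_STEP_PT = 36.0


def heading_level(font_pt: float) -> int:
    """Return 1-3 for heading-sized text, or 0 for body text."""
    for min_pt, level in HEADING_SIZES:
        if font_pt >= min_pt:
            return level
    return 0


def to_markdown(para: Para) -> str:
    """Render one paragraph as a heading, a nested list item, or plain text."""
    text = " ".join(para.text.split())  # collapse runs of whitespace
    level = heading_level(para.font_pt)
    if level:
        return "#" * level + " " + text
    depth = int(para.indent_pt // INDENT_STEP_PT)
    if depth == 0:
        return text
    return "  " * (depth - 1) + "- " + text
```

For example, `to_markdown(Para("Part 1  Preliminary", 16, 0))` gives `# Part 1 Preliminary`, while an 11pt paragraph indented 72pt becomes a second-level list item. Real legislation would need thresholds tuned per document template.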