umarbutler
Assistant Director of Data Science @ Attorney-General, Australia
Attorney-General, Australia
Melbourne, Australia
Pinned Repositories
emubert-creator
The training code behind EmuBert, the largest open-source masked language model for Australian law.
open-australian-legal-corpus-creator
The code used to create and update the Open Australian Legal Corpus, the first and only multijurisdictional open corpus of Australian legislative and judicial documents.
open-australian-legal-embeddings-creator
The code used to create and update the Open Australian Legal Embeddings, the first open-source embeddings of Australian legislative and judicial documents.
orjsonl
A lightweight, high-performance Python library for parsing jsonl files.
persist-cache
An easy-to-use Python library for lightning-fast persistent function caching.
semchunk
A fast and lightweight pure Python library for splitting text into semantically meaningful chunks.
terge
An easy-to-use Python library for merging PyTorch models.
umarbutler's Repositories
umarbutler/semchunk
A fast and lightweight pure Python library for splitting text into semantically meaningful chunks.
umarbutler/open-australian-legal-corpus-creator
The code used to create and update the Open Australian Legal Corpus, the first and only multijurisdictional open corpus of Australian legislative and judicial documents.
umarbutler/orjsonl
A lightweight, high-performance Python library for parsing jsonl files.
umarbutler/open-australian-legal-embeddings-creator
The code used to create and update the Open Australian Legal Embeddings, the first open-source embeddings of Australian legislative and judicial documents.
umarbutler/terge
An easy-to-use Python library for merging PyTorch models.
umarbutler/persist-cache
An easy-to-use Python library for lightning-fast persistent function caching.
umarbutler/emubert-creator
The training code behind EmuBert, the largest open-source masked language model for Australian law.