danielhoadley
R&D and data science in the legal sector. Main focus on data related to litigation.
London
Pinned Repositories
Blackstone
⚫ A spaCy pipeline and model for NLP on unstructured legal text.
canliimeta
Collection of scripts used to pull metadata down from CanLII
edward
A library for probabilistic modeling, inference, and criticism. Deep generative models, variational inference. Runs on TensorFlow.
hdp
Hierarchical Dirichlet processes. Topic models where the data determine the number of topics. This implements Gibbs sampling.
juniper
🍇 Edit and execute code snippets in the browser using Jupyter kernels
lda-c
This is a C implementation of variational EM for latent Dirichlet allocation (LDA), a topic model for text or other discrete data.
Scikit-Learn-Text-Classification-Example-
Sets out a very simple example of using Scikit-learn to run supervised classification over your own corpus of text data
spaCy
💫 Industrial-strength Natural Language Processing (NLP) with Python and Cython
straw
A platform for real-time streaming search
Transparency-Scraper
Scrapes the Daily Wail Online for articles containing specific words and phrases
danielhoadley's Repositories
danielhoadley/Scikit-Learn-Text-Classification-Example-
Sets out a very simple example of using Scikit-learn to run supervised classification over your own corpus of text data
danielhoadley/Transparency-Scraper
Scrapes the Daily Wail Online for articles containing specific words and phrases
danielhoadley/spaCy
💫 Industrial-strength Natural Language Processing (NLP) with Python and Cython
danielhoadley/canliimeta
Collection of scripts used to pull metadata down from CanLII
danielhoadley/juniper
🍇 Edit and execute code snippets in the browser using Jupyter kernels
danielhoadley/raptor
The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
danielhoadley/Blackstone
⚫ A spaCy pipeline and model for NLP on unstructured legal text.
danielhoadley/edward
A library for probabilistic modeling, inference, and criticism. Deep generative models, variational inference. Runs on TensorFlow.
danielhoadley/hdp
Hierarchical Dirichlet processes. Topic models where the data determine the number of topics. This implements Gibbs sampling.
danielhoadley/lda-c
This is a C implementation of variational EM for latent Dirichlet allocation (LDA), a topic model for text or other discrete data.
danielhoadley/straw
A platform for real-time streaming search
danielhoadley/case-spy
Express.js application that uses regular expressions to match against legal case references in a hard-coded paragraph
danielhoadley/caselaw-classifier
Uses scikit-learn to predict the main subject matter of judgments
danielhoadley/convertjs
Converts JSON into CSV
danielhoadley/ctr
Collaborative modeling for recommendation. Implements variational inference for collaborative topic models. These models recommend items to users based on item content and other users' ratings.
danielhoadley/ds-caselaw-utils
danielhoadley/FCR
Family Court Report processor
danielhoadley/ftl-sprint-case-clustering
Work on clustering court cases (in XML format), done for Jack Cushman's Free the Law Wintersession Sprint in January 2016.
danielhoadley/ftpConnect
Connect to an FTP site with Python
danielhoadley/Garland
Builds inverted index of sentences across multiple files
danielhoadley/home
danielhoadley/ImportantBits
Highlight popularly cited paragraphs in CanLII cases
danielhoadley/Law-report-judgment-extractor
Parses XML law report files and extracts the case metadata and judgment portions of each file
danielhoadley/Law-report-topic-sorter
Parses XML law report files, identifies the zero-level catchword in the markup, and moves each file to a folder named after that catchword
danielhoadley/minimal-django-file-upload-example
Source code for example at http://stackoverflow.com/questions/5871730/need-a-minimal-django-file-upload-example
danielhoadley/RCJ-Cause-List-Scraper
Scrapes the daily cause list of cases listed in the Royal Courts of Justice published by the Ministry of Justice at https://www.justice.gov.uk/courts/court-lists/list-cause-rcj
danielhoadley/sentenize
A simple Python script that uses NLTK to split documents into sentences
danielhoadley/Smoke
BAILII feed parser
danielhoadley/sumy
Module for automatic summarization of text documents and HTML pages.
danielhoadley/XML-Tag-Remover
Removes XML tags from documents, leaving the text intact