joshsmith2
Senior data scientist at Which? Always interested in talking about NLP, using neural nets to detect scams, and deliberative democracy.
Which?
London
Pinned Repositories
youtube-access
A tool to access video descriptions, comments, etc. from YouTube's Data API (see https://developers.google.com/youtube/v3)
freqtools
Python tools for manipulation and conversion between pitch and frequency.
latinum-bonds
A quick blockchain demo created for Demos at Latitude
move-by-regex
A laser-guided system for moving files that match regex patterns from within structured directories (Projects servers and the like) to a destination. Useful when archiving all jobs with a given number but differing naming conventions.
playingGod
A project to evolve synthesised sound through natural selection
polis-loadtesting
A series of Locust tests to meaningfully load-test Pol.is instances
polisServer
🔩 nuts and bolts of the system
pytorch-processing
Creation of training, test and validation datasets for model training, using PyTorch
qlik_load
Generate automatic Qlik load scripts from CSV files
text
Data loaders and abstractions for text and NLP
joshsmith2's Repositories
joshsmith2/move-by-regex
A laser-guided system for moving files that match regex patterns from within structured directories (Projects servers and the like) to a destination. Useful when archiving all jobs with a given number but differing naming conventions.
joshsmith2/parallel_rsyncs
Uses GNU Parallel and rsync to move data around, hopefully very quickly (current project, in progress)
joshsmith2/playingGod
A project to evolve synthesised sound through natural selection
joshsmith2/polis-loadtesting
A series of Locust tests to meaningfully load-test Pol.is instances
joshsmith2/qlik_load
Generate automatic Qlik load scripts from CSV files
joshsmith2/sanitise-and-move
Sanitise all files in a directory, removing any characters from filenames which are illegal on Windows as well as problematic characters, then move them to another location, logging everything fully. This was written for Hogarth in 2013 as an archiving solution.
Usage:
-c, --casesensitive: For use on case sensitive filesystems. Default - off.
-d, --dorename: Actually rename the files - otherwise just log and output to standard output.
-h, --help: Print this help and exit.
-l --logstashDir: A directory on the archive box containing a set of files sent by rsyslog to logstash.
-r --renameLogDir: Directory, usually on the destination, for logs of files which have been renamed to be stored.
-o, --oversizelog: Log to write files with overlong path names in - otherwise don't log.
-p, --passdir: Directory to which clean files should be moved.
-q, --quiet: Don't output to standard out.
-t, --target: The location of the hot folder
--temp-log-file: A file to write log information to
joshsmith2/latinum-bonds
A quick blockchain demo created for Demos at Latitude
joshsmith2/polisServer
🔩 nuts and bolts of the system
joshsmith2/pytorch-processing
Creation of training, test and validation datasets for model training, using PyTorch
joshsmith2/text
Data loaders and abstractions for text and NLP
joshsmith2/bing-search-sdk-for-python
Bing Search APIs SDK for Python
joshsmith2/data-loop
Downloads data from a URL, runs it through a pre-built QlikView structure and sends it via email.
joshsmith2/demos-space
A small static page for Demos Space
joshsmith2/docker-django
A dockerised generic Django instance, with MySQL and Apache
joshsmith2/keras-training
Preparatory work for text classification with Keras and TensorFlow, guided by F. Chollet
joshsmith2/neuralnets_questionmark
Personal: Going through a tutorial to work out whether it's worth using neural nets for the next iteration of PlayingGod
joshsmith2/not-drowning
Visualise and play waves and music samples in browser. We hope.
joshsmith2/pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
joshsmith2/pygmalitron
A neural net learning to generate samples from simple waves
joshsmith2/sheridan-site
Site for Sheridan Tongue.
joshsmith2/soup-scrape
Playing around with JSoup to do some web scraping work
joshsmith2/svg-icons
A storage bin for small, simple SVG icons I've created for various projects. Use away.
joshsmith2/swisspy
Ever heard a collection of functions being described as a 'swiss army knife'? Well, it's a useful metaphor, so there.
joshsmith2/template_unittests
A template for creating projects with unittests
joshsmith2/topic-galaxy
Takes a free-text dataset and a word list, and outputs Gephi-mappable data allowing you to see relationships between those words and the people using them.
joshsmith2/twitter-image-cloud
Intention: Create what may or may not look like a cloud of images shared on Twitter, sized by the number of times they've been shared
joshsmith2/whatsapp-parser
A lightweight script to convert WhatsApp .txt exports to .csv. NB: Developed for and tested on WhatsApp's 2018 export format
joshsmith2/wikipick
A lightweight Python implementation of https://github.com/earwig/mwparserfromhell to read from Wikipedia
joshsmith2/words-in-tweets
Counts words in Tweets. That's... that's it.
joshsmith2/youtube-enquirer