Pinned Repositories
brozzler
brozzler - distributed browser-based web crawler
galgeek.github.io
sqlite-lsm1-lz4
WIP adding lz4 compression to sqlite3-lsm1 based on https://github.com/thoughtpolice/sqlite4_lsm_lz4
umbra
A queue-controlled browser automation tool for improving web crawl quality
warcprox
WARC writing MITM HTTP/S proxy
rulesengine
Model and front end for rules that manage Wayback playback
rulesengine-client
Python client package for the playback rules engine
firefox-ui-tests
INACTIVE - http://mzl.la/ghe-archive - PLEASE NO LONGER USE THIS REPOSITORY. CODE HAS BEEN MOVED TO: hg.mozilla.org
gecko-dev
Read-only Git mirror of the Mercurial gecko repositories at https://hg.mozilla.org. How to contribute: https://firefox-source-docs.mozilla.org/contributing/contribution_quickref.html
galgeek's Repositories
galgeek/brozzler
brozzler - distributed browser-based web crawler
galgeek/galgeek.github.io
galgeek/sqlite-lsm1-lz4
WIP adding lz4 compression to sqlite3-lsm1 based on https://github.com/thoughtpolice/sqlite4_lsm_lz4
galgeek/warcprox
WARC writing MITM HTTP/S proxy
galgeek/advent2017
solutions for adventofcode.com 2017 puzzles
galgeek/umbra
A queue-controlled browser automation tool for improving web crawl quality
galgeek/ansible
Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy. Avoid writing scripts or custom code to deploy and update your applications — automate in a language that approaches plain English, using SSH, with no agents to install on remote systems. https://docs.ansible.com/ansible/
galgeek/ansible-role-solr
Ansible Role - Apache Solr
galgeek/CDX-Writer
Python script to create CDX index files of WARC data
galgeek/check_rabbitmq
Nagios Plugin for RabbitMQ
galgeek/civicrm-core
CiviCRM (Core Application and Framework)
galgeek/cpython
The Python programming language
galgeek/doublethink
RethinkDB Python library
galgeek/heritrix3
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
galgeek/iaux-typescript-wc-template
iaux-typescript-wc-template for webcomponents session
galgeek/openlibrary
One webpage for every book ever published!
galgeek/openlibrary-lite
Mobile first "lite" Open Library
galgeek/outbackcdx
Web archive index server based on RocksDB
galgeek/pyppeteer
Headless Chrome/Chromium automation library (unofficial port of Puppeteer)
galgeek/python-for-data-journalists-ccdc
galgeek/rocksdb
A library that provides an embeddable, persistent key-value store for fast storage.
galgeek/rulesengine
A rules engine for managing playback.
galgeek/rulesengine-client
Python client package for the playback rules engine
galgeek/sqlite-lsm1-lz4-data
galgeek/trough
Trough: Big data, small databases.
galgeek/wayback
IA's public Wayback Machine (moved from SourceForge)
galgeek/waybackprov
utility to fetch provenance information from Internet Archive's Wayback Machine
galgeek/wombat
Wombat.js client-side rewriting library
galgeek/youtube-dl
Command-line program to download videos from YouTube.com and other video sites
galgeek/yt-dlp
A youtube-dl fork with additional features and fixes