/arsenal

Collection of various python utilities, recipes, and "gutted" components of other projects.

Primary LanguagePython

The arsenal is an assortment of python utilities for math, natural language processing / information extraction, systems programming, scripting/hacks, software development.

This project is a bit large and slightly messy. Part of this is because I dislike installing software on all of the machine I used (yes, even with tools as awesome as pip, easy_install, and virtualenv). One of my favorite passtimes is finding the "utils.py" of open-source projects and borrowing ideas. If you find any code missing proper attribution please let me know!

There are a lot of files here. I'd like to highlight a few of my most useful / favorite things:

Highlights

  • misc.py

    A "dumping ground" for odd and ends. I normally drop things in here and later move them elsewhere if the turn out to be useful.

  • automain.py

    Automatically constructs a "main function" for any module which calls the automain function.

  • robust.py

    utilities such as timelimit and retry to help "robustify" your code.

  • iterextras.py

    the most useful things here are iterview and sliding_window

  • fsutils.py

    utilities for working with the file system like atomic file writes and recursively listing directories (like UNIX find)

  • nlp/

    Contains many text-processing utilities useful in natural language processing. This directory has gotten big enough to almost become its own project. I've got some pretty good regular expression for scraping dates.

  • debug/

    I'm a big fan of debug.utils.ip and debug.ultraTB2.enable!

    • breakin.py ripped out bzr's infamous breakin feature. enabling this allows the user to send a SIGQUIT or SIGBREAK signal to a running process and get an interactive shell or pdb session AND even resume the process!

    • ultraTB2.py I ripped out ultraTB from IPython (to remove the dependence) and make some of the functionality easier to use.

      It's as simple as:

      >>> from debug import ultraTB2; ultraTB2.enable()
  • cache/

    contains the memoize decorator and even a persistence shelve-based variant.