sensidice: a module to aid in creating sensible passphrases using dice. Copyright 2009 Amir Yalon See below for license information. The basic usage of this module is hopefully demonstrated well enough by the doctests (small examples in the code). What it does, roughly, is create digital numeric representations from Python sequences (e.g. lists). One trivial example would be creating a hexadecimal representation: --8<---- import sensidice d = sensidice.Digital([0,1,2,3,4,5,6,7,8,9,'A','B','C','D','E','F']) d(31) -->8---- This will simply yield [1, 'F'], which is similar to 1F, the usual hexadecimal representation of the number 31. The module is called "sensidice" because it is supposed to aid in creating a "sensible passphrase using dice". While one established method <http://diceware.com/> yields just a series of short words chosen out of a list only less than 8k words long, using sensidice with a much larger part-of-speech database and the same dice throws can yield a sensible sentence. IMPORTANT: This text does not qualify as advice on security. YOUR PASSPHRASES ARE YOUR RESPONSIBILITY. Granted, any constructive feedback is great; thanks! To demonstrate, we will use the Part of Speech Database from Kevin's Word List Page <http://wordlist.sourceforge.net/>. We assume that pos/part-of-speech.txt is in our working directory. --8<---- import re import sensidice POS_FILE='pos/part-of-speech.txt' pattern = re.compile(r'^(?P<word>.*)\t(?P<flags>.*)$') posdata = [ { 'pos': 'article', 'flags': set('DI'), 'items': [] }, { 'pos': 'adverb', 'flags': set('v'), 'items': [] }, { 'pos': 'adjective', 'flags': set('A'), 'items': [] }, { 'pos': 'verb', 'flags': set('Vti'), 'items': [] }, { 'pos': 'noun', 'flags': set('NphP'), 'items': [] }, { 'pos': 'other', 'flags': set(), 'items': [] }, ] with open(POS_FILE) as f: for w in [pattern.match(line).groupdict() for line in f]: (word, flags) = (w['word'], w['flags']) for pos in posdata: if pos['flags'] and pos['flags'].intersection(flags): pos['items'].append(word) break digitals = {} for pos in posdata: key = pos['pos'] digitals[key] = sensidice.Digital(pos['items']) "\n".join(['%(n)s %(digital)ss' % {'n': digitals[key].base(), 'digital': key} for key in digitals.keys()]) -->8---- Running this in an interactive Python session, we get a summary of how many words we collected for each part of speech. If we continue with the same session, we can start playing with the items in the digitals dictionary: --8<---- verbal_phrase = digitals['adverb'] * digitals['verb'] + \ digitals['verb'] * digitals['adverb'] import math math.log(verbal_phrase.base(), 2) math.log(verbal_phrase.base(), 6**5) import random verbal_phrase(random.randrange(verbal_phrase.base())) -->8---- As shown above, using only verbal phrases, almost 30 bits of entropy (equivalent to more than 2 throws of 5 dice) can be squeezed into just two words. Actually, some of the entries in the Part of Speech database consist of more than one word, but you get the general idea. Adding nouns: --8<---- noun_phrase = digitals['adjective'] * digitals['noun'] phrase = verbal_phrase * noun_phrase + noun_phrase * verbal_phrase math.log(phrase.base(), 2) math.log(phrase.base(), 6**5) phrase(random.randrange(phrase.base())) -->8---- We show that four words are enough for the entropy of almost 5 throws of 5 dice, and we are still far from exhausting the mudule's possibilities. That is all for now; I hope you find it useful. You are invited to follow the project on GitHub <http://github.com/amiryal/sensidice>. Yours, Amir Yalon <sensidice@amir.eml.cc> -- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.