Example of Lightweight MapReduce in Python
- Taken from the mincemeat example
Execute this script on the server:
python example.py
Run mincemeat.py as a worker on a client:
python mincemeat.py -p changeme [server address]
And the server will print out:
{'a': 2, 'on': 1, 'great': 1, 'Humpty': 3, 'again': 1, 'wall': 1, 'Dumpty': 2, 'men': 1, 'had': 1, 'all': 1, 'together': 1, "King's": 2, 'horses': 1, 'All': 1, "Couldn't": 1, 'fall': 1, 'and': 1, 'the': 2, 'put': 1, 'sat': 1}
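The word count above can be reproduced without a server or workers. The following is a minimal sketch, not the shipped example.py: it assumes the input is the four nursery-rhyme lines implied by the output, and `run_local` is a hypothetical helper that simulates the map, group, and reduce phases in one process.

```python
from collections import defaultdict

# Assumed input: the four lines that yield the counts printed above.
data = [
    "Humpty Dumpty sat on a wall",
    "Humpty Dumpty had a great fall",
    "All the King's horses and all the King's men",
    "Couldn't put Humpty together again",
]
datasource = dict(enumerate(data))


def mapfn(key, value):
    """Emit (word, 1) for every whitespace-separated word in one line."""
    for word in value.split():
        yield word, 1


def reducefn(key, values):
    """Add up all the 1s emitted for a single word."""
    return sum(values)


def run_local(datasource, mapfn, reducefn):
    """Hypothetical stand-in for the server: map, group by key, then reduce."""
    grouped = defaultdict(list)
    for key, value in datasource.items():
        for out_key, out_value in mapfn(key, value):
            grouped[out_key].append(out_value)
    return {key: reducefn(key, values) for key, values in grouped.items()}


print(run_local(datasource, mapfn, reducefn))
```

Words are split on whitespace only and case is preserved, which is why 'All' and 'all' are counted separately.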
Math stats - mathstats.py - Given the name of a text file on the command line containing one number per line, print the sum, count, and standard deviation of all the numbers in the file. All statistics should be computed in a single pass through the data.
python mathstats.py small.txt
python mathstats.py medium.txt
python mathstats.py large.txt
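One way to get all three statistics in a single pass is to have each map task reduce its chunk to (count, sum, sum of squares) and let the reducer combine those triples. This is a sketch under assumptions, not the reference solution: it computes the population standard deviation (the task does not say population or sample), and the two in-memory chunks are hypothetical stand-ins for slices of the input file.

```python
import math


def mapfn(key, lines):
    """Summarise one chunk of lines as (count, sum, sum of squares)."""
    count, total, total_sq = 0, 0.0, 0.0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        x = float(line)
        count += 1
        total += x
        total_sq += x * x
    yield "stats", (count, total, total_sq)


def reducefn(key, partials):
    """Combine the per-chunk summaries and derive the final statistics."""
    count = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    total_sq = sum(p[2] for p in partials)
    mean = total / count
    # Population variance: E[x^2] - (E[x])^2, clamped against rounding error.
    variance = max(total_sq / count - mean * mean, 0.0)
    return {"sum": total, "count": count, "stddev": math.sqrt(variance)}


# Two hypothetical chunks standing in for slices of the input file.
chunks = {0: ["2", "4", "4", "4"], 1: ["5", "5", "7", "9"]}
partials = [value for key, lines in chunks.items() for _, value in mapfn(key, lines)]
print(reducefn("stats", partials))
```

The sum-of-squares formula is simple to merge across chunks but can lose precision when the values are large and close together; a mergeable running-variance update would be a more robust alternative.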
Letter Frequency - freq.py - Given the name of a text file on the command line, print the number of times each character is used, along with its percentage of the total to 2 decimal places. Sort the output so the most frequent character is at the bottom.
python freq.py smalldick.txt
python freq.py mobydick.txt
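A possible shape for the character count is sketched below, run locally without mincemeat. It assumes every character is counted as-is (whitespace and punctuation included, no case folding), which the task does not specify. `format_report` and the two sample chunks are hypothetical names introduced for illustration.

```python
from collections import Counter


def mapfn(key, text):
    """Count the characters in one chunk of text."""
    yield "chars", Counter(text)


def reducefn(key, counters):
    """Merge the per-chunk counters into one."""
    total = Counter()
    for counter in counters:
        total.update(counter)
    return total


def format_report(counts):
    """Return one line per character, least frequent first, most frequent last."""
    grand_total = sum(counts.values())
    # Sort by count, then by character so ties come out in a stable order.
    rows = sorted(counts.items(), key=lambda item: (item[1], item[0]))
    return [
        "{!r} {} {:.2f}%".format(char, n, 100.0 * n / grand_total)
        for char, n in rows
    ]


# Two hypothetical chunks standing in for slices of the input document.
chunks = {0: "aab", 1: "ba"}
counters = [value for key, text in chunks.items() for _, value in mapfn(key, text)]
for line in format_report(reducefn("chars", counters)):
    print(line)
```

Characters are printed with repr so that spaces and newlines stay visible in the report.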
Password cracking - passCrack.py - Given a hash prefix in hex form on the command line, find the passwords that hash to it. Passwords are sometimes stored in hashed form so that, if the database is breached, they are not easily usable. For this assignment, only the first 5 characters of the hash are checked. Assume passwords are 4 or fewer characters long and contain only lowercase letters and digits. Use MapReduce to search all combinations quickly for a match. Print the input hash string and the valid passwords that hash to it, if any. Use hashlib's md5 hexdigest() and compare its first 5 characters.
python passCrack.py d077f
# this might take a good 15 minutes
# expected answer {'found': ['cat', 'gkf9']}
python passCrack.py 0832c
# expected answer {'found': ['cats', 'zb8e']}
I would spin up multiple servers when running this command.
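The search can be split into independent map tasks by first character, so each worker covers 1/36 of the candidate space. The sketch below runs those tasks in a single process rather than through mincemeat. `crack`, the `(target, max_len)` job tuple, and the partitioning scheme are assumptions made for illustration, not the reference solution.

```python
import hashlib
import itertools
import string

ALPHABET = string.ascii_lowercase + string.digits  # the 36 allowed characters


def mapfn(first_char, job):
    """Check every candidate of length 1..max_len that starts with first_char."""
    target, max_len = job
    for tail_len in range(max_len):
        for tail in itertools.product(ALPHABET, repeat=tail_len):
            candidate = first_char + "".join(tail)
            digest = hashlib.md5(candidate.encode("ascii")).hexdigest()
            if digest[:len(target)] == target:
                yield "found", candidate


def reducefn(key, candidates):
    """Collect the matches from all map tasks into one sorted list."""
    return sorted(candidates)


def crack(target, max_len=4):
    """Hypothetical local driver: one map task per first character."""
    found = []
    for first_char in ALPHABET:
        found.extend(value for _, value in mapfn(first_char, (target, max_len)))
    return {"found": reducefn("found", found)} if found else {}


# Small demo: derive a 5-character prefix from a known password, then search
# only up to length 3 so the run stays short.
demo_target = hashlib.md5(b"ab1").hexdigest()[:5]
print(demo_target, crack(demo_target, max_len=3))
```

Because only 5 hex characters are compared, several passwords can match the same prefix, which is why the result is a list.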