Ranking motifs based on how statistically underrepresented they are

Class project for BME205.

K-mer probabilities are modeled with a Markovian(2) model. The program reads FASTA files from stdin and writes its results to stdout.
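
As a rough illustration of the idea, the sketch below estimates the expected count of a k-mer under a second-order Markov chain fit to the input sequence, and computes the reverse complement used to pair each k-mer with its partner. It assumes "Markovian(2)" means an order-2 chain; the function names (`count_kmers`, `expected_count`, `reverse_complement`) are hypothetical and this is not necessarily how the project's program is written.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_kmers(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def expected_count(word, seq, order=2):
    """Expected occurrences of word under an order-`order` Markov chain.

    Assumption: the chain is estimated from the same sequence, using
    counts of `order`-mers (contexts) and (`order`+1)-mers (extensions).
    """
    contexts = count_kmers(seq, order)
    extensions = count_kmers(seq, order + 1)
    total_contexts = sum(contexts.values())
    if total_contexts == 0:
        return 0.0

    # Probability of the first `order` bases of the word.
    prob = contexts[word[:order]] / total_contexts

    # Multiply by P(next base | previous `order` bases) along the word.
    for i in range(order, len(word)):
        context = word[i - order:i]
        denom = sum(extensions[context + base] for base in "ACGT")
        if denom == 0:
            return 0.0
        prob *= extensions[context + word[i]] / denom

    # Scale by the number of positions where the word could start.
    positions = len(seq) - len(word) + 1
    return prob * positions
```

A word and its reverse complement can then be reported together, for example as `word + ":" + reverse_complement(word)`, which matches the pairing seen in the example output.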

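Example output (tab-separated; each row pairs a K-mer with its reverse complement, followed by the observed count, the expected count, and the Z-score):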

sequence: reverse     count      Expect Zscore
AAAATTTA:TAAATTTT	835	1737.66	-21.66
GATTAATA:TATTAATC	550	1326.89	-21.33
AATTAATA:TATTAATT	929	1839.72	-21.24
AATTAATC:GATTAATT	977	1861.79	-20.51
ATAATTAA:TTAATTAT	1033	1926.49	-20.36
GAAATTTA:TAAATTTC	378	1031.07	-20.34
CCGATCGC:GCGATCGG	1323	2284.74	-20.12
GCGATCGA:TCGATCGC	1221	2121.78	-19.56
ACGATCGC:GCGATCGT	1188	2035.31	-18.78
GTTTAAAA:TTTTAAAC	519	1151.17	-18.63
AAATATTA:TAATATTT	738	1444.79	-18.60
AAAATTTG:CAAATTTT	857	1591.95	-18.42
CATTAATC:GATTAATG	479	1037.49	-17.34
ATATATAA:TTATATAT	271	736.19	-17.15
ATTTAAAG:CTTTAAAT	419	941.16	-17.02
AAATATTG:CAATATTT	687	1287.74	-16.74
CAAATTTA:TAAATTTG	670	1265.24	-16.74
CTTTAAAC:GTTTAAAG	468	995.13	-16.71
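
The Zscore column can be approximately reproduced from the count and Expect columns. The sketch below assumes the standard deviation is close to the square root of the expected count (a Poisson-like approximation); the program's exact variance formula is not stated here, so small differences in the last digit are to be expected. The function name `z_score` is hypothetical.

```python
import math


def z_score(count, expected):
    """Z-score of an observed count against its expected count.

    Assumption: the standard deviation is approximated by sqrt(expected).
    """
    return (count - expected) / math.sqrt(expected)


# Rows taken from the example output above: (count, Expect, listed Zscore).
rows = [
    (835, 1737.66, -21.66),
    (550, 1326.89, -21.33),
    (929, 1839.72, -21.24),
]

for count, expected, listed in rows:
    print(f"{count}\t{expected}\t{z_score(count, expected):.2f}\t(listed {listed})")
```

Strongly negative values mean the K-mer occurs far less often than the model predicts, which is why the most underrepresented motifs appear first.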