Design pattern algorithms from "Data-Intensive Text Processing with MapReduce" by Jimmy Lin and Chris Dyer (University of Maryland, College Park). This project is an implementation of those algorithms.

WordCountInMapperCombining.java

IMCDP -> In-Memory Combiner Design Pattern
Implementation: global IMCDP

In the global IMCDP approach, instead of using an associative array per key-value input, we use one associative array per mapper. This approach can run into a memory limit: if the associative array grows until memory runs out, the mapper task will crash. To avoid that, this implementation flushes the map regularly. The reducer is the Hadoop library class IntSumReducer, imported into the Java file; no custom reducer is required.

Word Co-occurrence

We focus on the problem of building word co-occurrence matrices from large corpora, a common task in corpus linguistics and statistical natural language processing. It provides the starting point for many other algorithms, e.g., for computing statistics such as pointwise mutual information and for unsupervised sense clustering.

ComputeCooccurrenceMatrixPairs.java

The Pairs Design Pattern

The mapper processes each input document and emits intermediate key-value pairs with each co-occurring word pair as the key and the integer one (i.e., the count) as the value. This is accomplished with two nested loops: the outer loop iterates over all words (the left element in the pair), and the inner loop iterates over all neighbors of the first word (the right element in the pair). The neighbors of a word can be defined either in terms of a sliding window or in terms of some other contextual unit such as a sentence.

ComputeCooccurrenceMatrixStripes.java

The Stripes Design Pattern

Like the pairs approach, co-occurring word pairs are generated by two nested loops.
However, the major difference is that instead of emitting intermediate key-value pairs for each co-occurring word pair, co-occurrence information is first stored in an associative array. The mapper emits key-value pairs with words as keys and corresponding associative arrays as values, where each associative array encodes the co-occurrence counts of the neighbors of a particular word (i.e., its context).
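
To make the contrast between the two patterns concrete, here is a minimal plain-Java sketch that does not use Hadoop and is not the code in this repository. It assumes neighbors are defined by a symmetric sliding window. The class name CooccurrenceSketch, the `window` parameter, and the "left,right" key format are illustrative choices only. For readability, the pairs method sums the emitted ones locally, as a reducer would; a real pairs mapper would emit each (pair, 1) record individually.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch (no Hadoop): simulates the records one mapper call
// would produce for a single tokenized document.
final class CooccurrenceSketch {

    // Pairs: the key is the word pair "left,right", the value is its count.
    static Map<String, Integer> pairs(String[] words, int window) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < words.length; i++) {            // left element
            int lo = Math.max(0, i - window);
            int hi = Math.min(words.length - 1, i + window);
            for (int j = lo; j <= hi; j++) {                 // right element
                if (j == i) continue;                        // skip the word itself
                counts.merge(words[i] + "," + words[j], 1, Integer::sum);
            }
        }
        return counts;
    }

    // Stripes: the key is a word, the value is an associative array
    // mapping each neighbor to its co-occurrence count.
    static Map<String, Map<String, Integer>> stripes(String[] words, int window) {
        Map<String, Map<String, Integer>> result = new TreeMap<>();
        for (int i = 0; i < words.length; i++) {
            Map<String, Integer> stripe =
                    result.computeIfAbsent(words[i], k -> new TreeMap<>());
            int lo = Math.max(0, i - window);
            int hi = Math.min(words.length - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (j == i) continue;
                stripe.merge(words[j], 1, Integer::sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] words = "a b a c".split(" ");
        System.out.println(pairs(words, 1));    // one entry per distinct word pair
        System.out.println(stripes(words, 1));  // one entry per distinct word
    }
}
```

Both methods run the same two nested loops; only the shape of the output differs. Pairs yields many small records, while stripes yields fewer, larger ones, one per distinct word.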