Hadoop ecosystem recipes for common data transformations & iterative algorithms
- Filter
- Sort (by key; the framework does this during the shuffle-and-sort stage)
- Aggregate (count, sum, average, etc.)
- Remap / Rename / Re-order
- Intersect (join)
- Group By (done in the reducer, after the shuffle step has routed all records with the same key to the same reducer node)
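
To make the recipes above concrete, here is a minimal single-process sketch of the map, shuffle, and reduce stages in plain Python. It is not the Hadoop API: `run_mapreduce`, the mapper/reducer functions, and the sample data (`readings`, `users`, `orders`) are all hypothetical names invented for illustration. The first job combines filter, remap, group by, and aggregate (average); the second shows one common way to do a join, a reduce-side inner join, assuming each record is tagged with its source.

```python
from itertools import groupby
from operator import itemgetter


def run_mapreduce(records, mapper, reducer):
    """Toy in-memory stand-in for a MapReduce job (not the Hadoop API)."""
    # Map: each input record yields zero or more (key, value) pairs.
    mapped = [pair for record in records for pair in mapper(record)]
    # Shuffle: sort by key so equal keys are adjacent (this is where Sort happens).
    mapped.sort(key=itemgetter(0))
    # Reduce: call the reducer once per key with all of that key's values.
    output = []
    for key, group in groupby(mapped, key=itemgetter(0)):
        output.extend(reducer(key, [value for _, value in group]))
    return output


# Hypothetical input: (city, temperature) readings; None marks a bad reading.
readings = [("oslo", 4), ("cairo", 30), ("oslo", 6), ("cairo", None), ("lima", 19)]


def avg_mapper(record):
    city, temp = record
    if temp is not None:      # Filter: drop bad readings
        yield city, temp      # Remap: emit the city as the key


def avg_reducer(city, temps):
    # Group By + Aggregate: all temperatures for one city arrive together.
    yield city, sum(temps) / len(temps)


averages = run_mapreduce(readings, avg_mapper, avg_reducer)

# Reduce-side inner join: tag each record with its source, key on the join field.
users = [(1, "ann"), (2, "bob")]
orders = [(1, "book"), (1, "pen"), (3, "ink")]
tagged_input = [("user", u) for u in users] + [("order", o) for o in orders]


def join_mapper(record):
    source, (user_id, payload) = record
    yield user_id, (source, payload)


def join_reducer(user_id, tagged):
    names = [p for s, p in tagged if s == "user"]
    items = [p for s, p in tagged if s == "order"]
    # Emit one row per matching (user, order) pair; unmatched keys emit nothing.
    for name in names:
        for item in items:
            yield user_id, name, item


joined = run_mapreduce(tagged_input, join_mapper, join_reducer)

print(averages)
print(joined)
```

In a real cluster the shuffle moves data between nodes and the reducers run in parallel; the sketch only mirrors the data flow, which is enough to see why grouping and sorting come "for free" once a key is chosen.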