add hash-split option to fieldsplit
Closed this issue · 1 comment
GoogleCodeExporter commented
One feature that could be useful:
fieldsplit could have an option to split your data into a fixed number N of
output files based on a hash of the value of a field, so that all records
with a particular value end up in the same bin. Even if field "foo" has
many distinct values, you can still break up your problem into
manageable chunks.
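
To illustrate the request, here is a minimal sketch in Python. It is not the fieldsplit implementation: the names `bucket_for` and `hash_split`, the tab delimiter, and the use of CRC32 as the hash are all assumptions chosen for the example.

```python
import zlib


def bucket_for(value: str, n_buckets: int) -> int:
    """Map a field value to a bin index in [0, n_buckets).

    CRC32 is an assumed choice of hash; any hash that is stable
    across runs would serve the same purpose.
    """
    return zlib.crc32(value.encode("utf-8")) % n_buckets


def hash_split(records, field_index: int, n_buckets: int, delim: str = "\t"):
    """Distribute delimited records into n_buckets lists by hashing one field.

    Records that share the same value in the chosen field always land
    in the same bin, regardless of how many distinct values exist.
    """
    buckets = [[] for _ in range(n_buckets)]
    for line in records:
        # Extract the key field from the record (hypothetical layout).
        key = line.rstrip("\n").split(delim)[field_index]
        buckets[bucket_for(key, n_buckets)].append(line)
    return buckets


if __name__ == "__main__":
    sample = [
        "alice\t1\n",
        "bob\t2\n",
        "alice\t3\n",
        "carol\t4\n",
        "bob\t5\n",
    ]
    for i, bucket in enumerate(hash_split(sample, field_index=0, n_buckets=3)):
        print(i, [rec.strip() for rec in bucket])
```

The sketch uses a stable hash rather than Python's built-in `hash()`, because string hashes from the built-in are salted per process by default and so would not put a value in the same bin from one run to the next.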
Original issue reported on code.google.com by sid...@gmail.com
on 14 Aug 2008 at 3:14
GoogleCodeExporter commented
Implemented in trunk in r337 and will be in the 2009-01 release.
Original comment by jeremy.h...@gmail.com
on 15 Oct 2008 at 5:36
- Changed state: Fixed
- Added labels: Type-Enhancement
- Removed labels: Type-Defect