google/crush-tools

add hash-split option to fieldsplit

Closed this issue · 1 comment

One feature that could be useful:

fieldsplit could have an option to split the data into a fixed number N of
output files based on a hash of the value of a field, so that all records
with a particular value end up in the same bin. Even if field "foo" has
many distinct values, you can still break the problem up into N
manageable chunks.
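
To make the request concrete, here is a rough Python sketch of the idea. It is not fieldsplit's code. The function names, the tab delimiter, and the use of CRC32 as the hash are all assumptions chosen for illustration; any hash that is stable across runs would do.

```python
import zlib


def bucket_for(value, n_buckets):
    """Map a field value to a bucket index in [0, n_buckets)."""
    # crc32 is stable across runs, unlike Python's built-in hash() on str,
    # which is randomized per process by default.
    return zlib.crc32(value.encode("utf-8")) % n_buckets


def hash_split(lines, field_index, n_buckets, delimiter="\t"):
    """Group delimited records into n_buckets lists, keyed on one field."""
    buckets = [[] for _ in range(n_buckets)]
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        buckets[bucket_for(fields[field_index], n_buckets)].append(line)
    return buckets


# Hypothetical sample data: five records, three distinct keys, two bins.
records = ["a\t1\n", "b\t2\n", "a\t3\n", "c\t4\n", "b\t5\n"]
bins = hash_split(records, field_index=0, n_buckets=2)
```

Each list in `bins` stands in for one output file. Every record sharing a key lands in the same list, and the number of lists stays at N however many distinct keys appear.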

Original issue reported on code.google.com by sid...@gmail.com on 14 Aug 2008 at 3:14

Implemented in trunk in r337 and will be in the 2009-01 release.

Original comment by jeremy.h...@gmail.com on 15 Oct 2008 at 5:36

  • Changed state: Fixed
  • Added labels: Type-Enhancement
  • Removed labels: Type-Defect