Some nodes' data gets bucketed into "Unknown" disk

Question

Opened this issue 6 years ago · 2 comments

 === Distribution across nodes and disks ===

DiskId                   0 0 0 0 0 0 0 0 0 0 1 1
                         0 1 2 3 4 5 6 7 8 9 0 1 Unknown   Count     Average
Host
pc1udatahad15.abacus-u...= = - + = - = = = =     0         4990      499
pc1udatahad12.abacus-u...= = = = = = = = = =     0         6000      600
pc1udatahad07.abacus-u...= = + = = - + - = =     0         3597      359
pc1udatahad16.abacus-u...0 0 0 0 0 0 0 0 0 0     19489     19489     0
pc1udatahad11.abacus-u...= = = = = = = = = =     0         6919      691
pc1udatahad09.abacus-u...0 0 0 0 0 0 0 0 0 0 0 0 9131      9131      0
pc1udatahad17.abacus-u...0 0 0 0 0 0 0 0 0 0     6529      6529      0
pc1udatahad10.abacus-u...= = = + = = = - - = = = 0         8337      694
pc1udatahad13.abacus-u...= = = = = = = = = =     0         5947      594
pc1udatahad14.abacus-u...= = = = = = = = = =     0         6424      642
pc1udatahad08.abacus-u...+ - + - - - = + = +     0         2637      263

Notice that nodes 16, 09, and 17 show zeros for every disk, and all of their data is counted in the "Unknown" column instead. Does anyone know why that is?
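For anyone who wants to list the affected nodes without reading the report by eye, here is a minimal sketch in Python. It assumes the report layout shown above: the host name is truncated with `...`, and the last three columns are Unknown, Count, and Average. The function name `hosts_with_unknown_disks` and the `SAMPLE` text are made up for illustration and are not part of the tool that produced the report.

```python
def hosts_with_unknown_disks(report: str) -> list[tuple[str, int, int]]:
    """Return (host, unknown, count) for every host with data in "Unknown".

    Assumes each host row contains "..." after the truncated host name and
    ends with three integers: Unknown, Count, Average.
    """
    flagged = []
    for line in report.splitlines():
        if "..." not in line:
            continue  # header, blank, or title line
        host = line.split("...", 1)[0].strip()
        fields = line.split()
        try:
            unknown, count, _average = (int(f) for f in fields[-3:])
        except ValueError:
            continue  # row does not end with three integers
        if unknown > 0:
            flagged.append((host, unknown, count))
    return flagged


# Three rows copied from the report above.
SAMPLE = """\
pc1udatahad12.abacus-u...= = = = = = = = = =     0         6000      600
pc1udatahad16.abacus-u...0 0 0 0 0 0 0 0 0 0     19489     19489     0
pc1udatahad09.abacus-u...0 0 0 0 0 0 0 0 0 0 0 0 9131      9131      0
"""

for host, unknown, count in hosts_with_unknown_disks(SAMPLE):
    print(f"{host}: {unknown} of {count} in Unknown")
```

This only identifies the affected rows. It does not explain why the disk IDs are missing for those nodes.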

Answer 1 · 2018-04-12T06:48:52.000Z

Have you tried restarting them? Are they running the same version of HDFS?

Answer 2 · 2018-04-12T15:54:02.000Z

Yes, same version. We run the Cloudera distribution of Hadoop, deployed through parcels, so every node is on exactly the same version. The nodes themselves work correctly, with no issues. They are part of a production cluster, so we can't restart them easily; a restart has to be scheduled. Thanks.