Variance Issue?
Closed this issue · 1 comment
RockStone commented
Hi Matt,
I was using exactly the same example as in the README to compute the variance, except that I cannot use "input"
as my variable name (otherwise I get the error: mismatched input 'input' expecting EOF),
but I got 6.66666... as my result.
Could you help explain where I went wrong? I appreciate it!
The version I'm using is:
Apache Pig version 0.10.1.4.1304150518 (rexported)
And here's the code I used:
register datafu-0.0.8.jar;
define VAR datafu.pig.stats.VAR();
-- input: 1,2,3,4,5,6,7,8,9
a = LOAD 'input' AS (val:int);
grouped = GROUP a ALL;
-- produces variance of 7.5
variance = FOREACH grouped GENERATE VAR(a.val);
dump variance;
-- (6.666666666666668)
evionkim commented
The README was wrong. 6.67 is the correct variance (the population variance, which divides by n).
7.5 is the variance estimated from a sample, which divides by n - 1.
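For anyone who wants to check the two numbers by hand, here is a minimal sketch in plain Python. It is not DataFu code and says nothing about how VAR is implemented internally; the function names `population_variance` and `sample_variance` are made up for illustration. It only shows that the same nine values give roughly 6.67 when the sum of squared deviations is divided by n, and 7.5 when it is divided by n - 1.

```python
# Values from the example in this issue: 1 through 9.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]


def sum_squared_deviations(values):
    """Sum of squared differences between each value and the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def population_variance(values):
    """Divide by n: treats the values as the whole population."""
    return sum_squared_deviations(values) / len(values)


def sample_variance(values):
    """Divide by n - 1: estimate of the variance based on a sample."""
    return sum_squared_deviations(values) / (len(values) - 1)


# The mean is 5 and the squared deviations sum to 60,
# so the two results are 60 / 9 and 60 / 8.
print(round(population_variance(data), 2))  # 6.67
print(sample_variance(data))                # 7.5
```

The last digits of the population variance can differ slightly from the Pig output above, since floating-point results depend on the order of the arithmetic.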