collapsing to unique 'x' values & stat_plsmo() Lowess smoothing
bri2020 opened this issue · 2 comments
Dear all, Dear Frank
I have this data (data.csv) for example
x,y
2,1934
16.3636363636364,1618
5.27272727272727,1701
69.5454545454545,3409
59.7272727272727,3334
71,3471
69.5454545454545,3409
69.5454545454545,3409
59.3636363636364,3264
46.7272727272727,2966
46.7272727272727,2966
46.2727272727273,2915
46.7272727272727,3047
46.7272727272727,3048
46.7272727272727,2966
55.2727272727273,3021
51.4545454545455,3377
51,3283
50,2969
46.7272727272727,2966
and this command
library(ggplot2)
data <- read.csv("data.csv")  # assumes data.csv is in the working directory
ggplot(data = data, mapping = aes(x = x, y = y)) +
  Hmisc::stat_plsmo()
I get this warning:
"Warning: collapsing to unique 'x' values"
and I can see why, as I have repeated values in x.
However, I am not sure whether I should just ignore this warning, because with a bigger data set I get another warning on top:
"Warning message:
In regularize.values(x, y, ties, missing(ties), na.rm = na.rm) :
collapsing to unique 'x' values"
I would be happy if someone could help "translate" what R is trying to warn me about. (Googling and forum searches have not helped so far).
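A minimal sketch of where this message comes from, using only base R. The call shown in the warning, regularize.values(), is the helper that stats::approx() uses to tidy its inputs. When x contains ties and the ties argument is left at its default, approx() collapses the tied x values (averaging their y values) and warns about it. Passing ties = mean explicitly does the same averaging without the warning. The toy vectors below are made up for illustration and are not taken from data.csv.

```r
# Toy data with a repeated x value (2 appears twice).
x <- c(1, 2, 2, 3)
y <- c(10, 20, 30, 40)

# Default `ties`: approx() averages y over the tied x values and warns.
msg_default <- tryCatch(
  {
    approx(x, y, xout = 2)
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)

# Explicit `ties = mean`: same averaging, but no warning is raised.
res_explicit <- approx(x, y, xout = 2, ties = mean)

print(msg_default)
print(res_explicit$y)
```

So the warning is informational: it says that several points shared an x value and were merged before interpolation, not that the computation failed.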
Many thanks,
Britta
I am having the same issue, except that I have stripped out all the duplicates and still get the warning.
I am using wtd.quantile inside a mutate. I started out with one data set and it works just fine, with no warnings. I then tried a second data set that is structurally the same but has different values, and I started to get the collapsing to unique 'x' values warning.
I tried stripping out all zero values and duplicates from the second data set and got the same warning. Mind you, my first data set, the one that works just fine, also has duplicates and zeros, and it doesn't give me warnings. The only things that differ between the two data sets are the ranges of values and the weights used (wts_int), but I don't understand why that should matter.
Here are my data and code if you would like to reproduce the issue.
This is the first data set that works just fine: data_1_reduced.csv
And here is the data that produces the warning, both the regular data set and the version with duplicate values stripped out: data_2_reduced.csv, data_2_reduced_no_dups.csv
And this is the code I am running to try to produce the weighted quantiles:
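library(dplyr)  # provides %>%, group_by(), mutate(), ungroup()

# Upper and lower quantile probabilities
xlim_max = 0.99
xlim_min = 0.01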
processed_data_1 = data_1_reduced %>%
  group_by(scn) %>%
  mutate(wts = wt_raw/sum(wt_raw),
         wts_int = wts*10^14,
         quantile_mean = sum(wts*value),
         xlim_max = Hmisc::wtd.quantile(value, weights = wts_int, probs = c(xlim_max)),
         xlim_min = Hmisc::wtd.quantile(value, weights = wts_int, probs = c(xlim_min))) %>%
  ungroup()
processed_data_2_no_dups = data_2_reduced_no_dups %>%
  group_by(scn) %>%
  mutate(wts = wt_raw/sum(wt_raw),
         wts_int = wts*10^14,
         quantile_mean = sum(wts*value),
         xlim_max = Hmisc::wtd.quantile(value, weights = wts_int, probs = c(xlim_max)),
         xlim_min = Hmisc::wtd.quantile(value, weights = wts_int, probs = c(xlim_min))) %>%
  ungroup()
processed_data_2 = data_2_reduced %>%
  group_by(scn) %>%
  mutate(wts = wt_raw/sum(wt_raw),
         wts_int = wts*10^14,
         quantile_mean = sum(wts*value),
         xlim_max = Hmisc::wtd.quantile(value, weights = wts_int, probs = c(xlim_max)),
         xlim_min = Hmisc::wtd.quantile(value, weights = wts_int, probs = c(xlim_min))) %>%
  ungroup()
(Note: the reason I multiply the weights by 10^14 is that someone on my team who previously used this function told me it doesn't handle weights smaller than zero well. If that's not true, please let me know. I don't think it should have a material impact on the issue at hand.)
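One possible reason the weights matter even when the values contain no duplicates, sketched in base R. This assumes that wtd.quantile interpolates with approx() over the cumulative sum of the weights rather than over the values themselves. That is an assumption about Hmisc internals and is worth checking against the source of the installed version. Under that assumption, ties in the cumulative weights are enough to trigger the warning. Such ties appear when a weight is zero, or when a weight is so small relative to the running total that floating-point addition absorbs it. The vectors below are hypothetical and not taken from the attached files.

```r
value <- c(10, 20, 30, 40)      # no duplicated values
wts   <- c(0.5, 0, 0.25, 0.25)  # hypothetical weights, one of them zero

# The zero weight makes two consecutive cumulative sums identical.
cum_wts  <- cumsum(wts)
has_ties <- anyDuplicated(cum_wts) > 0

# Interpolating over the cumulative weights now hits tied x values.
msg <- tryCatch(
  {
    approx(cum_wts, value, xout = 0.6, method = "constant", f = 1, rule = 2)
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)

# A tiny weight added to a large running total can vanish entirely:
# near 1e14 the spacing between adjacent doubles is about 0.016.
absorbed <- (1e14 + 0.001) == 1e14

print(has_ties)
print(msg)
print(absorbed)
```

If this is what is happening, checking anyDuplicated(cumsum(wts_int)) within each scn group (after sorting by value) would show whether the second data set has such ties.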
Without using the tidyverse, I get that warning all the time and always ignore it. But I wish someone would find the root cause so I could improve the code.
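For anyone who decides the warning is safe to ignore, here is a hedged sketch of silencing only this message while letting other warnings through. The helper name quiet_collapse is hypothetical and not part of Hmisc or base R. It relies on the standard withCallingHandlers() and "muffleWarning" restart mechanism, and it matches on the English message text, so it would not match in a session where warnings are translated.

```r
# Hypothetical helper: evaluate `expr`, muffling only the
# "collapsing to unique 'x' values" warning.
quiet_collapse <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl("collapsing to unique 'x' values", conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Tied x values would normally trigger the warning; here it is muffled.
res <- quiet_collapse(approx(c(1, 2, 2, 3), c(10, 20, 30, 40), xout = 2))
print(res$y)
```

The same wrapper could be placed around a wtd.quantile() call or a print() of the ggplot object, which is where the plotting warning is raised.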