Error in performance::check_distribution(): in call bw.SJ()
Closed this issue · 2 comments
arodionoff commented
Since spring, calling performance::check_distribution() on a logistic regression model has failed with this error:
Error in bw.SJ(x, method = "ste") : sample is too sparse to find TD
# install.packages(c("smbinning", "randomForest", "performance"))
# Load library and its dataset
library(smbinning)
# Sampling
pop <- smbsimdf1 # Population
train <- subset(pop, rnd <= 0.7) # Training sample
# Generate binning objects used to create the new variables
smbcbs1 <- smbinning(train, x = "cbs1", y = "fgood")
smbcbinq <- smbinning.factor(train, x = "cbinq", y = "fgood")
pop <- smbinning.gen(pop, smbcbs1, "g1cbs1")
pop <- smbinning.factor.gen(pop, smbcbinq, "g1cbinq")
# Resample
train <- subset(pop, rnd <= 0.7) # Training sample
test <- subset(pop, rnd > 0.7) # Testing sample
# Run logistic regression
modlogisticsmb <- glm(fgood ~ ., data = train, family = binomial())
summary(modlogisticsmb)
# Error in performance::check_distribution()
library(performance)
performance::check_distribution(modlogisticsmb)
We get the error:
#> Error in bw.SJ(x, method = "ste") : sample is too sparse to find TD
However, the same code works in the following environment:
> utils::sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] performance_0.10.8 smbinning_0.9 Formula_1.2-5 partykit_1.2-20 mvtnorm_1.2-3 libcoin_1.0-10
[7] sqldf_0.4-11 RSQLite_2.3.2 gsubfn_0.7 proto_1.0.0
loaded via a namespace (and not attached):
[1] rstudioapi_0.15.0 splines_4.2.2 insight_0.19.6 bit_4.0.5 lattice_0.20-45
[6] rlang_1.1.1 fastmap_1.1.1 blob_1.2.4 tcltk_4.2.2 tools_4.2.2
[11] cli_3.6.1 DBI_1.1.3 bayestestR_0.13.1 datawizard_0.9.0 randomForest_4.7-1.1
[16] survival_3.4-0 bit64_4.0.5 inum_1.0-5 Matrix_1.6-1.1 vctrs_0.6.4
[21] rpart_4.1.21 memoise_2.0.1 cachem_1.0.8 compiler_4.2.2 chron_2.3-61
[26] pkgconfig_2.0.3
> performance::check_distribution(modlogisticsmb)
# Distribution of Model Family
Predicted Distribution of Residuals
Distribution Probability
normal 62%
cauchy 34%
poisson (zero-infl.) 3%
Predicted Distribution of Response
Distribution Probability
bernoulli 97%
binomial 3%
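To compare a failing setup against the working session above, the installed versions of the four related packages can be checked side by side. This is only a minimal sketch: `working` and `compare_versions()` are hypothetical names introduced here, and the version numbers are copied from the sessionInfo() output above.

```r
# Versions copied from the working sessionInfo() output above
working <- c(performance = "0.10.8", insight = "0.19.6",
             bayestestR = "0.13.1", datawizard = "0.9.0")

# Hypothetical helper: list installed vs. known-working version per package
compare_versions <- function(known) {
  installed <- vapply(names(known), function(pkg) {
    # packageVersion() errors if the package is missing; report NA instead
    tryCatch(as.character(utils::packageVersion(pkg)),
             error = function(e) NA_character_)
  }, character(1))
  installed <- unname(installed)
  data.frame(
    package   = names(known),
    working   = unname(known),
    installed = installed,
    same      = !is.na(installed) & installed == unname(known),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

print(compare_versions(working))
```

Rows where `same` is FALSE show packages that differ from the working session and are the first candidates to look at.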
arodionoff commented
As a workaround, you can get performance::check_distribution() working again
by installing the earlier versions of four packages:
bayestestR - 0.13.1, datawizard - 0.9.0, insight - 0.19.6, performance - 0.10.8
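One possible way to pin these versions is `remotes::install_version()`, which installs a specific release from the CRAN archive. The sketch below assumes the remotes package is installed and that network access is available; `pinned` and `install_pinned()` are hypothetical names, and the CRAN mirror URL is just a default that can be swapped. Whether the archived versions build cleanly on a given R installation is not guaranteed.

```r
# Versions reported as working in this thread, ordered along the
# dependency chain (insight first, performance last)
pinned <- c(insight = "0.19.6", datawizard = "0.9.0",
            bayestestR = "0.13.1", performance = "0.10.8")

# Hypothetical helper: install each pinned version in turn.
# Requires the 'remotes' package and network access.
install_pinned <- function(versions, repos = "https://cloud.r-project.org") {
  for (pkg in names(versions)) {
    remotes::install_version(
      pkg,
      version = versions[[pkg]],
      repos   = repos,
      upgrade = "never"  # do not pull newer versions of dependencies
    )
  }
  invisible(versions)
}

# install_pinned(pinned)  # uncomment to run, then restart the R session
```

The install call is left commented out so that sourcing the snippet does not change the library by accident.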
strengejacke commented
Thanks, should be fixed (and included in #643)