Error in PermutationTest(ctrl measurement, test measurement, effect_size = effect_size_type. The two arrays do not have the same length.

Question

lijiawei88 opened this issue a year ago · 8 comments

Hello,

I recently upgraded to version 2023.9.12 of dabestr and encountered an issue when trying to compute median differences between two unpaired groups. Here is my code:

two.group.unpaired <- dabest(
  data = estimation.stats.data,
  x = variable,
  y = value,
  idx = c(control, comparisons),
  paired = FALSE
)
two.group.unpaired.mediandiff = median.diff(two.group.unpaired)

The error I received is:

"Error in PermutationTest(ctrl measurement, test measurement, effect_size = effect_size_type.
The two arrays do not have the same length."

After checking my data and comparing it with the sample data, I noticed that my measurement groups don't have the same length (due to some NA values). My current understanding is that these differences in length between the groups are causing the error. An older version handled measurement groups of different lengths without any issues, but this version gives me an error.

While I could remove rows with NA values to resolve the mismatch, this approach would lose a significant portion of my data (a photo of sample data was attached). I'd prefer to retain as much data as possible.

Questions:

Is my understanding correct that the error is arising due to a mismatch in the array lengths?
Is there a recommended approach or best practice for handling NA values when using this function, so as not to lose valuable data?
Thank you for your assistance.
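On the second question, one option worth checking is whether the data loss comes from dropping whole rows of a wide table. The sketch below is base R only and assumes a wide layout with one column per group; the data frame `wide` and its values are made up for illustration, since the attached photo is not reproduced here. Reshaping to long format first means only the missing measurements themselves are dropped, and the unpaired groups simply end up with different sizes.

```r
# Hypothetical wide-format data: one column per group, NAs scattered around.
wide <- data.frame(
  control = c(1.2, 0.9, NA, 1.1, 1.0),
  treated = c(1.8, NA, NA, 2.1, 1.7)
)

# Dropping every row that contains an NA discards valid measurements too.
complete_rows <- nrow(na.omit(wide))

# Reshape to long format: one row per measurement, plus a group label.
long <- stack(wide)
names(long) <- c("value", "variable")

# Drop only the rows whose measurement is missing.
clean <- long[!is.na(long$value), ]

# Group sizes now differ, which is fine for unpaired comparisons.
group_sizes <- table(clean$variable)
print(group_sizes)
```

Here row-wise removal keeps 3 rows (6 values), while the long-format approach keeps 7 values.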

Answer 1 · 2023-10-27T01:14:18.000Z

Hi,

Which older version did you successfully use?

All versions v0.3 and before removed the NAs automatically and silently before computing effect sizes (except for paired tests, where all groups have to have the same Ns).

In your case, do the NaNs indicate an informative failure? Or is the data simply missing?

If it is the former, consider converting NaNs to zero, and using Cliff's delta as an effect size.

If it is the latter, estimation statistics (and all of statistics) can't salvage low Ns, unfortunately.
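For the first case, a minimal base R sketch of the idea: replace missing values with zero, then compute Cliff's delta by comparing every test value with every control value. The function `cliffs_delta_sketch` and the numbers are hypothetical and hand-rolled for illustration; this is not dabestr's own implementation, and the sign convention (positive when test values tend to be larger) is an assumption.

```r
# Cliff's delta: (pairs where test > control minus pairs where test < control)
# divided by the total number of test/control pairs.
cliffs_delta_sketch <- function(control, test) {
  diffs <- outer(test, control, "-")  # every test value minus every control value
  (sum(diffs > 0) - sum(diffs < 0)) / (length(test) * length(control))
}

# Hypothetical measurements; NA marks an informative failure.
control <- c(3.1, 2.8, 3.5, 3.0)
test    <- c(4.2, NA, 3.9, NA, 4.0)

# Treat failures as zero (is.na() is TRUE for both NA and NaN).
test[is.na(test)] <- 0

delta <- cliffs_delta_sketch(control, test)
print(delta)
```

Because Cliff's delta depends only on the ordering of values, the zeros act as "worst possible outcome" markers without distorting a mean or median.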

Hope this helps!

Answer 2 · 2023-10-30T15:53:29.000Z

Hi,

Thanks so much for your quick reply!

I believe there was some ambiguity in my initial question. My primary concern is regarding the usage differences between version 0.3.0 and the latest version of dabestr. With version 0.3.0, I was able to run analyses using different numbers of unpaired measurements. However, this doesn't seem to be possible with the new version.

Here is an example for my question:

I deleted one row from the sample data, so that "Control 1" and "Test 1" have different numbers of samples, as in my own data. The error I get is the same as the one reported above.

Could you please provide some clarification on this?

Thank you.

Answer 3 · 2023-10-31T03:49:26.000Z

Hi @lijiawei88 ,

As I noted:

All versions v0.3 and before removed the NAs automatically and silently before computing effect sizes (except for paired tests, where all groups have to have the same Ns).

But for the current version, I believe @sunroofgod can answer!

Answer 4 · 2023-10-31T07:17:14.000Z

(Thank you @josesho for your previous feedback and for bringing this to my attention.)

Hi @lijiawei88!

This is an unintended feature of v2023.9.12 of dabestr. It appears to be a bug stemming from the calculation of paired statistical tests on non-paired datasets. This leads to an error being raised indicating a mismatch in sample group sizes in the datasets.
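To illustrate why unequal group sizes are not a problem for an unpaired test, here is a minimal base R sketch of a two-group permutation test on the mean difference. It is an illustration only and not dabestr's internal `PermutationTest`; the function name `perm_test_mean_diff`, its parameters, and the data are hypothetical.

```r
# Unpaired permutation test: pool both groups, shuffle the labels, and
# recompute the mean difference. No pairing is needed, so the two groups
# may have different lengths.
perm_test_mean_diff <- function(control, test, n_perm = 5000, seed = 1) {
  set.seed(seed)  # fixed seed so the sketch is reproducible
  observed <- mean(test) - mean(control)
  pooled <- c(control, test)
  ctrl_idx <- seq_len(length(control))
  perm_diffs <- replicate(n_perm, {
    shuffled <- sample(pooled)
    mean(shuffled[-ctrl_idx]) - mean(shuffled[ctrl_idx])
  })
  # Two-sided p-value: share of shuffles at least as extreme as observed.
  list(observed = observed,
       p_value = mean(abs(perm_diffs) >= abs(observed)))
}

# Hypothetical data with mismatched Ns (10 vs 8).
control <- c(10.1, 9.8, 10.4, 10.0, 9.9, 10.2, 10.3, 9.7, 10.0, 10.1)
test    <- c(11.0, 10.8, 11.3, 10.9, 11.1, 11.2, 10.7, 11.0)

result <- perm_test_mean_diff(control, test)
print(result)
```

A paired test, by contrast, works on per-subject differences and therefore needs the two arrays to line up one to one, which matches the error message reported here.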

This issue has been addressed and resolved in the latest version that is available on the development branch.

You may update your version of dabestr to the development version via:

devtools::install_github(repo = "ACCLAB/dabestr", ref = "dev")

The following code chunk, based on your example, should work as well:

data(non_proportional_data)

dabest_obj.mean_diff <- load(
  data = non_proportional_data[-2, ],
  x = Group,
  y = Measurement,
  idx = c("Control 1", "Test 1")
) %>%
  mean_diff()

dabest_plot(dabest_obj.mean_diff, TRUE)

Thank you for bringing this issue to our attention. Hope this helps!

Answer 5 · 2023-11-07T21:13:56.000Z

Hi @sunroofgod.

I am also getting the same error after upgrading to dabestr v2023.9.12. In my case, I am trying to compute the effect sizes of unpaired groups using the mean difference against a shared control. I do not have NA values, but the sample size of the control group (N=10) does not match the test groups (each with N=8). In the 2020 version, I was able to do this without any problem. I followed your recommendation of updating the version using devtools::install_github(repo = "ACCLAB/dabestr", ref = "dev") but I am still getting the same error.

This is my code I used following the tutorial:

shared_control <- load(data,
  x = Group, y = Measurement,
  idx = c("Control", "Test1", "Test2", "Test3", "Test4", "Test5")
)

shared_control.mean_diff <- mean_diff(shared_control)

And the error I get: "Error in PermutationTest(ctrl measurement, test measurement, effect_size = effect_size_type.
The two arrays do not have the same length."

Is there a way around this? Thank you in advance.

Answer 6 · 2023-11-07T23:51:41.000Z

Thanks so much for your reply! It works.

Answer 7 · 2023-11-12T15:25:31.000Z

Hi @jeanedlc23!

I've tried replicating the error on my side but everything seems to be working.

More bug fixes related to mismatched Ns have recently been pushed to the latest version on the development branch, so that might fix whatever error you are facing.

You might want to try restarting your R session, clearing all objects and then loading the development version of dabestr again via

devtools::install_github(repo = "ACCLAB/dabestr", ref = "dev")
library(dabestr)

and then running the code chunk you've sent

shared_control <- load(data,
  x = Group, y = Measurement,
  idx = c("Control", "Test1", "Test2", "Test3", "Test4", "Test5")
)

shared_control.mean_diff <- mean_diff(shared_control)
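If the error persists after reinstalling, it may help to confirm which package version the fresh session actually sees and what the group sizes are. The helper below is a hypothetical base R sketch; `check_setup` and `example_data` are illustrative names and are not part of dabestr.

```r
# Report the installed version of a package (NA if it is not installed)
# together with the number of observations per group.
check_setup <- function(data, group_col, pkg = "dabestr") {
  installed <- requireNamespace(pkg, quietly = TRUE)
  list(
    version = if (installed) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_
    },
    group_sizes = table(data[[group_col]])
  )
}

# Hypothetical data shaped like the shared-control example (N = 10 vs N = 8).
example_data <- data.frame(
  Group = rep(c("Control", "Test1", "Test2"), times = c(10, 8, 8)),
  Measurement = seq_len(26)
)

setup <- check_setup(example_data, "Group")
print(setup)
```

If the reported version is still the CRAN release rather than the development build, the old package was probably still loaded when the install ran.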

Hope this works!