Problem with importing columns that are mostly NAs
christianwoe opened this issue · 3 comments
Hi all,
Bug
I was trying to import some data from MiXCR 4.3.1 tsv files and noticed warning messages for some of the samples.
After further checking, it seems that in rare cases a column is assigned type logical even though some clones have character content in it. Those values are then replaced by 'NA', so the information is discarded.
It looks to me like the readr function inside repLoad is guessing the wrong column type, probably because it only checks a subset of rows.
It would be helpful to be able to modify the parameters passed to the readr function, either 'col_types' or 'guess_max'. Or is there already another solution?
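To illustrate what these two parameters do, here is a minimal sketch that calls readr directly rather than repLoad. The toy file, the column values, and the row count are made up for illustration, and nothing here assumes that repLoad currently exposes either argument.

```r
library(readr)

# Hypothetical toy file: one column is empty for every clone except the last.
n <- 2000
path <- tempfile(fileext = ".tsv")
toy <- data.frame(
  cloneId = seq_len(n),
  allDHitsWithScore = c(rep(NA_character_, n - 1), "TRBD1*00(45)")  # made-up value
)
write_tsv(toy, path, na = "")

# Only a few rows are used for type guessing, so the sparse column may be
# guessed as logical and its single character value turned into NA.
narrow <- suppressWarnings(read_tsv(path, guess_max = 10, show_col_types = FALSE))
class(narrow$allDHitsWithScore)

# Option 1: let the guesser look at every row of the file.
wide <- read_tsv(path, guess_max = n, show_col_types = FALSE)

# Option 2: declare the column type explicitly, so no guessing is involved.
typed <- read_tsv(
  path,
  col_types = cols(allDHitsWithScore = col_character()),
  show_col_types = FALSE
)
typed$allDHitsWithScore[n]
```

Declaring the type explicitly is the more predictable of the two options, since it does not depend on which rows are sampled, but it requires knowing in advance which columns can be sparse.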
To Reproduce
Steps to reproduce the behavior:
- repLoad(pathname)
This is the warning message.
Warning: One or more parsing issues, call `problems()` on your data frame for details, e.g.:
dat <- vroom(...)
problems(dat)
Expected behavior
Columns with at least one non-NA value are not assigned type logical.
Many thanks and kind regards,
Christian
Thank you for opening the issue. Could you share an example of such data please? What columns are usually the problematic ones?
I'm open to scheduling a short call to discuss this issue over Zoom if this accelerates things.
Hi,
here is an example based on test data. I think the 'allDHitsWithScore' column is causing the warning, because only one of the clonotypes has a value assigned there.
Best wishes,
Christian
Hey everyone, I'm facing a similar issue here. Was this fixed in the latest update? Cheers, Nicole