vallotlab/ChromSCape

"raw_mat" not defined

SebastienLemaireCurie opened this issue · 18 comments

When loading fragment data, after the data seemed to have been loaded into memory, ChromSCape stops with an error saying that the "raw_mat" variable is unknown.
I should add that I am on a newly installed R v4.2.

I had a look at the code and found this chunk containing the definition of "raw_mat" (line ~539):
"
if(original_bin_size < 300 & input$rebin_matrices == TRUE){
    print("Saving raw matrix as average bin size is lesser than 300bp, for later use (coverage)...")
    raw_mat = datamatrix
}
"

The "raw_mat" variable seems to lack an initialization. I am also surprised by the condition "original_bin_size < 300", since I asked the software for 50 kbp binning.

For now, I solved the problem by adding an "else" branch with "raw_mat = datamatrix".
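
To illustrate the failure mode described above, here is a minimal sketch in plain R. The function names `save_raw_buggy` and `save_raw_fixed` are hypothetical and are not taken from ChromSCape; the sketch only assumes that "raw_mat" is assigned inside the conditional and read afterwards. Giving the variable a default before the `if` is one alternative to the "else" workaround.

```r
# Hypothetical reduction of the problem, not ChromSCape's actual code.

# Buggy pattern: raw_mat only exists when the condition is TRUE.
save_raw_buggy <- function(datamatrix, original_bin_size, rebin_matrices) {
  if (original_bin_size < 300 && rebin_matrices) {
    raw_mat <- datamatrix
  }
  raw_mat  # errors when the condition was FALSE and no outer raw_mat exists
}

# Safer pattern: give raw_mat a default before the conditional.
save_raw_fixed <- function(datamatrix, original_bin_size, rebin_matrices) {
  raw_mat <- NULL
  if (original_bin_size < 300 && rebin_matrices) {
    raw_mat <- datamatrix
  }
  raw_mat
}

m <- matrix(1:4, nrow = 2)

# With 50 kbp bins the condition is FALSE, so the buggy version fails.
buggy_result <- tryCatch(
  save_raw_buggy(m, 50000, TRUE),
  error = function(e) conditionMessage(e)
)
fixed_result <- save_raw_fixed(m, 50000, TRUE)  # NULL, no error
```

With a default of NULL, downstream code can test `is.null(raw_mat)` instead of failing on a missing object.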

Kind regards,

Hello Sebastien,

Thank you for posting this issue,

I am not able to reproduce your error on R version 4.2.1.

'raw_mat' is supposed to be initialized at line 424:

ChromSCape/inst/server.R

Lines 422 to 424 in 01f7572

observeEvent(input$create_analysis, { # save new dataset
req(input$new_analysis_name, input$annotation)
raw_mat = NULL

If you update ChromSCape with the latest changes (devtools::install_github("vallotlab/ChromSCape")), do you still get this behavior?

If so, could you paste the entire log, just to be sure the problem comes from there?

Thanks,
Pacome

Hi Pacomito,

Indeed, I did not have these lines in my previous installation (from Bioconductor).
I tried a new installation directly from the GitHub version, as suggested, and it worked. Thank you for the fix.

It may be worth checking the "server.R" script in the Bioconductor version.

Sorry for the bother,

Kind regards.

Thank you very much for your feedback.
Indeed, the Bioconductor version might have the bug; I will correct it.

Thanks,
Pacome

I think I may be running into the same problem with count data input: files seem to load and output directories are created, but after that everything in the browser window is grayed out and unresponsive. I also think I saw a message along the lines of 'raw data not found' in the Linux console in the background.
Was this issue fixed in the version currently on Bioconductor?
Is it definitely fixed for count matrix input in the current GitHub version?

Thanks,
Dan

Hello Dan,

Thank you for posting this issue,
If you install the latest GitHub version (devtools::install_github("vallotlab/ChromSCape")), do you still get this error?

If so, is your input matrix a "Dense" or "Sparse" matrix (10X-like format)? What features was it counted on (e.g. bins of what size, peaks of what average size, or something else)?

Best,
Pacome

Thanks very much for your prompt reply Pacome.
The version I'm working with was installed from Bioconductor.
The input matrices I loaded were two from the example datasets: HBC_22.tsv & HBC_22_TamR.tsv.

I also unzipped some of the Buenrostro bed files and input them as SC bed. They also seemed to be read in successfully. As with the datasets above, when I hit the Create Analysis button, there was a short pause, then the interface grayed out completely, and the Linux console in the background showed the error message about raw_mat.

The package is installed on a computing cluster here, and before asking the support team to re-install it, it would be good if you could verify that the GitHub version has code differences that should get around this problem.

Thanks,
Dan

Screenshots of the behavior with 100 unzipped bed files divided into 2 samples of 50:
[Screenshot: Screen Shot 2022-12-02 at 12 03 20 PM]
[Screenshot: Screen Shot 2022-12-02 at 12 05 25 PM]

Hello Dan,

I think the error was fixed in commit 23d2f04; raw_mat is now initialized properly.

I re-tested with the latest version (GitHub) on the HBCx22 (scChIP_mouse_PDX) and the scBED from Ku et al. downloaded from the Dropbox and it works fine.

So my suggestion would be to install the newest version from GitHub
(devtools::install_github("vallotlab/ChromSCape")).

I also tested on the Bioconductor version 3.14 (R 4.1.3) and it is also working fine.

Sorry for the trouble,
Cheers,
Pacôme

Hi Dan,

There should be 3 files per sample (.mtx, barcodes.tsv and features.tsv).
An example of these files is available here for the Buenrostro et al., 2018 scATAC-seq dataset (DropBox).

When uploading to ChromSCape, you should select the root directory, as for single-cell BED files (that is why the pop-up mentions scBED only).

Cheers,
Pacôme

Hello Dan,

Indeed, the currently supported format for rownames is either "chr1:10000000-10005000" or "chr1_10000000_10005000". In either case, the "chr" prefix has to be present.
This is why you get the error.
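
As a rough pre-check before loading a dense matrix, row names can be tested against the two formats above. This is only a sketch: `is_valid_region_name` is a hypothetical helper, not a ChromSCape function, and its regular expression is an assumption that ignores chromosome names containing "_" or ":".

```r
# Hypothetical helper, not part of ChromSCape: checks that each row name
# looks like "chr1:100-200" or "chr1_100_200" and starts with "chr".
is_valid_region_name <- function(x) {
  colon_dash <- "^chr[^:_]+:[0-9]+-[0-9]+$"
  underscore <- "^chr[^:_]+_[0-9]+_[0-9]+$"
  grepl(colon_dash, x) | grepl(underscore, x)
}

region_names <- c(
  "chr1:10000000-10005000",  # accepted
  "chr1_10000000_10005000",  # accepted
  "1:10000000-10005000"      # rejected: missing "chr"
)
valid <- is_valid_region_name(region_names)

# Row names that would need fixing before upload.
bad_names <- region_names[!valid]
```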
Best,
Pacôme

Hi Dan,
There is currently no way to provide metadata with the sample, but this would be very interesting to implement.
If you want to add metadata now, the best way would be to add it to the SingleCellExperiment's colData, using ChromSCape functions in R directly.

Best,
Pacôme

Hi Dan,
Regarding the naming of the files, for the Sparse Matrix (10X format), each sample directory should contain:

  • ‘*barcodes.tsv’: 1-column file of cell-barcode names (or .gz)
  • ‘*features.tsv’: Tab-separated file of feature genomic locations (or .gz) (anything readable by rtracklayer::import.bed)
  • ‘*matrix.mtx’: 3-column space-separated file containing the row index, column index and value of the non-zero entries in the sparse matrix

However, the exact regexps for the files are:

.*matrix\.mtx
.*features\.tsv
.*barcodes\.tsv
.*features\.txt
.*barcodes\.txt
.*features\.bed
.*features\..*gz
.*barcodes\..*gz
.*matrix\..*gz
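
To check a sample directory against this naming scheme before uploading, something like the following could be used. It is a sketch only: `file_patterns` and `find_sparse_files` are hypothetical names, and the patterns are a simplified approximation of the list above rather than the expressions used inside ChromSCape.

```r
# Simplified, assumed patterns (not copied from ChromSCape's source):
# one pattern per required file type, with an optional .gz suffix.
file_patterns <- list(
  matrix   = "matrix\\.mtx(\\.gz)?$",
  features = "features\\.(tsv|txt|bed)(\\.gz)?$",
  barcodes = "barcodes\\.(tsv|txt)(\\.gz)?$"
)

# Hypothetical helper: returns, for each file type, the matching file names.
find_sparse_files <- function(file_names) {
  lapply(file_patterns, function(p) file_names[grepl(p, file_names)])
}

# Example listing; for a real directory, list.files(sample_dir) could be used.
file_names <- c(
  "sample1_matrix.mtx",
  "sample1_features.bed",
  "sample1_barcodes.tsv.gz",
  "notes.md"
)
found <- find_sparse_files(file_names)

# File types with no matching file in the listing.
missing_types <- names(found)[lengths(found) == 0]
```

An empty `missing_types` means all three file types were found under the assumed patterns.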

Also, I was wrong earlier: for the SparseMatrix format, the feature file has to be a tab-separated file and not in a 'chr1:100-200' or 'chr1_100_200'-like format, unlike for DenseMatrix. The SparseMatrix example on the DropBox was not readable by ChromSCape, so I fixed it. Sorry about that.

Best,
Pacome