input data of dataset.Dataset()

Question

Bo-UT opened this issue a year ago · 1 comment

Hi,

I am new to spatial transcriptomics analysis. I have an AnnData object from 10x Xenium. Could you please let me know how to generate the input data for dataset.Dataset()?
Thanks.

Bo

Answer 1 · 2023-09-25T14:13:50.000Z

Hi,

Thanks for your interest in FISHscale. You should format your data like this .parquet file: https://figshare.com/articles/dataset/EEL_mouse_sagittal_440_gene_RNA_spatial_data/20324820?file=37548382

The easiest approach is probably to convert your AnnData file to a pandas DataFrame. This DataFrame should contain the XY or XYZ locations of all molecules and their gene labels. So, make a DataFrame with the columns 'g' for gene labels, 'x' for X coordinates and 'y' for Y coordinates. Then save it as .parquet using df.to_parquet('filename.parquet'), where df is your DataFrame.
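As a rough sketch of the table described above, assuming you already have one gene label and one X/Y coordinate per molecule: the helper names make_molecule_table and save_molecule_table and the example values are made up for illustration and are not part of FISHscale.

```python
import pandas as pd


def make_molecule_table(genes, xs, ys):
    """Build a per-molecule table with the columns 'g', 'x' and 'y'.

    Hypothetical helper for illustration; not part of FISHscale.
    """
    genes, xs, ys = list(genes), list(xs), list(ys)
    # Every molecule needs exactly one gene label and one XY position.
    if not (len(genes) == len(xs) == len(ys)):
        raise ValueError("genes, xs and ys must have the same length")
    return pd.DataFrame({
        "g": genes,                           # gene label per molecule
        "x": pd.Series(xs, dtype="float64"),  # X coordinate per molecule
        "y": pd.Series(ys, dtype="float64"),  # Y coordinate per molecule
    })


def save_molecule_table(df, path):
    """Write the table to .parquet (pandas needs pyarrow or fastparquet)."""
    df.to_parquet(path, index=False)


# Made-up example values: three molecules from two genes.
df = make_molecule_table(
    genes=["GeneA", "GeneA", "GeneB"],
    xs=[10.5, 12.0, 30.25],
    ys=[5.0, 7.5, 8.0],
)
```

Calling save_molecule_table(df, 'filename.parquet') would then write the file. For XYZ data, a 'z' column could be added in the same way.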

Then you can give FISHscale the path to this file and the column names like so:

from FISHscale.utils import dataset
d = dataset.Dataset('filename.parquet',
                     x_label = 'x',
                     y_label = 'y',                
                     gene_label = 'g',
                     pixel_size = '1 micrometer', #Change this to the unit of the Xenium data.