Importing Mass Spec Imaging Data into SpatialData
I am currently trying to import mass spec imaging data by converting it to an .h5 file that mimics the Visium file format. I have reached a point where the error is: ValueError: Versions older than V3 are not supported. Are there any suggestions for working with alternatively formatted data, or for mimicking the Visium V3 format? My data contains X and Y coordinates as well as feature intensities for each spot in my mass spec imaging dataset. This data is for spatial glycans, so I cannot easily use the metaspace_converter package. Thanks!
@aeisenbarth can you be of help here?
> I have reached a point where the error is: ValueError: Versions older than V3 are not supported

In general, to be able to help, it is useful to know which tools you used and what steps you took to reach that point.
I don't have experience with Visium data but I can describe how I have stored mass spec imaging data.
- Depending on what file format you get from the mass spectrometer, you need to be able to read it with Python. For example:
  - RAW → imzML/ibd: imzMLConverter
  - imzML/ibd → Python: pyimzML (docs)
- Then I would parse the X, Y coordinates into a SpatialData points element.
- The ion intensities can be stored as annotations in a table (where region refers to the points element, and instances to the index of the points, that should be an enumeration like [0,1,2,…]). In the table, the `X` matrix is most suitable for ion intensities (because scanpy clustering algorithms would access them there, and it's better not to mix them with other `obs` columns). In `var`, you would store the glycan names as index (=`var_names`), and potentially other metadata like m/z and ion formula. In the end, each row of the `X` matrix contains the intensities of all glycans measured for the IMS pixel of the row's index.
Is there interest in expanding support for mass spectrometry imaging data in SpatialData? Our institute produces a lot of MSI data, and we want to start integrating it with other spatial omics, as it can provide a plethora of additional information. It would be nice if there were more interested parties that could pool efforts.
Ah, came here to mention that @Tomatokeftes had joined the Zarr meeting last night. 😄 Carry on!
Caught. Yes, I was going to make an issue as you suggested, but since the final result is SpatialData, I chanced upon this one. I am currently working on reading and extracting information from the proprietary formats, but the resulting structure of the Zarr format is still a bit up in the air, apart from following parts of the OME-NGFF specification. We would be interested in potentially collaborating and coming up with a better format than the current imzML. I can still make an issue on the Python Zarr side, more focused on sparsity support, but that will also depend on how the format is finalized.
@Tomatokeftes thanks for reaching out, I followed up via e-mail with my availability for a meeting. If anyone is interested in the project, please reach out to @Tomatokeftes or me via Zulip.
@blakesells7 I recommend the same approach as described by @aeisenbarth. In general I would advise against fitting the MSI data into a Visium object, as the `visium()` reader makes specific assumptions about the data (for instance, the error you are getting occurs because the reader expects to find information on which SpaceRanger version was used to generate the data). I would instead create a `SpatialData` object from scratch; in the process, feel free to reuse code from `spatialdata-io` as needed.
Thanks everyone for your comments and thoughts! I will look at @aeisenbarth's method and try to adapt it for my data. I also have exported CSVs with features, x-y coordinates, and intensities that I will try to adapt to this format.
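
For CSV exports like these, one possible starting point is to reshape them into a pixel-by-feature matrix with pandas. The sketch below assumes a long-format CSV with the hypothetical columns `x`, `y`, `feature`, and `intensity` (one row per pixel and feature); the actual export layout may differ, and the values are made up.

```python
import io

import pandas as pd

# Hypothetical long-format export: one row per (pixel, feature) measurement.
csv_text = """x,y,feature,intensity
0,0,glycan_A,10.0
0,0,glycan_B,2.5
1,0,glycan_A,4.0
0,1,glycan_B,7.0
"""
long_df = pd.read_csv(io.StringIO(csv_text))

# One row per unique pixel; the new 0..n-1 index enumerates the pixels.
pixels = (
    long_df[["x", "y"]]
    .drop_duplicates()
    .sort_values(["y", "x"])
    .reset_index(drop=True)
)

# Wide matrix: rows are pixels, columns are features, missing measurements become 0.
# Note: pivot_table averages duplicate (pixel, feature) rows by default.
wide = long_df.pivot_table(
    index=["x", "y"], columns="feature", values="intensity", fill_value=0.0
)

# Align the matrix rows with the pixel enumeration.
wide = wide.reindex(pd.MultiIndex.from_frame(pixels[["x", "y"]]))
X = wide.to_numpy()
feature_names = list(wide.columns)
```

Here `pixels` provides the coordinates for the points element, while `X` and `feature_names` provide the intensity matrix and the `var` index for the table.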