Starlitnightly/omicverse

Data lost with STAGATE_pyG

Closed this issue · 2 comments

Describe the bug
When I use STAGATE, the following error occurs.
The code I ran:

STA_obj=ov.space.pySTAGATE(adata,
                           num_batch_x=1,
                           num_batch_y=1,
                           spatial_key=['X','Y'],
                           rad_cutoff=75,
                           device='cpu')

When I tried the original STAGATE tools, everything worked:
image

Screenshots
The data returned by Batch_Data looks wrong.
image
The line underlined in red is a print statement I added.
image
The error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[10], line 8
      1 #STA_obj=ov.space.pySTAGATE(adata,num_batch_x=3,num_batch_y=2,
      2 #                 spatial_key=['X','Y'],
      3 #                           rad_cutoff=50,
   (...)
      6 #                weight_decay=1e-4,hidden_dims = [512, 30],
      7 #                device='cpu')
----> 8 STA_obj=ov.space.pySTAGATE(adata,
      9                            num_batch_x=1,
     10                            num_batch_y=1,
     11                            spatial_key=['X','Y'],
     12                            rad_cutoff=75,
     13                 device='cpu')

File /conda/envs/omicverse/lib/python3.10/site-packages/omicverse/space/_cluster.py:26, in pySTAGATE.__init__(self, adata, num_batch_x, num_batch_y, spatial_key, batch_size, rad_cutoff, num_epoch, lr, weight_decay, hidden_dims, device)
     24 for temp_adata in Batch_list:
     25     print(temp_adata)
---> 26     Cal_Spatial_Net(temp_adata, rad_cutoff=rad_cutoff)
     29 #device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
     30 data_list = [Transfer_pytorch_Data(adata) for adata in Batch_list]

File /conda/envs/omicverse/lib/python3.10/site-packages/omicverse/externel/STAGATE_pyG/utils.py:83, in Cal_Spatial_Net(adata, rad_cutoff, k_cutoff, model, verbose)
     80 coor.columns = ['imagerow', 'imagecol']
     82 if model == 'Radius':
---> 83     nbrs = sklearn.neighbors.NearestNeighbors(radius=rad_cutoff).fit(coor)
     84     distances, indices = nbrs.radius_neighbors(coor, return_distance=True)
     85     KNN_list = []

File /conda/envs/omicverse/lib/python3.10/site-packages/sklearn/base.py:1474, in _fit_context.<locals>.decorator.<locals>.wrapper(estimator, *args, **kwargs)
   1467     estimator._validate_params()
   1469 with config_context(
   1470     skip_parameter_validation=(
   1471         prefer_skip_nested_validation or global_skip_validation
   1472     )
   1473 ):
-> 1474     return fit_method(estimator, *args, **kwargs)

File /conda/envs/omicverse/lib/python3.10/site-packages/sklearn/neighbors/_unsupervised.py:175, in NearestNeighbors.fit(self, X, y)
    154 @_fit_context(
    155     # NearestNeighbors.metric is not validated yet
    156     prefer_skip_nested_validation=False
    157 )
    158 def fit(self, X, y=None):
    159     """Fit the nearest neighbors estimator from the training dataset.
    160 
    161     Parameters
   (...)
    173         The fitted nearest neighbors estimator.
    174     """
--> 175     return self._fit(X)

File /conda/envs/omicverse/lib/python3.10/site-packages/sklearn/neighbors/_base.py:518, in NeighborsBase._fit(self, X, y)
    516 else:
    517     if not isinstance(X, (KDTree, BallTree, NeighborsBase)):
--> 518         X = self._validate_data(X, accept_sparse="csr", order="C")
    520 self._check_algorithm_metric()
    521 if self.metric_params is None:

File /conda/envs/omicverse/lib/python3.10/site-packages/sklearn/base.py:633, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, cast_to_ndarray, **check_params)
    631         out = X, y
    632 elif not no_val_X and no_val_y:
--> 633     out = check_array(X, input_name="X", **check_params)
    634 elif no_val_X and not no_val_y:
    635     out = _check_y(y, **check_params)

File /conda/envs/omicverse/lib/python3.10/site-packages/sklearn/utils/validation.py:1072, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator, input_name)
   1070     n_samples = _num_samples(array)
   1071     if n_samples < ensure_min_samples:
-> 1072         raise ValueError(
   1073             "Found array with %d sample(s) (shape=%s) while a"
   1074             " minimum of %d is required%s."
   1075             % (n_samples, array.shape, ensure_min_samples, context)
   1076         )
   1078 if ensure_min_features > 0 and array.ndim == 2:
   1079     n_features = array.shape[1]

ValueError: Found array with 0 sample(s) (shape=(0, 2)) while a minimum of 1 is required by NearestNeighbors.

Desktop (please complete the following information):

  • OS: Debian
  • Version 1.6.2
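
The traceback shows NearestNeighbors receiving an array of shape (0, 2), that is, a batch with zero cells. One inexpensive thing to rule out before constructing pySTAGATE is missing values in the coordinate columns named in spatial_key. The following is a minimal sketch using pandas only; `check_spatial_columns` is a hypothetical helper written for illustration and is not part of omicverse or STAGATE.

```python
import numpy as np
import pandas as pd


def check_spatial_columns(obs: pd.DataFrame, keys=("X", "Y")) -> None:
    """Raise ValueError if a coordinate column is absent or has missing values.

    Hypothetical helper for illustration; not part of omicverse.
    """
    for key in keys:
        if key not in obs.columns:
            raise ValueError(f"obs has no column {key!r}")
        # Non-numeric entries are coerced to NaN so they are counted too.
        col = pd.to_numeric(obs[key], errors="coerce")
        n_missing = int(col.isna().sum())
        if n_missing:
            raise ValueError(
                f"obs[{key!r}] has {n_missing} missing value(s) out of {len(col)}"
            )


# Toy stand-ins for adata.obs (assumed data, not from the issue).
obs_ok = pd.DataFrame({"X": [0.0, 10.0], "Y": [5.0, 7.0]}, index=["c1", "c2"])
check_spatial_columns(obs_ok)  # passes silently

obs_bad = pd.DataFrame({"X": [np.nan, np.nan], "Y": [5.0, 7.0]}, index=["c1", "c2"])
try:
    check_spatial_columns(obs_bad)
except ValueError as err:
    message = str(err)
```

With an AnnData object, the equivalent call would be `check_spatial_columns(adata.obs, keys=['X', 'Y'])`.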

We encapsulate the preprocessing step in pySTAGATE, so this error may result from an error in your own additional preprocessing.

Do you encounter the same error when using our example data?

Zehua

Hi Zehua,
Thanks for the response. I tried the example data, and the same error occurred.
I tried to debug and found that the problem is caused by this code:

adata.obs['X'] = adata.obsm['spatial'][:,0]
adata.obs['Y'] = adata.obsm['spatial'][:,1]
adata.obs

It left adata.obs['X'] as NA.
But when I split the code into two separate cells as follows, the problem was solved. It's very strange, but it works. Thanks.
image
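
The thread does not establish why the assignment produced NA. One general pandas behaviour that gives this symptom is index alignment: when the right-hand side is a pandas Series (for example, if the coordinates are stored as a DataFrame rather than a NumPy array), the values are matched by index label, not by position. The sketch below uses toy data and assumes that situation; whether it applies to the data in this issue is not confirmed.

```python
import pandas as pd

# Stand-in for adata.obs: indexed by cell names (toy data).
obs = pd.DataFrame(index=["cell_a", "cell_b", "cell_c"])

# Stand-in for coordinates stored as a DataFrame with a default 0..n-1 index
# (an assumption; a plain NumPy array would not show this behaviour).
coords = pd.DataFrame([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Assigning a Series aligns on index labels; none match, so every value is NaN.
obs["X_aligned"] = coords.iloc[:, 0]

# Converting to a NumPy array first assigns by position instead.
obs["X_positional"] = coords.iloc[:, 0].to_numpy()

all_nan = bool(obs["X_aligned"].isna().all())
positional = obs["X_positional"].tolist()
```

Checking `type(adata.obsm['spatial'])` and `adata.obs[['X', 'Y']].isna().sum()` right after the assignment would show whether this is what happened.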