Data lost with STAGATE_pyG
Closed this issue · 2 comments
Describe the bug
An error occurs when I run STAGATE through omicverse.
The code I ran:
STA_obj = ov.space.pySTAGATE(adata,
    num_batch_x=1,
    num_batch_y=1,
    spatial_key=['X', 'Y'],
    rad_cutoff=75,
    device='cpu')
I also tried the original STAGATE tools, and they ran without error.
Screenshots
The data returned by Batch_Data looked wrong.
(The line marked in red is a print statement I added.)
The error message:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[10], line 8
1 #STA_obj=ov.space.pySTAGATE(adata,num_batch_x=3,num_batch_y=2,
2 # spatial_key=['X','Y'],
3 # rad_cutoff=50,
(...)
6 # weight_decay=1e-4,hidden_dims = [512, 30],
7 # device='cpu')
----> 8 STA_obj=ov.space.pySTAGATE(adata,
9 num_batch_x=1,
10 num_batch_y=1,
11 spatial_key=['X','Y'],
12 rad_cutoff=75,
13 device='cpu')
File /conda/envs/omicverse/lib/python3.10/site-packages/omicverse/space/_cluster.py:26, in pySTAGATE.__init__(self, adata, num_batch_x, num_batch_y, spatial_key, batch_size, rad_cutoff, num_epoch, lr, weight_decay, hidden_dims, device)
24 for temp_adata in Batch_list:
25 print(temp_adata)
---> 26 Cal_Spatial_Net(temp_adata, rad_cutoff=rad_cutoff)
29 #device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
30 data_list = [Transfer_pytorch_Data(adata) for adata in Batch_list]
File /conda/envs/omicverse/lib/python3.10/site-packages/omicverse/externel/STAGATE_pyG/utils.py:83, in Cal_Spatial_Net(adata, rad_cutoff, k_cutoff, model, verbose)
80 coor.columns = ['imagerow', 'imagecol']
82 if model == 'Radius':
---> 83 nbrs = sklearn.neighbors.NearestNeighbors(radius=rad_cutoff).fit(coor)
84 distances, indices = nbrs.radius_neighbors(coor, return_distance=True)
85 KNN_list = []
File /conda/envs/omicverse/lib/python3.10/site-packages/sklearn/base.py:1474, in _fit_context.<locals>.decorator.<locals>.wrapper(estimator, *args, **kwargs)
1467 estimator._validate_params()
1469 with config_context(
1470 skip_parameter_validation=(
1471 prefer_skip_nested_validation or global_skip_validation
1472 )
1473 ):
-> 1474 return fit_method(estimator, *args, **kwargs)
File /conda/envs/omicverse/lib/python3.10/site-packages/sklearn/neighbors/_unsupervised.py:175, in NearestNeighbors.fit(self, X, y)
154 @_fit_context(
155 # NearestNeighbors.metric is not validated yet
156 prefer_skip_nested_validation=False
157 )
158 def fit(self, X, y=None):
159 """Fit the nearest neighbors estimator from the training dataset.
160
161 Parameters
(...)
173 The fitted nearest neighbors estimator.
174 """
--> 175 return self._fit(X)
File /conda/envs/omicverse/lib/python3.10/site-packages/sklearn/neighbors/_base.py:518, in NeighborsBase._fit(self, X, y)
516 else:
517 if not isinstance(X, (KDTree, BallTree, NeighborsBase)):
--> 518 X = self._validate_data(X, accept_sparse="csr", order="C")
520 self._check_algorithm_metric()
521 if self.metric_params is None:
File /conda/envs/omicverse/lib/python3.10/site-packages/sklearn/base.py:633, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, cast_to_ndarray, **check_params)
631 out = X, y
632 elif not no_val_X and no_val_y:
--> 633 out = check_array(X, input_name="X", **check_params)
634 elif no_val_X and not no_val_y:
635 out = _check_y(y, **check_params)
File /conda/envs/omicverse/lib/python3.10/site-packages/sklearn/utils/validation.py:1072, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator, input_name)
1070 n_samples = _num_samples(array)
1071 if n_samples < ensure_min_samples:
-> 1072 raise ValueError(
1073 "Found array with %d sample(s) (shape=%s) while a"
1074 " minimum of %d is required%s."
1075 % (n_samples, array.shape, ensure_min_samples, context)
1076 )
1078 if ensure_min_features > 0 and array.ndim == 2:
1079 n_features = array.shape[1]
ValueError: Found array with 0 sample(s) (shape=(0, 2)) while a minimum of 1 is required by NearestNeighbors.
Desktop (please complete the following information):
- OS: Debian
- Version 1.6.2
pySTAGATE already encapsulates the preprocessing step, so this error may come from a problem in your own additional preprocessing.
Do you encounter the same error when using our example data?
Zehua
Hi Zehua,
Thanks for the response. I tried the example data, and the same error occurred.
I debugged it and found that the problem is caused by this code:
adata.obs['X'] = adata.obsm['spatial'][:,0]
adata.obs['Y'] = adata.obsm['spatial'][:,1]
adata.obs
After running it, adata.obs['X'] contained NA values.
However, when I split the code into two separate cells, the problem went away. It is very strange, but it works. Thanks.
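For anyone hitting the same ValueError, here is a minimal sketch of how NA coordinates could lead to the "0 sample(s) (shape=(0, 2))" message, plus a quick check to run before calling pySTAGATE. It uses plain pandas and NumPy objects as stand-ins for adata.obs and adata.obsm['spatial']. The barcodes, the helper name missing_coordinate_counts, and the bounds 0 to 100 are hypothetical. Whether index misalignment was the actual cause in this thread is not confirmed, and the range filter at the end is only an assumption about how a batch might be selected, not the library's code.

```python
import numpy as np
import pandas as pd

# Stand-in for adata.obs: three spots indexed by barcode (hypothetical names).
obs = pd.DataFrame(index=["spot_a", "spot_b", "spot_c"])

# Stand-in for adata.obsm['spatial']: an (n_spots, 2) coordinate array.
spatial = np.array([[10.0, 5.0], [20.0, 15.0], [30.0, 25.0]])

# Assigning a plain NumPy column is positional, so no NA values appear.
obs["X"] = spatial[:, 0]
obs["Y"] = spatial[:, 1]

# Assigning a Series with a different index aligns on labels instead,
# which silently produces NaN for every row whose label is not found.
misaligned = pd.Series(spatial[:, 0], index=[0, 1, 2])
obs["X_misaligned"] = misaligned


def missing_coordinate_counts(obs_df, keys=("X", "Y")):
    """Return the number of NA values in each coordinate column."""
    return {key: int(obs_df[key].isna().sum()) for key in keys}


counts_ok = missing_coordinate_counts(obs)
counts_bad = missing_coordinate_counts(obs, keys=("X_misaligned",))

# NaN compares False against any bound, so a range filter over NaN
# coordinates keeps no rows and leaves an empty (0, 2) array.
nan_x = obs["X_misaligned"].to_numpy()
in_range = (nan_x >= 0.0) & (nan_x <= 100.0)
selected = spatial[in_range]
```

If missing_coordinate_counts reports any non-zero count for the columns passed as spatial_key, it is worth fixing the coordinates before building the spatial network.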