imsb-uke/scGAN

Invalid argument: Name: <unknown>, Feature: cluster_int (data type: int64) is required but could not be found.


Hi,
While I did manage to pre-process data and train the model with a dataset that has no predefined clusters, I am having issues using previously clustered data.
My dataset has 27 clusters (0 to 26), and during training I get the following error, which I have not been able to resolve:

Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Name: <unknown>, Feature: cluster_int (data type: int64) is required but could not be found.
     [[Node: read_batch_features/ParseExample/ParseExample = ParseExample[Ndense=1, Nsparse=2, Tdense=[DT_INT64], dense_shapes=[[1]], sparse_types=[DT_INT64, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](read_batch_features:1, read_batch_features/ParseExample/ParseExample/names, read_batch_features/ParseExample/ParseExample/sparse_keys_0, read_batch_features/ParseExample/ParseExample/sparse_keys_1, read_batch_features/ParseExample/ParseExample/dense_keys_0, read_batch_features/ParseExample/Const)]]
2021-06-04 09:18:07.339494: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at example_parsing_ops.cc:144 : Invalid argument: Name: <unknown>, Feature: cluster_int (data type: int64) is required but could not be found.

I have found that the program crashes at this line of code in the training function of cscGAN:
results = sess.run(model_fetches, feed_dict=train_feed_dict)
Do you have any hints about what causes this problem and how to solve it? I'd really appreciate it!

Best regards,
Álvaro

@Alvaro-CS and I discussed this by email.
In short, there was an issue on his end with the encoding of the strings that make up the cluster index categories.
In case someone stumbles across the same error, please make sure that your cluster annotations are stored as regular (decoded) strings, not as encoded byte strings, when you add them to your anndata object. If they are encoded, you can apply the following line to solve the issue:
andata.obs['cluster'] = andata.obs['cluster'].apply(lambda x: x.decode('UTF-8'))
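
One caveat: in Python 3, calling .decode() on a value that is already a str raises an AttributeError, so the line above fails if the column is already decoded or contains a mix of types. A minimal sketch of a guarded variant follows. The helper name decode_label and the toy list are hypothetical, used only for illustration, and are not part of scGAN or anndata:

```python
def decode_label(label, encoding="utf-8"):
    """Return the label as text, decoding it only if it is a bytes object."""
    if isinstance(label, (bytes, bytearray)):
        return label.decode(encoding)
    # Already-decoded strings (and any other values) are passed through untouched.
    return label


# Toy stand-in for a column of cluster annotations with mixed encodings.
labels = [b"0", "1", b"26"]
decoded = [decode_label(x) for x in labels]
print(decoded)
```

Assuming the annotations live in andata.obs['cluster'] as above, the same helper could be used as andata.obs['cluster'] = andata.obs['cluster'].apply(decode_label), which should be safe to run more than once.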