YonghaoXu/SSUN

About group strategy test accuracy in the paper

Closed this issue · 10 comments

I ran the spectral group strategy (sls2=2) with
batch_size=64 (as in the paper)
np_epoch=500

Each dataset was run 30 times, but I get the following results:

PaviaU: Mean OA: 91.43, Std OA: 1.61, Mean Kappa: 90.15, Std Kappa: 1.09
Indian_Pines: Mean OA: 80.76, Std OA: 2.77, Mean Kappa: 79.42, Std Kappa: 2.55
KSC: Mean OA: 89.69, Std OA: 1.12, Mean Kappa: 89.13, Std Kappa: 1.02
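For reference, a minimal sketch of how such statistics could be aggregated over the 30 runs (run_one_experiment is a hypothetical placeholder for one training/testing repetition, not a function in the repository):

import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score

oa_list, kappa_list = [], []
for run in range(30):
    # run_one_experiment is a placeholder: it trains the spectral branch once
    # and returns the test-set ground truth and predictions for that run
    y_true, y_pred = run_one_experiment(run)
    oa_list.append(accuracy_score(y_true, y_pred) * 100)
    kappa_list.append(cohen_kappa_score(y_true, y_pred) * 100)

print('Mean OA: %.2f  Std OA: %.2f' % (np.mean(oa_list), np.std(oa_list)))
print('Mean Kappa: %.2f  Std Kappa: %.2f' % (np.mean(kappa_list), np.std(kappa_list)))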

Related code from my experiment:
HyperFunctions_sp.txt
SSUN_only_spectral.txt

@Snowzm Thank you for your interest in our work. It seems that the standard deviations in your results are relatively large. Have you checked each classification result across those 30 runs? If some runs have higher OA values than our reported mean OA, then I think this is understandable considering the instability of the network training. Honestly, we also suffered from this instability when training the spectral classification with the LSTM model. Our initial experimental setting was to conduct 10 repetitions for each algorithm, but we found it hard to obtain stable performance for the LSTM model. Therefore, we further increased the repetitions to 30 runs and obtained the results reported in the manuscript. Nevertheless, training the spectral module with the LSTM model alone is still very unstable. This may be a future direction we could put effort into.

Thank you sincerely for your detailed response.
I think the instability may be related to the spectral samples fed to the LSTM; after all, only three time-step vectors are fed into it. Maybe data augmentation would help.
I also noticed that the LSTM has a huge number of parameters. Have you ever tried reducing the number of LSTM parameters?

Your inference sounds reasonable. You can also try training the LSTM model with a smaller learning rate and more iterations. This may help to yield more robust results.
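For instance, a minimal Keras sketch of that suggestion (lstm_model and the training arrays are hypothetical placeholders, not the actual SSUN training code):

from keras.optimizers import Adam

# lstm_model, x_train_spec, y_train are hypothetical placeholders for the spectral branch and its data
lstm_model.compile(optimizer=Adam(1e-4),   # smaller learning rate than the common default of 1e-3
                   loss='categorical_crossentropy',
                   metrics=['accuracy'])
lstm_model.fit(x_train_spec, y_train, batch_size=64, epochs=1500)   # more training iterations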
As for the number of nodes in the network, we just ensure that both the spectral feature embedding from the LSTM and the spatial feature embedding from the CNN share the same vector length (i.e., 128 in our implementation). I guess the performance won't be influenced too much by the number of nodes. According to our experimental results, the high OA value in SSUN mainly benefits from the CNN subnetwork, while the detailed segmentation map and well-maintained object boundaries benefit from the LSTM subnetwork. Thus, changing the number of parameters in the LSTM alone may not influence the final performance of SSUN too much.
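As an illustration of that design choice, here is a minimal Keras sketch in which both branches end in a 128-dimensional embedding before fusion (all input shapes and layer sizes are assumed for illustration, not taken from the SSUN implementation):

from keras.layers import Input, LSTM, Conv2D, Flatten, Dense, concatenate
from keras.models import Model

# Hypothetical input shapes: 3 spectral time steps of 68 bands each, and a 28x28x4 spatial patch
spec_in = Input(shape=(3, 68))
spat_in = Input(shape=(28, 28, 4))

# Spectral branch: LSTM feature embedding of length 128
spec_feat = Dense(128, activation='relu')(LSTM(128)(spec_in))
# Spatial branch: CNN feature embedding, also of length 128
spat_feat = Dense(128, activation='relu')(Flatten()(Conv2D(32, (3, 3), activation='relu')(spat_in)))

# Both embeddings share the same vector length before being fused
merged = concatenate([spec_feat, spat_feat])
out = Dense(16, activation='softmax')(merged)   # e.g. 16 classes for Indian Pines
model = Model(inputs=[spec_in, spat_in], outputs=out)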

Great reply! Thank you for your response again. I have another question: why not split 20% of the training data into a validation set for best-model selection or early stopping? Personally, I think it could improve the accuracy on the test sets.

In our experiments, we just directly followed the experimental setting in the paper "Deep Feature Extraction and Classification of Hyperspectral Images Based on Convolutional Neural Networks", but I agree with you that splitting off an individual validation set may be a better way to search for suitable hyper-parameters. Some works in the literature also adopt this strategy.

Thanks a lot. I mean using the validation set to save the best validated model, or to auto-schedule the learning rate during training, and then using that best model to predict the test set. After all, the training and test sets follow the same distribution.
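A minimal Keras sketch of that workflow (model and data names are hypothetical placeholders, not the SSUN training script):

from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau

# model, x_train, y_train, x_test are hypothetical placeholders
callbacks = [
    # keep the weights that perform best on the held-out validation split
    ModelCheckpoint('best_model.h5', monitor='val_loss', save_best_only=True),
    # auto-schedule the learning rate when the validation loss plateaus
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=10),
]
model.fit(x_train, y_train, validation_split=0.2,
          epochs=500, batch_size=64, callbacks=callbacks)

# predict the test set with the best validated model
model.load_weights('best_model.h5')
y_pred = model.predict(x_test)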

Sounds interesting. I think I get what you mean. Thank you for sharing your idea.

import numpy as np
import scipy.io as sio

# Ground-truth map of Indian Pines (145 x 145), class indices 1..16, 0 = unlabeled
labels = sio.loadmat("Indian_pines_gt.mat")['indian_pines_gt']
# One RGB color per class, scaled to [0, 1]
palette = np.array([[255,0,0],[0,255,0],[0,0,255],[255,255,0],[0,255,255],[255,0,255],
                    [176,48,96],[46,139,87],[160,32,240],[255,127,80],[127,255,212],
                    [218,112,214],[160,82,45],[127,255,0],[216,191,216],[238,0,0]])
palette = palette * 1.0 / 255
X_result = np.zeros((labels.shape[0], 3))
num_class = labels.max()
for i in range(0, num_class):
    X_result[np.where(labels == i + 1), 0] = palette[i, 0]
    X_result[np.where(labels == i + 1), 1] = palette[i, 1]
    X_result[np.where(labels == i + 1), 2] = palette[i, 2]
X_result = np.reshape(X_result, (145, 145, 3))

I am getting the following error:
ValueError: cannot reshape array of size 435 into shape (145,145,3)

The variable X_result holds the prediction for the whole image rather than the original ground truth, so the labels passed in should be a 1-D vector covering all pixels.
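In other words, the snippet above expects a flat 1-D label/prediction vector covering all 145x145 pixels, not the 2-D ground-truth map. A minimal sketch of the intended usage, reusing palette and num_class from the snippet above (pred is a hypothetical 1-D prediction array of length 21025):

# pred: hypothetical 1-D array of predicted class labels for all 145*145 pixels, values in 1..16
pred = pred.reshape(-1)                     # flatten in case it comes as a 145x145 map
X_result = np.zeros((pred.shape[0], 3))     # now 21025 x 3, so the final reshape works
for i in range(num_class):
    X_result[pred == i + 1] = palette[i]    # paint every pixel of class i+1 with its RGB color
X_result = np.reshape(X_result, (145, 145, 3))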