Reproducing the accuracy results
HuangPZ opened this issue · 10 comments
Hi,
I saw the previous discussion in #9 and edited the functions so that I get reasonable results for SecureML and Sarda (~96% each). However, I still cannot get good results for LeNet and MiniONN. So I just want to know how the accuracy results in the paper were produced. Was there something like quantization scripts? Thanks!
Could you tell me which results in the paper you are trying to reproduce? I don't remember doing anything special for two of the four networks.
It's Table 6 of the paper. Since getAccuracy() was not working in the original code, I tried to implement it myself, and I only got two of the networks right. So I just want to know how Table 6 was produced. Thanks!
Those results were produced by running the network within MPC, reconstructing the output in the clear, and then processing it. Some useful scripts should be in the scripts folder. It's great that you reproduced at least two of them using a secure computation getAccuracy() implementation. Can you tell me what is not working in the LeNet/MiniONN results?
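For readers following along, the "reconstruct in the clear, then process" step can be sketched roughly as below. This is only an illustration, not the repository's script: the function name `accuracy_from_outputs` and the toy values are hypothetical, and it assumes the reconstructed last-layer outputs are already available as signed numbers in a (batch, classes) array.

```python
import numpy as np

def accuracy_from_outputs(outputs, labels):
    """Return (correct, total) for reconstructed last-layer outputs.

    outputs: shape (batch, classes), already reconstructed in the clear.
    labels:  shape (batch,), integer class ids.
    """
    outputs = np.asarray(outputs)
    labels = np.asarray(labels).astype(int)
    predicted = outputs.argmax(axis=1)  # index of the largest score per row
    correct = int((predicted == labels).sum())
    return correct, len(labels)

# Toy data standing in for one reconstructed batch (hypothetical values).
outputs = [[0.1, 2.0, -1.0], [3.0, 0.5, 0.2], [0.0, 0.1, 0.4]]
labels = [1, 0, 1]
correct, total = accuracy_from_outputs(outputs, labels)
print(f"{correct} out of {total} ({100 * correct / total:.1f} %)")
```

If the reconstructed values are raw unsigned ring elements rather than signed numbers, they would presumably need to be converted to signed form before taking the argmax, since negative scores would otherwise look like very large positive ones.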
The scripts produce the same results as the new getAccuracy(). Generally, the LeNet/MiniONN results look random, with only 10+ correctly classified.
I am interested in how you edited the code to reproduce the SecureML result, as I have spent quite a lot of time on it but still cannot reproduce it... Thank you very much!
Right, then it seems that it is producing random results. Can you narrow down where the error might be? Is it in the getAccuracy() function or the trained network?
Would be great if you can help @AndesPooh258 with his question too.
Any updates? @HuangPZ, could you point out the parts of the code that you edited to get reasonable results for SecureML and Sarda? Below is the accuracy figure I got for MNIST+SecureML with the current master branch code.
Hi, it's complicated to make a pull request from my code now since I made other changes, but here's something I did to get the accuracy:
- I also saved the labels in the ipynb scripts (just for convenience):
import os

path = "./LeNet/"
if not os.path.isdir(path):
    os.mkdir(path)
np.savetxt(fname=path+"label", delimiter=" ", X=label.tolist())
print(get_acc(model, ([img,label],)))
- My predict and getAccuracy look like this:
void NeuralNetwork::predict(vector<myType> &maxIndex)
{
    log_print("NN.predict");

    size_t rows = MINI_BATCH_SIZE;
    size_t columns = LAST_LAYER_SIZE;
    RSSVectorMyType max(rows);
    RSSVectorSmallType maxPrime(rows*columns);

    // Secure max over the last layer; maxPrime is treated below as a 0/1 mask
    // marking the position of the maximum in each row.
    funcMaxpool(*(layers[NUM_LAYERS-1]->getActivation()), max, maxPrime, rows, columns);

    vector<smallType> reconst_maxPrime(maxPrime.size());
    vector<myType> reconst_max(max.size());
    // funcReconstruct(max, reconst_max, rows, "max", false);
    funcReconstructBit(maxPrime, reconst_maxPrime, rows*columns, "maxP", false);

    // Turn the reconstructed mask into one predicted class index per row.
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < columns; ++j)
            if (reconst_maxPrime[i*columns + j])
                maxIndex[i] = j;
}
void NeuralNetwork::getAccuracy(const vector<myType> &maxIndex, vector<size_t> &counter, string network)
{
    log_print("NN.getAccuracy");

    // Load the ground-truth labels saved by the notebook.
    // Note: this reads the first MINI_BATCH_SIZE labels of the file on every call.
    string path_input = "files/preload/"+which_network(network)+"/label";
    ifstream f_input(path_input);
    vector<myType> label(MINI_BATCH_SIZE);
    string temp;
    for (size_t i = 0; i < MINI_BATCH_SIZE; ++i)
    {
        f_input >> temp;
        // np.savetxt's default format writes the labels in scientific notation, hence stof.
        label[i] = std::stof(temp);
    }

    // counter[0] = correct predictions, counter[1] = total predictions
    for (size_t i = 0; i < MINI_BATCH_SIZE; ++i)
    {
        counter[1]++;
        if (label[i] == maxIndex[i])
            counter[0]++;
    }

    cout << "Rolling accuracy: " << counter[0] << " out of "
         << counter[1] << " (" << (counter[0]*100/counter[1]) << " %)" << endl;
}
You can give these a try and let me know whether they also work for you. @AndesPooh258 @llCurious
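One way to sanity-check the snippets above outside of MPC is to replay their logic in the clear. The sketch below is an assumption-laden illustration rather than part of the posted code: `mask_to_index` and `load_labels` are hypothetical helper names, the mask and labels are toy values, and the label file is written to a temporary directory instead of files/preload/. It also shows why the C++ side parses labels with stof: with its default format, np.savetxt writes each value in scientific notation.

```python
import os
import tempfile
import numpy as np

def mask_to_index(mask, rows, columns):
    """Map a flat 0/1 mask of length rows*columns to one index per row.

    Mirrors the loop in predict(): the last set bit in a row wins, and a row
    with no set bit keeps its initial value (0 in this sketch).
    """
    mask = np.asarray(mask).reshape(rows, columns)
    index = np.zeros(rows, dtype=int)
    for i in range(rows):
        for j in range(columns):
            if mask[i, j]:
                index[i] = j
    return index

def load_labels(path, count):
    """Read the first `count` labels from a file written by np.savetxt."""
    return np.loadtxt(path)[:count].astype(int)

labels = np.array([2, 0, 1])  # toy ground truth
with tempfile.TemporaryDirectory() as tmp:
    label_path = os.path.join(tmp, "label")
    np.savetxt(fname=label_path, delimiter=" ", X=labels.tolist())
    with open(label_path) as f:
        first_line = f.readline().strip()  # scientific notation, not "2"
    loaded = load_labels(label_path, 3)

# Toy reconstructed mask for 3 rows and 3 classes.
predicted = mask_to_index([0, 0, 1, 1, 0, 0, 0, 0, 1], rows=3, columns=3)
correct = int((predicted == loaded).sum())
print(first_line)
print(f"Rolling accuracy: {correct} out of {len(loaded)}")
```

Passing an integer format such as fmt="%d" to np.savetxt would write plain integers instead, in case the float parsing on the C++ side is not wanted.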
@HuangPZ Got it!
It seems that getAccuracy measures the accuracy of an already trained model, rather than of end-to-end secure training?
Have you tried training the model (i.e., SecureML) and then measuring the accuracy? Maybe the problem occurs during training, which does not converge.
Based on my current results, it seems that some overflow errors occur during secure training and affect the accuracy (I have not yet had time to debug further). But anyway, thank you for your advice!
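To illustrate the kind of overflow being suspected here, below is a minimal sketch of fixed-point arithmetic in a ring, computed entirely in the clear. It is not taken from the repository: the 32-bit ring size, the 13 fractional bits, and the helper names `encode`, `decode`, and `fixed_mul` are all assumptions chosen for illustration, and it is not a diagnosis of the actual training problem.

```python
RING_BITS = 32   # assumed ring size (illustrative)
FRAC_BITS = 13   # assumed number of fractional bits (illustrative)
MOD = 1 << RING_BITS

def encode(x):
    """Encode a real number as a fixed-point ring element."""
    return int(round(x * (1 << FRAC_BITS))) % MOD

def decode(v):
    """Decode a ring element, reading the upper half of the ring as negative."""
    if v >= MOD // 2:
        v -= MOD
    return v / (1 << FRAC_BITS)

def fixed_mul(a, b):
    """Multiply in the ring, then truncate by FRAC_BITS (in the clear here)."""
    prod = (a * b) % MOD       # the product carries 2 * FRAC_BITS fractional bits
    if prod >= MOD // 2:       # reinterpret as signed before shifting
        prod -= MOD
    return (prod >> FRAC_BITS) % MOD

small = decode(fixed_mul(encode(1.5), encode(2.0)))
large = decode(fixed_mul(encode(6.0), encode(6.0)))
print(small)  # 3.0
print(large)  # -28.0 rather than 36.0: the product wrapped around
```

With these assumed parameters, any product whose true value falls outside [-32, 32) wraps around before truncation, so a few large activations or gradients could be enough to corrupt a training step.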