nfmcclure/tensorflow_cookbook

Problem with prediction of non-linear SVM

FelipeBerrios opened this issue · 3 comments

When I try to predict new entries from the test set, the result seems to depend on how much data is passed to the prediction operation. For example, if I pass only one test sample, it always predicts the same class, even when that prediction is wrong. As I pass more test data, the predictions start to vary.
My question is whether this behavior is normal. The results also seem to depend on the random values rand_x and rand_y. Thanks for the help.

I realized that if only one test sample is passed, the prediction always returns a vector of zeros: with a single sample, the row-wise mean of prediction_output equals prediction_output itself, so the subtraction cancels out and tf.argmax always returns index zero.

# prediction_output has shape [n_classes, n_test_points]
prediction_output = tf.matmul(tf.multiply(y_target, b), pred_kernel)
# Center each class's scores by its mean over the test points,
# then take the best class per test point (axis 0).
prediction = tf.argmax(prediction_output - tf.expand_dims(tf.reduce_mean(prediction_output, 1), 1), 0)

So, when only one element is passed to prediction_grid, the subtraction returns:

[[ 0.]
[ 0.]
[ 0.]]

And prediction takes the value [0]
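
The arithmetic can be checked directly in NumPy (the class scores below are made up for illustration):

import numpy as np

# Made-up class scores for a single test point: shape [3, 1]
# (one row per class, one column per test sample).
prediction_output = np.array([[2.3],
                              [0.7],
                              [-1.1]])

# With a single column, each row's mean is just its own value,
# so the centering step cancels everything to zero.
centered = prediction_output - prediction_output.mean(axis=1, keepdims=True)
print(centered)                  # [[0.] [0.] [0.]]
print(centered.argmax(axis=0))   # [0] -> always class 0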

Hi @FelipeBerrios , this is interesting, thanks for finding this, and sorry for the late reply. I'm just getting around to triaging all the issues in preparation for a v2 of the code.

I'll probably get around to chapter 4 in the next month, so I'll look into this then. Thanks again.

Hi @FelipeBerrios , I can reproduce this error. I'm not sure of the best way to fix it. When a single sample (a batch of size 1) is passed, it always predicts 0, but with multiple examples it works.

I got around this by repeating the input example batch_size times.

I suspect this is down to how TensorFlow handles broadcasting. I'm not sure of the best way to handle it, but building a separate prediction graph just for the single-sample case seems like overkill. I suggest that anyone running into this issue simply replicate the prediction point and get the predictions that way, as sketched below. I'll make a note of it in the code.
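
A minimal sketch of that replication workaround, assuming the placeholder names (x_data, y_target, prediction_grid), the prediction op, a session sess, and a training batch (rand_x, rand_y) from the chapter's script; predict_single itself is a hypothetical helper, not part of the repo:

import numpy as np

# Hypothetical helper: predict one point by tiling it into a full batch.
# sess, prediction, x_data, y_target, prediction_grid, rand_x, rand_y and
# batch_size are assumed to come from the chapter's nonlinear-SVM script.
def predict_single(point):
    tiled = np.tile(np.reshape(point, (1, -1)), (batch_size, 1))
    preds = sess.run(prediction, feed_dict={x_data: rand_x,
                                            y_target: rand_y,
                                            prediction_grid: tiled})
    return preds[0]  # every row is the same point, so take the first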

I'll close this for now, but if you find a better solution please reopen.