Andyccs/cifar10

Fundamental Mistake


I think I made some fundamental mistake in my code. No matter what technique I use to train my model, the accuracy always gets stuck at 10%. It could be because I messed up the way I extract the data into matrices in images_to_matrices.py.

My intuition tells me that even simpler techniques such as logistic regression and multinomial logistic regression should do better than 10%, which is just the accuracy of random guessing over 10 classes. So before we test something more advanced like a convolutional neural network, we should get those easier methods working first, or at least doing better than 10%.
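
Here is a quick sanity check of the extracted data (a minimal sketch; I'm assuming images_to_matrices.py exposes something like a load_train_data() helper returning an (N, 3072) float matrix and (N, 10) one-hot labels, so adjust the names to whatever the script actually provides):

import numpy as np

from images_to_matrices import load_train_data  # hypothetical helper name

train_data, train_labels = load_train_data()

# Shapes and pixel range: CIFAR-10 images flatten to 32 * 32 * 3 = 3072 values.
print(train_data.shape, train_labels.shape)
print('pixel range:', train_data.min(), train_data.max())

# Each of the 10 classes should appear about 5000 times in the training set.
print('class counts:', train_labels.sum(axis=0))

# Every one-hot label row should sum to exactly 1.
assert np.allclose(train_labels.sum(axis=1), 1.0)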

I am pretty sure we don't have many problems in the algorithm part, because I basically copy-pasted from the TensorFlow course.

I found one small but significant error in my multinomial_logistic_regression.py.

Currently, I define my model as:

def model(data, weights, biases):
  return tf.matmul(data, weights) + biases

But that model is actually more suitable for a linear regression problem, not logistic regression. So to fix this problem, my model should be defined as follows:

def model(data, weights, biases):
  return tf.sigmoid(tf.matmul(data, weights) + biases)
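
For context, here is roughly how that model function plugs into the rest of the training graph (a minimal TF 1.x-style sketch; the shapes and variable names are my own, not necessarily what the repo uses):

import tensorflow as tf

image_size = 32 * 32 * 3  # flattened CIFAR-10 image
num_labels = 10

tf_data = tf.placeholder(tf.float32, shape=(None, image_size))
tf_labels = tf.placeholder(tf.float32, shape=(None, num_labels))

weights = tf.Variable(tf.truncated_normal([image_size, num_labels], stddev=0.1))
biases = tf.Variable(tf.zeros([num_labels]))

output = model(tf_data, weights, biases)
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=tf_labels, logits=output))
prediction = tf.nn.softmax(output)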

I also found that AdamOptimizer converges to a lower loss and higher accuracy much faster than GradientDescentOptimizer, but I still don't fully understand the Adam optimizer.
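
For reference, swapping the optimizer is a one-line change (again just a sketch; the learning rates here are common defaults, not values taken from this repo):

# Plain gradient descent: one fixed, global learning rate for every parameter.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(loss)

# Adam: keeps running estimates of each parameter's gradient mean and variance
# and scales the step size per parameter, which is why it usually reaches a
# low loss much faster without hand-tuning the learning rate.
optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)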

If you examine the tf.nn.softmax_cross_entropy_with_logits function, it actually computes the following:

loss = - (1 / number of output) * sum( y * log(predicted_y) )

This means that when y=1 we penalise the model by -log(predicted_y), but when y=0 we do not penalise the model at all. A better way to define our loss function could be:

loss = - (1 / number of output) * sum( y * log(predicted_y) + (1-y) * log (1 - predicted_y) )
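
For what it's worth, TensorFlow already ships this form of the loss as tf.nn.sigmoid_cross_entropy_with_logits. A minimal sketch, reusing tf_data, tf_labels, weights and biases from the sketch above (note the function expects the raw scores before the sigmoid, not the sigmoid outputs):

# Computes y * -log(sigmoid(x)) + (1 - y) * -log(1 - sigmoid(x)) for every
# output independently, i.e. exactly the per-label loss written above.
raw_scores = tf.matmul(tf_data, weights) + biases
loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=tf_labels, logits=raw_scores))

This treats the 10 outputs as independent one-vs-rest classifiers, whereas softmax_cross_entropy_with_logits normalises the scores across all 10 classes, so the two losses are not interchangeable even though both penalise wrong predictions.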