Fundamental Mistake
I think I made some fundamental mistake in my code. No matter what technique I use to train my model, it always gets stuck at 10% accuracy. It could be because I messed up the way I extract the data into matrices in images_to_matrices.py.
My intuition tells me that even simpler techniques such as logistic regression and multinomial logistic regression should give us something better than 10%. So before we test something more advanced like a convolutional neural network, we should get those easier methods working first, or at least above 10%.
I am pretty sure we don't have much of a problem in the algorithm part, because I basically copied and pasted from the TensorFlow course.
I found one small but significant error in my multinomial_logistic_regression.py.
Currently, I define my model as:
def model(data, weights, biases):
    # Plain linear model: returns raw scores with no squashing
    return tf.matmul(data, weights) + biases
But that model is actually more suitable for a regression problem, not logistic regression. So to fix this problem, my model should be defined as follows:
def model(data, weights, biases):
    # Squash the linear scores through a sigmoid so each output lies in (0, 1)
    return tf.sigmoid(tf.matmul(data, weights) + biases)
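To make that concrete, here is roughly how I would wire the revised model into the script. The shapes (28x28 flattened images, 10 classes) are my assumptions, not the actual values from images_to_matrices.py:

import tensorflow as tf

num_features = 28 * 28   # assumed flattened image size
num_labels = 10          # assumed number of classes

# Placeholders for a batch of flattened images and their one-hot labels
tf_data = tf.placeholder(tf.float32, shape=(None, num_features))
tf_labels = tf.placeholder(tf.float32, shape=(None, num_labels))

# Trainable parameters
weights = tf.Variable(tf.truncated_normal([num_features, num_labels]))
biases = tf.Variable(tf.zeros([num_labels]))

def model(data, weights, biases):
    # Linear scores squashed through a sigmoid, as in the fix above
    return tf.sigmoid(tf.matmul(data, weights) + biases)

prediction = model(tf_data, weights, biases)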
I also found out that AdamOptimizer converges to the minimum loss and maximum accuracy much faster than GradientDescentOptimizer, but I still don't understand how the Adam optimizer works.
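For reference, swapping between the two optimizers is a one-line change. A rough sketch (the toy loss and the learning rates are placeholders, not the values from my script):

import tensorflow as tf

# Toy scalar loss so the snippet stands alone; in the real script this would be
# the cross-entropy loss computed over the training batch.
w = tf.Variable(3.0)
loss = tf.square(w - 1.0)

# Plain gradient descent: one fixed learning rate for every parameter
sgd_step = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(loss)

# Adam: adapts the effective step size per parameter using running estimates of
# the gradient's first and second moments, which is why it often converges faster
adam_step = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)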
If you examine the tf.nn.softmax_cross_entropy_with_logits function, it actually computes the following:
loss = - (1 / number of outputs) * sum( y * log(predicted_y) )
This means that when y = 1 we penalise the model with log(predicted_y), but when y = 0 we do not penalise the model at all. A better way to define our loss function could be:
loss = - (1 / number of outputs) * sum( y * log(predicted_y) + (1 - y) * log(1 - predicted_y) )
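Here is a rough sketch of what that loss could look like in TensorFlow, using a made-up two-example batch with one-hot labels. The hand-written version follows the formula above; tf.nn.sigmoid_cross_entropy_with_logits computes essentially the same quantity from the raw logits in a numerically safer way:

import tensorflow as tf

# Made-up batch: 2 examples, 3 classes, one-hot labels
y = tf.constant([[1., 0., 0.],
                 [0., 1., 0.]])
logits = tf.constant([[2.0, -1.0, 0.5],
                      [0.3,  1.5, -0.7]])
predicted_y = tf.sigmoid(logits)

# Hand-written version of the proposed loss:
# penalise log(predicted_y) where y = 1 and log(1 - predicted_y) where y = 0
manual_loss = -tf.reduce_mean(
    y * tf.log(predicted_y) + (1. - y) * tf.log(1. - predicted_y))

# Built-in equivalent that works directly on the logits
builtin_loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits))

with tf.Session() as sess:
    print(sess.run([manual_loss, builtin_loss]))  # the two values should match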