appeler/ethnicolr

Why is training mode set to True for prediction?

mocherson opened this issue · 6 comments

pdf = pdf.append(pd.DataFrame(cls.model(X, training=True)))

Why is training=True set here? This call is used for prediction.

Can you say more? We may be missing something.

The function is for prediction.

"To achieve this in keras, we have to use the functional API and setup dropout this way: Dropout(p)(input_tensor, training=True)"

https://www.depends-on-the-definition.com/model-uncertainty-in-deep-learning-with-monte-carlo-dropout/

@soodoku Thanks for your reply. In the link you provided, they call mc_model.predict, which makes sense because predict automatically runs the model in evaluation mode. They set training=True explicitly on the dropout layer, which keeps dropout random even in evaluation mode.

model(X, training=True) works for your purpose of getting different outputs for a fixed input, whereas model.predict returns a constant output. The disadvantage is that model(X, training=True) puts the whole model in training mode, where the model will do more computation and use more memory than in evaluation mode. It would be better to also provide a function that returns a constant output using model.predict.
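To make the difference concrete, here is a minimal pure-Python sketch of how a dropout layer behaves under the two modes. It is not Keras or ethnicolr code: `dropout`, `toy_model`, the weights, and the 0.5 rate are all made up for illustration, and the only assumption is the usual "inverted dropout" rule (kept units are scaled by 1 / keep probability).

```python
import random


def dropout(values, rate, training, rng):
    """Inverted dropout: a no-op at inference, random masking when training."""
    if not training:
        return list(values)
    keep = 1.0 - rate
    # Each unit is kept with probability `keep` and rescaled, otherwise zeroed.
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def toy_model(x, training, rng):
    """Hypothetical one-layer model: dropout followed by a fixed weighted sum."""
    weights = [0.5, -1.0, 2.0, 0.25]
    hidden = dropout(x, rate=0.5, training=training, rng=rng)
    return sum(w * h for w, h in zip(weights, hidden))


rng = random.Random(0)
x = [1.0, 2.0, 3.0, 4.0]

# Inference mode: dropout is bypassed, so repeated calls agree.
eval_outputs = [toy_model(x, training=False, rng=rng) for _ in range(5)]

# Training mode: a fresh mask is drawn on every call, so outputs vary.
mc_outputs = [toy_model(x, training=True, rng=rng) for _ in range(200)]

print("distinct eval outputs:", len(set(eval_outputs)))
print("distinct MC outputs:  ", len(set(mc_outputs)))
```

The spread of `mc_outputs` is what a dropout-based uncertainty estimate summarizes; the single value in `eval_outputs` is what a plain prediction returns.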

@mocherson: can you propose a way to get the dropout-based uncertainty estimates that we provide without training=True?
Or is your point that we should provide an option to skip the uncertainty estimates and simply return the predictions?

@soodoku I cannot figure out a better way to get dropout-based uncertainty estimates without training=True, except to rebuild the model and set training=True explicitly in the dropout layer. But it is easier to provide an option to skip the uncertainty estimates and simply get the predictions (see below). The latter is the more common use case.

def pred(
    df, newnamecol, cls, VOCAB, RACE, MODEL, NGRAMS, maxlen, num_iter, conf_int):

    df[newnamecol] = df[newnamecol].str.strip().str.title()

    if cls.model is None:
        vdf = pd.read_csv(VOCAB)
        cls.vocab = vdf.vocab.tolist()

        rdf = pd.read_csv(RACE)
        cls.race = rdf.race.tolist()

        cls.model = load_model(MODEL)

    # build X from index of n-gram sequence
    X = np.array(df[newnamecol].apply(lambda c: find_ngrams(cls.vocab,
                                                            c, NGRAMS)))
    X = sequence.pad_sequences(X, maxlen=maxlen)

    # Predict with a single deterministic forward pass (dropout disabled)
    pdf = pd.DataFrame(cls.model(X, training=False).numpy(), columns=cls.race)
    print(cls.race)

    final_df = pd.concat([df.reset_index(drop=True), pdf.reset_index(drop=True)], axis=1)

    return final_df
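Both behaviours discussed in this thread could in principle sit behind one flag. The sketch below is only an illustration of that idea, not ethnicolr's actual implementation: `predict_with_option`, `summarize_passes`, `toy_forward`, and the `uncertainty` flag are hypothetical names, and the percentile-based bounds are one assumed way of turning `num_iter` stochastic passes and a `conf_int` level into an interval.

```python
import random
import statistics


def summarize_passes(samples, conf_int=0.9):
    """Reduce repeated stochastic outputs for one input to a mean and bounds."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - conf_int) / 2.0
    lo_idx = int(tail * (n - 1))
    hi_idx = int(round((1.0 - tail) * (n - 1)))
    return {
        "mean": statistics.fmean(ordered),
        "lb": ordered[lo_idx],
        "ub": ordered[hi_idx],
    }


def predict_with_option(forward, x, num_iter=100, conf_int=0.9, uncertainty=False):
    """Return a point prediction, or mean and bounds when uncertainty=True."""
    if not uncertainty:
        # One deterministic pass, analogous to model(X, training=False).
        return {"mean": forward(x, training=False)}
    # Repeated stochastic passes, analogous to model(X, training=True).
    samples = [forward(x, training=True) for _ in range(num_iter)]
    return summarize_passes(samples, conf_int)


rng = random.Random(42)


def toy_forward(x, training):
    """Hypothetical stand-in for a model call: noisy only when training=True."""
    score = 0.1 * x
    if training:
        score += rng.uniform(-0.05, 0.05)
    return score


point = predict_with_option(toy_forward, 5.0)
interval = predict_with_option(toy_forward, 5.0, num_iter=200, uncertainty=True)
print(point)
print(interval)
```

Defaulting `uncertainty` to False matches the more common use case described above, while callers who want the interval pay for the extra passes explicitly.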

@mocherson: do you want to open a PR? That would be awesome.

@mocherson, please check out the latest version (0.9.x); you will not get the uncertainty estimates by default. Thanks for your feedback.