Check plotEmbedding after `dsig.create_features`
Closed this issue · 1 comments
kasra-hosseini commented
In the notebook, after encoding the text data, we can plot the embeddings. The result looks reasonable, since we have three classes. Strictly speaking there are four classes, but we map them onto three:
"label": {
    "economy": 2,
    "obama": 1,
    "microsoft": 0,
    "palestine": 0
}
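To make the four-to-three mapping concrete, here is a minimal sketch in plain Python. The dictionary values are copied from the mapping above; the `topics` list is a made-up example for illustration and is not data from the notebook.

```python
# Label mapping copied from the issue: four topics collapse into three classes,
# because "microsoft" and "palestine" share class 0.
label_map = {"economy": 2, "obama": 1, "microsoft": 0, "palestine": 0}

# Hypothetical topic column, for illustration only (not from the notebook).
topics = ["economy", "obama", "microsoft", "palestine", "obama"]

# Map each topic name to its integer class.
labels = [label_map[t] for t in topics]

# Number of distinct classes after the mapping.
n_classes = len(set(label_map.values()))

print(labels)     # [2, 1, 0, 0, 1]
print(n_classes)  # 3
```

So a plot coloured by `labels` should show at most three groups, even though the raw data has four topics.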
However, after time injection and calling
x_data = dsig.create_features(path, sig_combined, last_index_dt_all, bert_embeddings, time_feature)
the results look very different (see the notebook). Am I missing anything?
rchan26 commented
Interesting point from a meeting with @kasra-hosseini: one possible reason is that in this example we are adding random timestamps / time IDs, which could explain why the results look strange and do not cluster very well.
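A rough way to sanity-check this hypothesis without the notebook is a toy experiment on synthetic data. The sketch below does not call `dsig.create_features` and makes no claim about what it does internally. It only assumes that random, class-independent time features end up alongside the embedding features. The helper `separation_score`, the cluster sizes, and the scale of the random time values are all assumptions chosen for illustration.

```python
import numpy as np


def separation_score(x, labels):
    """Mean distance between class centroids divided by the mean distance
    of points to their own class centroid (higher = better separated)."""
    classes = np.unique(labels)
    centroids = np.stack([x[labels == c].mean(axis=0) for c in classes])
    # Average spread of each class around its own centroid.
    within = np.mean([
        np.linalg.norm(x[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(classes)
    ])
    # Average pairwise distance between distinct centroids.
    pairwise = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    between = pairwise[np.triu_indices(len(classes), k=1)].mean()
    return between / within


rng = np.random.default_rng(0)

# Synthetic stand-in for the embeddings: three well-separated classes.
n_per_class, dim = 100, 8
labels = np.repeat(np.arange(3), n_per_class)
centres = rng.normal(scale=5.0, size=(3, dim))
embeddings = centres[labels] + rng.normal(scale=1.0, size=(labels.size, dim))

# Random "time" features that carry no class information (assumed scale).
random_time = rng.uniform(0.0, 50.0, size=(labels.size, 2))
with_time = np.hstack([embeddings, random_time])

score_before = separation_score(embeddings, labels)
score_after = separation_score(with_time, labels)

print(f"separation without random time features: {score_before:.2f}")
print(f"separation with random time features:    {score_after:.2f}")
```

In this toy setup the score drops once the random columns are appended, because they add within-class spread without adding any between-class signal. Whether the same effect explains the notebook plots would need checking with real (non-random) timestamps.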