/tzlink-ncbi-disease-tracker

Track experimental results from disease normalization using the NCBI disease dataset

Primary Language: Python

Disease Normalization with NCBI Disease

Experimental progress over time.

| Date | Changes | Accuracy | Acc. reach. | Compare | Commit |
|---|---|---|---|---|---|
| 2018-06-01 10:25:44 | automatic result recording | 0.6645 | 0.9074 | diff | original |
| 2018-06-04 07:51:20 | no oracle in training | 0.7027 | 0.9140 | diff | original |
| 2018-06-04 08:18:19 | predict from 1000 candidates | 0.3037 | 0.3361 | diff | original |
| 2018-06-04 08:31:27 | train and predict with 1000 candidates | 0.1309 | 0.1449 | diff | original |
| 2018-06-05 13:57:01 | represent mentions with subword units | 0.6645 | 0.8645 | diff | original |
| 2018-06-07 14:45:37 | candidate generation: similarity of phrase vectors | 0.5044 | 0.7548 | diff | original |
| 2018-06-07 14:52:36 | subword-unit embeddings for candidate generation and CNN | 0.3062 | 0.6770 | diff | original |
| 2018-06-07 15:05:48 | both skip-gram overlap and phrase-embedding for candidate gen. | 0.5972 | 0.7165 | diff | original |
| 2018-06-08 10:31:56 | candidate generators provide a score for each candidate | 0.6874 | 0.8272 | diff | original |
| 2018-06-08 10:38:22 | switch back to using only skip-gram candidates | 0.6811 | 0.886 | diff | original |
| 2018-06-11 11:01:30 | new candidate generator based on actual cosine similarity | 0.7433 | 0.8616 | diff | original |
| 2018-06-11 11:20:25 | use candidates from cosine similarity only | 0.77 | 0.8951 | diff | original |
| 2018-06-13 09:28:32 | fix evaluation mistake: "\|"-separated reference IDs are composite, not alternatives | 0.756 | 0.9043 | diff | original |
| 2018-06-20 10:21:15 | use both word and subword-units | 0.7395 | 0.8845 | diff | original |
| 2018-06-20 10:26:55 | use subword-unit embeddings only | 0.756 | 0.9043 | diff | original |
| 2018-06-20 10:31:31 | trainable subword-unit embeddings | 0.756 | 0.9043 | diff | original |
| 2018-07-28 20:24:58 | more comprehensive word embeddings | 0.7713 | 0.9225 | diff | original |
| 2018-07-31 07:38:04 | use stemmed embeddings for cand. gen. | 0.7827 | 0.9419 | diff | original |
| 2018-07-31 08:07:39 | cand. gen. with stemmed emb. + s-gram cos. | 0.7789 | 0.8975 | diff | original |
| 2018-07-31 10:39:34 | adam optimizer | 0.7891 | 0.8974 | diff | original |
| 2018-07-31 12:08:43 | add an input node for token overlap | 0.8005 | 0.9104 | diff | original |
| 2018-08-06 08:59:47 | hyperonym cand. gen. | 0.8145 | 0.9105 | diff | original |
| 2018-08-06 15:51:28 | abbreviation cand. gen. | 0.8348 | 0.9075 | diff | original |
| 2018-08-07 09:48:01 | omit rank_score, use bare cand-gen scores | 0.8145 | 0.8854 | diff | original |
| 2018-08-09 08:53:52 | allow multiple filter widths in the convolution | 0.8208 | 0.8923 | diff | original |
| 2019-01-27 11:01:45 | classification with vocabulary pretraining | 0.859 | -- | diff | original |
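
The two score columns can be illustrated with a minimal sketch. It is not the tracker's actual evaluation code, and it rests on two assumptions: that "Acc. reach." means reachable accuracy (the share of mentions whose reference ID is among the generated candidates, an upper bound for the ranker), and that a "|"-separated reference is one composite label that must be matched as a whole, as the 2018-06-13 entry describes. All function names and IDs below are hypothetical.

```python
from typing import FrozenSet, Sequence


def parse_ids(label: str) -> FrozenSet[str]:
    """Read a "|"-separated label as ONE composite label (an unordered ID set)."""
    return frozenset(part.strip() for part in label.split("|") if part.strip())


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Share of mentions whose prediction equals the full composite reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references differ in length")
    if not references:
        return 0.0
    hits = sum(parse_ids(p) == parse_ids(r)
               for p, r in zip(predictions, references))
    return hits / len(references)


def reachable_accuracy(candidates: Sequence[Sequence[str]],
                       references: Sequence[str]) -> float:
    """Share of mentions whose reference is among the candidates (assumed meaning)."""
    if len(candidates) != len(references):
        raise ValueError("candidates and references differ in length")
    if not references:
        return 0.0
    hits = sum(parse_ids(r) in {parse_ids(c) for c in cands}
               for cands, r in zip(candidates, references))
    return hits / len(references)


# Made-up IDs for illustration only; these are not taken from the dataset.
references = ["D001", "D002|D003", "D004"]
predictions = ["D001", "D002", "D004"]
candidates = [["D001", "D009"], ["D003|D002", "D002"], ["D005"]]

acc = accuracy(predictions, references)
reach = reachable_accuracy(candidates, references)
```

In this toy example the second prediction, `D002`, counts as an error because the reference is the composite `D002|D003`. Scoring the parts as alternatives would have accepted it, which is the kind of mistake the 2018-06-13 entry reports fixing.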