Post-processing : promote frequent variants as terms
Closed this issue · 2 comments
dcram commented
Reduce the number of variants for terms to max 10-15 variants by promoting frequent variants as terms.
dcram commented
Also in german, there are numerous size-3 words in top-10. Investigate why.
dcram commented
Post processing has been refactored, but instead of promoting frequent variants as terms, its 2-order variants are not displayed in variant bag, and a [+]
label is appended to the V
label in TSV so as to indicate it has variants.
T nn: wind turbine
V[S][+] nnn: horizontal-axis wind turbine
V[S] nnn: wind turbine rotor
V[S] nnn: wind turbine application
V[S] ann: offshore wind turbine
V[S] nnn: wind turbine sound
V[S][+] nnn: wind turbine blade
V[S] nnn: dfig wind turbine
V[S] annn: variable speed wind turbine
V[S][+] nnn: wind turbine noise
V[S][+] nnn: wind turbine concept