alexeygrigorev/namespacediscovery-pipeline

Make tokenization configurable

Closed this issue · 1 comment

Provide a mechanism to switch tokenization off, so that the full description, e.g. "Cauchy stress tensor", is used as a single term.

Added another way of incorporating definitions into the vector space: "FULL".

With stemming switched on, if the definition for $X$ is "linear combination" and the definition for $k$ is "random vector", the corresponding terms in this vector space look like X_linear combin and k_random vector.
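
A minimal sketch of how such terms could be built, to make the format concrete. This is not the pipeline's actual code: `full_token` and `toy_stem` are hypothetical names, and `toy_stem` is a toy suffix stripper standing in for whatever real stemmer the pipeline uses.

```python
def toy_stem(word):
    """Toy stand-in for a real stemmer (assumption): strips a few suffixes."""
    for suffix in ("ation", "ing", "s"):
        # Only strip when a reasonably long stem remains.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def full_token(identifier, definition, stem=True):
    """Join an identifier with its whole definition into one term.

    The definition is kept as a single unit instead of being split into
    one term per word; each word is optionally stemmed first.
    """
    words = definition.split()
    if stem:
        words = [toy_stem(w) for w in words]
    return identifier + "_" + " ".join(words)


tokens = [
    full_token("X", "linear combination"),
    full_token("k", "random vector"),
]
print(tokens)
```

With `stem=False` the definition is kept verbatim, which corresponds to using the full description as originally requested.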

To use it, set isv_type to full in luigi.cfg.
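
For illustration only, here is roughly how such a setting could be read with Python's standard configparser. The section name `[namespaces]` is an assumption made for this sketch; check luigi.cfg in the repository for the section the pipeline actually uses.

```python
import configparser

# Hypothetical excerpt of luigi.cfg; the section name is an assumption.
SAMPLE_CFG = """
[namespaces]
isv_type = full
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CFG)

# fallback=None avoids an exception when the option is missing.
isv_type = parser.get("namespaces", "isv_type", fallback=None)
use_full = isv_type == "full"
print(isv_type, use_full)
```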