Issues
How to extract the Token Embedding
#37 opened by biniyoni - 2
Word Splitting
#29 opened by chapmanjacobd - 4
Doubts about utility to multilingual models
#30 opened by TheMrguiller - 2
Matryoshka Representations Evaluation
#32 opened by KyleSmith19091 - 3
Feature / Add Semantic Splitting
#19 opened by dleemiller - 2
tokenizer = Tokenizer.from_file(str(tokenizer_path)) Exception: data did not match any variant of untagged enum PyNormalizerTypeWrapper at line 49 column 3
#16 opened by gfkdliucheng - 4
An example of using WordLlama for a RAG pipeline
#25 opened by dinhanhx - 8
wl.embed, wl.cluster high RAM usage
#17 opened by chapmanjacobd - 3
How do you really create a WordLlama model?
#20 opened by dinhanhx - 1
The example does not work
#21 opened by tumikosha - 4
Gradio Demo
#12 opened by amrrs - 3
ModuleNotFoundError: No module named 'wordllama.algorithms.kmeans_helpers'
#13 opened by chapmanjacobd - 3
First README example fails
#9 opened by cpa