jina-ai/late-chunking

Late chunking gives poor results with a Vietnamese embedding model

Closed · 1 comment

Thank you for the great work! I have tried using late chunking with the Vietnamese Embedding Model dangvantuan/vietnamese-embedding-LongContext, but I've observed poor results compared to traditional chunking methods.

I've included my Colab notebook here. If you have some time, could you please take a look? Your feedback would be greatly appreciated!

Thank you!

As far as I can see, this model uses CLS pooling. Unfortunately, late chunking only works with models trained with mean pooling, because it builds each chunk embedding by averaging the token embeddings inside that chunk.
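
To make the difference concrete, here is a minimal sketch of the two pooling strategies and of the per-chunk averaging that late chunking relies on. It uses random NumPy arrays as a stand-in for the model's token embeddings, so nothing here is taken from the repository or from the Vietnamese model itself; the function names `cls_pool`, `mean_pool`, and `late_chunk` and the example spans are hypothetical.

```python
import numpy as np

def cls_pool(token_embeddings):
    """CLS pooling: use only the vector at position 0 (the [CLS] token)."""
    return token_embeddings[0]

def mean_pool(token_embeddings, attention_mask):
    """Mean pooling: average the vectors of all non-padding tokens."""
    mask = attention_mask[:, None].astype(token_embeddings.dtype)
    return (token_embeddings * mask).sum(axis=0) / mask.sum()

def late_chunk(token_embeddings, spans):
    """Late chunking sketch: embed the full text once, then mean-pool
    the token vectors inside each (start, end) token span."""
    return np.stack([token_embeddings[s:e].mean(axis=0) for s, e in spans])

# Stand-in for model output: 10 tokens, 4 dimensions (assumed sizes).
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(10, 4))
attention_mask = np.ones(10, dtype=np.int64)

# Hypothetical chunk boundaries in token positions (end is exclusive).
spans = [(1, 4), (4, 9)]
chunk_vectors = late_chunk(token_embeddings, spans)

doc_cls = cls_pool(token_embeddings)
doc_mean = mean_pool(token_embeddings, attention_mask)
```

A model trained with CLS pooling is only optimized to make the vector at position 0 a good text representation. The per-token vectors that `late_chunk` averages were never trained to be meaningful when averaged, which would be consistent with the poor results reported above.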