Late chunking gives poor results with a Vietnamese embedding model

Question

Thank you for the great work! I tried late chunking with the Vietnamese embedding model dangvantuan/vietnamese-embedding-LongContext, but the results are noticeably worse than with traditional chunking, where each chunk is embedded on its own.

I've included my Colab notebook here. If you have some time, could you please take a look? Your feedback would be greatly appreciated!

Thank you!

Answer 1 · 2024-10-05T12:36:04.000Z

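As far as I can see, this model uses CLS pooling. Unfortunately, late chunking only works with models that are trained with mean pooling. Late chunking builds each chunk embedding by averaging the token embeddings inside that chunk's span, and a model trained with CLS pooling is only optimized to make the [CLS] token's embedding meaningful, so averages over its other token embeddings are not reliable chunk representations.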
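
To make the pooling step concrete, here is a minimal sketch of what late chunking does after the model has encoded the whole document once. It is not the library's actual implementation: `late_chunk_pool`, `token_embeddings`, and `spans` are hypothetical names, and the toy vectors stand in for real per-token model outputs.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def late_chunk_pool(
    token_embeddings: Sequence[Sequence[float]],
    spans: Sequence[Tuple[int, int]],
) -> List[Vector]:
    """Mean-pool token embeddings over each [start, end) token span.

    Hypothetical helper: token_embeddings is one vector per token from a
    single forward pass over the full document; spans are chunk boundaries
    expressed as token indices.
    """
    chunk_vectors: List[Vector] = []
    for start, end in spans:
        if not 0 <= start < end <= len(token_embeddings):
            raise ValueError(f"invalid span ({start}, {end})")
        window = token_embeddings[start:end]
        dim = len(window[0])
        # Average each dimension across the tokens in this chunk.
        chunk_vectors.append(
            [sum(tok[i] for tok in window) / len(window) for i in range(dim)]
        )
    return chunk_vectors


# Toy data (assumed, not real model output): 4 tokens, 2 dimensions.
token_embeddings = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 2.0]]
spans = [(0, 2), (2, 4)]  # two chunks of two tokens each
chunks = late_chunk_pool(token_embeddings, spans)
print(chunks)
```

Every chunk vector here is an average of ordinary token embeddings, which is the same operation a mean-pooling model is trained on. A CLS-pooled model never receives a training signal on those averages, which is likely why the results in the notebook degrade.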