RuntimeError: copy_if failed to synchronize: cudaErrorAssert: device-side assert triggered

Question

AndDoIt opened this issue 3 years ago · 4 comments

While training on my Chinese dataset, the following error always occurs. Could you please help me solve it?

Answer 1 · 2022-01-03T16:37:51.000Z

The affected code line extracts the embedding of the classifier token ([CLS]). The [CLS] token is added during dataset loading (see input_reader.py). I'm not sure why the error occurs with your dataset. Could your dataset JSON file already include the [CLS] token? And are you sure that no other character (or subword) in your vocabulary is mapped to the same ID as [CLS]?
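The two checks suggested above can be sketched in plain Python. This is only an illustration under assumptions: the helper names count_cls and tokens_sharing_id are hypothetical (they are not part of the repository), the [CLS] ID is assumed to be 101 as in the standard BERT vocabularies, and the toy vocabulary and encoding are made up.

```python
# Assumption: [CLS] has ID 101, as in the standard BERT vocabularies.
CLS_ID = 101


def count_cls(doc_encoding, cls_id=CLS_ID):
    """Return how many times the [CLS] ID occurs in one encoded document."""
    return list(doc_encoding).count(cls_id)


def tokens_sharing_id(vocab, token="[CLS]"):
    """Return all other tokens in a token -> ID mapping that share the ID of `token`."""
    target = vocab[token]
    return sorted(t for t, i in vocab.items() if i == target and t != token)


# Toy data for illustration only; the IDs of the two characters are invented.
toy_vocab = {"[PAD]": 0, "[CLS]": 101, "[SEP]": 102, "中": 5, "文": 6}
toy_encoding = [101, 5, 6, 102]

print(count_cls(toy_encoding))       # expected to be exactly 1 per document
print(tokens_sharing_id(toy_vocab))  # expected to be empty
```

A count other than 1, or a non-empty list from the second helper, would point to the dataset or the vocabulary as the cause.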

Answer 2 · 2022-01-04T15:05:51.000Z

Thank you very much for your reply!
Following your suggestion, I checked the _parse_tokens function in input_reader.py with doc_encoding.count(101), and there is only one [CLS] token. My dataset has 35 relation types and 25 entity types, and the line that actually caused the error is self.size_embeddings = nn.Embedding(100, size_embedding). I changed 100 to my max_sent_len of 200, and that solved the problem.
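For readers hitting the same assert: nn.Embedding(100, size_embedding) only accepts indices 0 to 99, and an out-of-range index on the GPU surfaces as a device-side assert rather than a readable error. The following is a minimal sketch, assuming the size embedding is indexed by span size; the helper names out_of_range_indices and required_rows are hypothetical and the span sizes are made up.

```python
def out_of_range_indices(indices, num_embeddings):
    """Return the distinct indices that an embedding table with
    `num_embeddings` rows cannot look up (valid range: 0 .. num_embeddings - 1)."""
    return sorted({i for i in indices if not 0 <= i < num_embeddings})


def required_rows(indices):
    """Return the smallest table size that covers every index."""
    return max(indices) + 1


# Made-up span sizes; long sentences can produce sizes of 100 or more.
span_sizes = [1, 7, 99, 100, 150]

print(out_of_range_indices(span_sizes, 100))  # sizes a 100-row table rejects
print(out_of_range_indices(span_sizes, 200))  # empty once the table has 200 rows
print(required_rows(span_sizes))              # minimum number of rows needed
```

Running a check like this on the data before training, or reproducing the failure on the CPU, usually gives a clearer message than the CUDA assert. Note that a table with 200 rows still only accepts indices up to 199.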

Answer 3 · 2022-01-05T14:05:35.000Z

Okay, thank you!

Answer 4 · 2022-01-13T11:07:45.000Z

I'll leave this issue open until I have added the maximum size embedding count as a configuration parameter.