Does the generated latency include the embedding lookup table and the last output layers?

Question

leo038 opened this issue 3 years ago · 2 comments

According to the code, the generated latency should include the embedding lookup table and the last output layers. However, I ran into a problem: I trained a latency predictor, and it is very accurate. I then ran the evolutionary search with a hardware latency constraint of 200 ms. After training the resulting subTransformer, I measured its latency at 270 ms, which is much higher than the predicted latency. Why does this happen?

Answer 1 · 2021-11-07T02:18:39.000Z

Hi leo038,

Thanks for your question! Yes, it includes the embedding lookup and the last layer. I can think of several possible reasons for the gap:

The measured latency should be averaged over many runs to reduce variance.
The latency dataset may contain too few samples, so it does not cover a wide enough range of subTransformer architectures.
Did you split the data into separate train, validation, and test sets when training the latency predictor, and is the high accuracy you observe measured on the test set?
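
For the first point, here is a minimal sketch of averaging a latency measurement over repeated runs. It is not the repository's own measurement code: `measure_latency_ms` and `dummy_inference` are hypothetical names used only for illustration, and the workload is a stand-in for one forward pass of the trained subTransformer.

```python
import statistics
import time


def measure_latency_ms(fn, warmup=10, runs=50):
    """Return (mean, stdev) of the wall-clock latency of fn() in milliseconds.

    Hypothetical helper for illustration; not part of any library.
    """
    # Warm-up runs absorb one-time costs such as caching and lazy initialization.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        # Note: on a GPU, the device must be synchronized before reading the
        # clock, otherwise only the kernel launch time is measured.
        samples.append((time.perf_counter() - start) * 1000.0)

    return statistics.mean(samples), statistics.stdev(samples)


def dummy_inference():
    # Stand-in workload (assumption); replace with one forward pass of the model.
    sum(i * i for i in range(10_000))


mean_ms, std_ms = measure_latency_ms(dummy_inference)
print(f"latency: {mean_ms:.3f} ms +/- {std_ms:.3f} ms")
```

If the standard deviation is a large fraction of the mean, a single measurement such as 270 ms may not be representative, and it is worth comparing the averaged value against the predictor's output instead.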

Best,
Hanrui

Answer 2 · 2021-12-07T06:22:03.000Z

Hi leo038,

I will close the issue for now. Feel free to reopen if you have any further questions!

Best,
Hanrui