How to use GPU to accelerate?
Recycle1 commented
When I try to run the sample code on CUDA, I get a warning, and the result is quite different from the CPU result. What should I do?
output:
D:\python_about\pre-trained_models\modules\transformers_modules\vikhyatk\moondream2\92d3d73b6fd61ab84d9fe093a9c7fd8c04bf2c0d\vision_encoder.py:71: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:555.)
x = F.scaled_dot_product_attention(q, k, v)
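
One way to narrow this down is to check whether the attention call named in the warning actually produces different numbers on the GPU. As far as I understand, the warning only says that this PyTorch build lacks the flash attention kernel, so `F.scaled_dot_product_attention` falls back to another implementation; that would normally affect speed rather than correctness. The sketch below is not the moondream2 code: the helper name `sdpa_max_abs_diff`, the tensor shapes, and the seed are hypothetical, chosen only to compare the same call on CPU and on the selected device.

```python
import torch
import torch.nn.functional as F


def sdpa_max_abs_diff(device: str, dtype: torch.dtype = torch.float32, seed: int = 0) -> float:
    """Run the same attention call on CPU and on `device`; return the largest element-wise gap."""
    gen = torch.Generator().manual_seed(seed)
    # Hypothetical shapes: (batch, heads, tokens, head_dim)
    q, k, v = (torch.randn(1, 4, 16, 32, generator=gen) for _ in range(3))

    # float32 result on CPU, used as the reference
    ref = F.scaled_dot_product_attention(q, k, v)

    # Same inputs, moved to the target device and dtype
    out = F.scaled_dot_product_attention(
        q.to(device, dtype), k.to(device, dtype), v.to(device, dtype)
    )
    return (out.float().cpu() - ref).abs().max().item()


# Falls back to CPU when no CUDA device is visible, so the script still runs
device = "cuda" if torch.cuda.is_available() else "cpu"
diff = sdpa_max_abs_diff(device)
print(f"device={device}, max abs diff vs CPU float32: {diff:.2e}")
```

If the gap printed here is tiny, the warning is probably not what changes the output. In that case it may be worth checking which dtype the model is loaded with on the GPU (passing `torch.float16` as `dtype` to the helper shows how much half precision alone shifts the numbers) and whether generation uses sampling, since either could plausibly explain a different answer.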