`model_params` do not seem to work
JBGruber opened this issue · 0 comments
JBGruber commented
Turning off the GPU should significantly slow down the answering, but it doesn't:
library(rollama)
res <- bench::mark(
  cpu = query("why is the sky blue?",
              model_params = list(num_gpu = 0)),
gpu = query("why is the sky blue?"),
check = FALSE
)
summary(res)
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 cpu 4.9s 4.9s 0.204 4.98MB 0
#> 2 gpu 6.09s 6.09s 0.164 545.33KB 0.164
Created on 2024-01-23 with reprex v2.0.2
Using
curl http://localhost:11434/api/generate -d '{
"model": "llama2",
"prompt": "Why is the sky blue?",
"stream": false,
"options": {
"seed": 42,
"num_gpu": 0
}
}'
the time ollama takes goes up to about 50 s, which makes more sense. I assume the parameters are not translated to JSON correctly.
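One plausible failure mode, sketched here with jsonlite (this is an assumption about rollama's internals, not its actual code): if `model_params` get merged into the top level of the request body instead of being nested under `"options"`, the Ollama server accepts the request but silently ignores the parameters, which would match the benchmark above.

```r
library(jsonlite)

# What the Ollama /api/generate endpoint expects:
# generation parameters nested under "options"
body <- list(
  model   = "llama2",
  prompt  = "why is the sky blue?",
  stream  = FALSE,
  options = list(num_gpu = 0)  # model_params must end up here
)
json <- toJSON(body, auto_unbox = TRUE)

# A flat body like {"model": ..., "num_gpu": 0} is still valid JSON,
# so no error is raised, and the GPU setting is simply ignored.
```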