I have a GPU and expect the model to run faster, but your code only runs on CPU? How do I change that?
alexhmyang opened this issue
alexhmyang commented
As for this line in the Requirements section of the README:

GPU is not used and is not required.

I have a GPU and expect the model to run faster, but your code only runs on CPU? How do I change that?
wafflecomposite commented
Until I make some updates, check out this fork:
https://github.com/sebaxzero/LangChain_PDFChat_Oobabooga
I haven't tried it myself, but it looks like it should be able to use the GPU.
LebToki commented
You could try this approach! I haven't tested it yet, though.

GPU-Based Approach
from langchain.llms import LlamaCpp
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

# Stream generated tokens to stdout
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

llm = LlamaCpp(
    model_path="./models/stable-vicuna-13B.ggmlv3.q8_0.bin",
    stop=["### Human:"],
    callback_manager=callback_manager,
    verbose=True,
    n_ctx=2048,
    n_batch=512,
    # LlamaCpp has no `device` argument; GPU offload is controlled by
    # n_gpu_layers. 40 covers all layers of a 13B model; lower it if
    # you run out of VRAM.
    n_gpu_layers=40,
)
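One caveat with the sketch above: n_gpu_layers only takes effect if llama-cpp-python itself was built with GPU (cuBLAS) support, e.g. installed with CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python. As a quick smoke test (the prompt below is made up, and it assumes the model path above is valid), you can call the LLM directly and watch the load log for an "offloading ... layers to GPU" message (exact wording varies by version):

# Hypothetical smoke test: with GPU offload active, llama.cpp reports how many
# layers were offloaded while loading the model, before printing the reply.
response = llm("### Human: What does n_gpu_layers control?\n### Assistant:")
print(response)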