wafflecomposite/langchain-ask-pdf-local

I have a GPU and expect the model to run faster, but your code only runs on the CPU. How can I change it?

alexhmyang opened this issue · 2 comments

Regarding this part of the README:

Requirements
GPU is not used and is not required.

I have a GPU and expect the model to run faster, but your code only runs on the CPU. How can I change it?

Until I make some updates, check out this fork
https://github.com/sebaxzero/LangChain_PDFChat_Oobabooga

I haven't tried it myself, but it looks like it should be able to utilize the GPU.

You could try the approach below; I haven't tested it yet, though.

GPU Based Approach

from langchain.llms import LlamaCpp
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

# Stream generated tokens to stdout
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

llm = LlamaCpp(
    model_path="./models/stable-vicuna-13B.ggmlv3.q8_0.bin",
    stop=["### Human:"],
    callback_manager=callback_manager,
    verbose=True,
    n_ctx=2048,
    n_batch=512,
    n_gpu_layers=40,  # LlamaCpp has no "device" argument; GPU use is controlled by offloading layers
)
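
Note that n_gpu_layers only has an effect if llama-cpp-python was installed with GPU (cuBLAS) support; the default CPU-only build will just ignore it. A quick, hypothetical smoke test (the prompt is only an illustration matching the stop token above):

# If offloading works, the verbose llama.cpp startup log should mention
# layers being offloaded to the GPU, and generation should be noticeably
# faster than a CPU-only run with the same model.
response = llm("### Human: Say hello in one sentence.\n### Assistant:")
print(response)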