intel-analytics/ipex-llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max). A PyTorch LLM library that seamlessly integrates with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, etc.
Python · Apache-2.0
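The description above advertises seamless HuggingFace integration with low-bit quantization. A minimal sketch of how loading a checkpoint with ipex-llm's 4-bit weight quantization typically looks (the model path and the `"xpu"` device move are illustrative assumptions, not taken from this page):

```python
# Minimal sketch, assuming ipex-llm's drop-in HuggingFace-style API
# (ipex_llm.transformers.AutoModelForCausalLM with load_in_4bit=True).
# The checkpoint path is a placeholder for any HF causal-LM checkpoint.

def load_quantized(model_path: str):
    """Load a checkpoint with ipex-llm 4-bit weight-only quantization."""
    # Imports are deferred so the sketch can be inspected on a machine
    # without ipex-llm installed; they execute at call time.
    from ipex_llm.transformers import AutoModelForCausalLM
    from transformers import AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        load_in_4bit=True,       # INT4 weight-only quantization on load
        trust_remote_code=True,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    # ipex-llm also provides model.save_low_bit(path) /
    # AutoModelForCausalLM.load_low_bit(path) to persist the quantized
    # weights (the subject of issue #11058 below).
    return model, tokenizer


if __name__ == "__main__":
    model, tokenizer = load_quantized("meta-llama/Llama-2-7b-chat-hf")
    model = model.to("xpu")  # move to an Intel GPU; omit for CPU-only inference
```

On CPU-only machines the `to("xpu")` step is simply skipped; the quantized model runs through the usual `model.generate(...)` path.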
Issues
Unable to save quantized model
#11058 opened by vmadananth - 9
Docker on Windows vllm serving issue
#11029 opened by ktjylsj - 0
vLLM offline_inference.py failed to run on CPU inference
#11056 opened by eugeooi - 1
Not able to profile LLAMA2 on iGFX (Windows)
#10956 opened by vmadananth - 1
all-in-one tool for chatglm3-6b: 2nd-token latency at batch size 1 is larger than at batch size 2
#10992 opened by Fred-cell - 2
ipex-llm version 0510 has a regression compared to 0430, especially for BS=16/32 and 8k input
#10994 opened by Fred-cell - 3
all-in-one benchmark with Baichuan2-13B OOM
#11005 opened by kevin-t-tang - 1
Weights of LlamaForCausalLM were not initialized from the model checkpoint at meta-llama/Meta-Llama-3-8B-Instruct?
#11052 opened by JamieVC - 1
MTL Windows Qwen-VL AttributeError: 'QWenAttention' object has no attribute 'position_ids'
#11006 opened by juan-OY - 0
ChatGLM run error on MTL iGPU
#11012 opened by aitss2017 - 3
failed to run piqa test with sym_int4 precision by harness
#10961 opened by aoke79 - 12
Failing to run ipex-llm ollama on Intel Arc A770
#10995 opened by dan9070 - 1
failed to run truthfulqa_mc1 by harness
#11015 opened by aoke79 - 4
Performance drop for neural-chat 7b with new ipex-llm repo (2.5.0b20240425) vLLM serving
#10924 opened by Vasud-ha - 4
all-in-one tool for ChatGLM3-6b: next token latency with BS=16 is slower than before
#10993 opened by Fred-cell - 1
Support both Llama2 and stablelm/Zephyr-3B
#11002 opened by Quallyjiang - 6
Docker image (intelanalytics/ipex-llm-xpu): Documentation stated I would need to disable iGPU to use A770. When will you fix this issue since disabling iGPU is problematic?
#10940 opened by sungkim11 - 4
Run Qwen1.5-1.8B on MTL iGPU with llama.cpp using ipex-llm backend and get no results
#10989 opened by violet17 - 2
Can you provide a llama.dll build guide?
#10875 opened by KiwiHana - 2
Phi-3 model performance on MeteorLake GPU
#10947 opened by bopeng1234 - 5
[bug] LLAMA3-8B produces incorrect output
#10974 opened by intelyoungway - 1
Main memory continually declines with ipex-llm during local LLM inference on Intel Arc GPU
#10949 opened by sunyijin - 4
Fastchat serving embeddings?
#10915 opened by lnguyen - 1
Crash when using llama.dll
#10952 opened by season-studio - 1
MTL 165H ubuntu22.04 can't benchmark qwen/Qwen-7B-Chat
#10936 opened by taotao1-1 - 5
Feature request: Support fp16 with self-speculative decoding on XPU in ipex_llm.serving.fastchat.ipex_llm_worker
#10905 opened by brosenfi - 7
IPEX-LLM on Intel Max Series 1100 for inference libintel-ext-pt-gpu.so: undefined symbol: _ZNK5torch8autograd4Node4nameB5cxx11Ev
#10941 opened by shailesh837 - 4
Cannot find GPU on Linux system
#10935 opened by K-Alex13 - 0
Stable version release requested for Arc GPU
#10950 opened by Fred-cell - 2
SpeechT5 on XPU (Intel Arc GPU 770) takes 8 seconds while CPU takes 3 seconds?
#10942 opened by shailesh837 - 1
Unable to run inference in Linux environment
#10921 opened by K-Alex13 - 2
IndexError: list index out of range when ipex_fp16_gpu test_api is used in all-in-one
#10914 opened by Kpeacef - 1
phi-3-mini support
#10913 opened by aoke79 - 1
Issue with saving and loading low bit BLIP-2 model
#10892 opened by wayfeng - 1
How to use BigDL to implement early stopping
#10853 opened by gdg1212 - 0
Improve First Token Latency for multi-GPU projects (by flash attention or alternative)
#10897 opened by moutainriver - 2
wav2lip issue - Intel ARC on Ubuntu
#10850 opened by maxkim-kr - 3
RuntimeError: "fused_dropout" not implemented for 'Byte' when running trl ppo finetuning
#10854 opened by Jasonzzt - 1
GPU hangs when switching between Llama2 and Llama3 on Arc 770
#10852 opened by moutainriver