intel-analytics/ipex-llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., a local PC with iGPU and NPU, or discrete GPUs such as Arc, Flex, and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc.
Python · Apache-2.0
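The description above maps to a HuggingFace-style Python API. The sketch below is a minimal, unverified example of that flow; the model id and prompt are placeholders, and the keyword names should be checked against the ipex-llm documentation for your installed version.

```python
# Minimal sketch of the HuggingFace-style flow described above.
# The model id and prompt are placeholders; verify keyword names
# against the ipex-llm docs for your installed version.
import torch
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM

model_path = "meta-llama/Llama-2-7b-chat-hf"  # placeholder model id

# Load with low-bit (INT4) optimization, then move to the Intel GPU ("xpu").
model = AutoModelForCausalLM.from_pretrained(
    model_path, load_in_4bit=True, trust_remote_code=True
).to("xpu")

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
inputs = tokenizer("What does IPEX-LLM do?", return_tensors="pt").to("xpu")

with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```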
Issues
Error: llama runner process has terminated: error loading model: No device of requested type available
#12420 opened by fanlessfan - 8
Update to Ollama 0.4.0
#12370 opened by Matthww - 1
'AutoModel' object has no attribute 'config' when using Speech_Paraformer-Large on NPU
#12412 opened by fanyhchn - 1
Update Ollama with IPEX-LLM to a newer version
#12411 opened by NikosDi - 33
Brave Leo AI using Ollama and Intel GPU
#12248 opened by NikosDi - 4
Path of models using Ollama with IPEX-LLM (Windows)
#12403 opened by NikosDi - 2
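For context on the model-path question above: upstream Ollama resolves its model store through the OLLAMA_MODELS environment variable, and the IPEX-LLM build is assumed here to follow the same convention. A hypothetical launch sketch; the path is a placeholder.

```python
# Hypothetical sketch: point Ollama at a custom model directory on Windows
# via OLLAMA_MODELS (an upstream Ollama environment variable) before
# starting the server. The directory path is a placeholder.
import os
import subprocess

env = dict(os.environ, OLLAMA_MODELS=r"D:\ollama\models")  # placeholder path
subprocess.Popen(["ollama", "serve"], env=env)
```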
Using both iGPU and CPU together
#12373 opened by fanlessfan - 12
Container cannot see Arc GPU
#12372 opened by robertvazan - 3
Llama-3.2 11B Vision not working with latest IPEX-LLM (vLLM version 0.6.2)
#12391 opened by HumerousGorgon - 4
Assertion error when using IPEX with PyTorch
#12385 opened by piDack - 6
Could not use SFT Trainer in qlora_finetuning.py
#12356 opened by shungyantham - 4
`ollama run minicpm-v` runs on CPU
#12257 opened by juan-OY - 2
Docker - llama.cpp scripts / init-llama-cpp
#12379 opened by easyfab - 5
Can't run Ollama in a Docker container with iGPU on Linux
#12363 opened by user7z - 1
Performance problem with InternVL image embedding using ggml.dll
#12376 opened by cjsdurj - 3
ipex-llm-cpp-xpu container
#12364 opened by user7z - 1
ValueError: If `eos_token_id` is defined, make sure that `pad_token_id` is defined
#12371 opened by fanlessfan - 3
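The ValueError above is a generic Hugging Face transformers check rather than anything IPEX-LLM-specific: generate() needs a pad_token_id once eos_token_id is set. A common workaround (not verified against issue #12371) is to reuse the EOS token as padding:

```python
# Common transformers workaround (not verified against issue #12371):
# define a pad token by reusing the EOS token before calling generate().
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # placeholder: any causal LM shipped without a pad token
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

tokenizer.pad_token = tokenizer.eos_token  # pad_token_id is now defined

inputs = tokenizer("Hello", return_tensors="pt")
output = model.generate(
    **inputs, max_new_tokens=8, pad_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(output[0]))
```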
Ollama fails to run the embedding model mxbai-embed-large
#12348 opened by feiyu11859661 - 1
How to check GPU memory consumption by IPEX on Linux?
#12315 opened by acane77 - 1
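One answer to the question above, sketched under the assumption of an XPU-enabled PyTorch build: IPEX registers a torch.xpu namespace whose memory counters mirror torch.cuda, so they can be queried from Python (verify the names for your version).

```python
# Sketch: query XPU memory counters from Python. Assumes an XPU-enabled
# PyTorch build; the torch.xpu.* names mirror torch.cuda and should be
# verified against your installed version.
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401  registers "xpu"

x = torch.empty(1024, 1024, device="xpu")  # allocate something to measure

print(f"allocated: {torch.xpu.memory_allocated() / 1024**2:.1f} MiB")
print(f"reserved:  {torch.xpu.memory_reserved() / 1024**2:.1f} MiB")
```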
Running chatglm3-6b with the ipex_llm harness on an A770 raises: property 'pad_token' of 'ChatGLMTokenizer' object has no setter
#12335 opened by tao-ov - 0
Questions about ParallelTable and ParallelCriterion
#12278 opened by clare-cn - 2
ipex-llm xpu version doesn't work on Lunar Lake
#12268 opened by HoppeDeng - 1
IPEX-LLM fails to load the Qwen2.5 7B model
#12273 opened by HoppeDeng - 4
llama.cpp crashes running k-quants with Intel Arc 140V Xe2 iGPU
#12318 opened by lhl - 1
After installing with ipex-llm-ollama-installer-20240918.exe, invoking start.bat in the install folder from another exe fails with missing-DLL errors
#12334 opened by dayskk - 1
[ipex-llm] A significant accuracy deviation between ipex-llm 2.2.0b1 and 2.1.0b20240515 when running the CodeGeeX model
#12294 opened by johnysh - 4
Running the harness on an A770 raises RuntimeError: unsupported dtype, only fp32 and fp16 are supported
#12304 opened by tao-ov - 6
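A generic mitigation for the dtype error above (not verified against issue #12304) is to make sure the checkpoint is loaded in fp16 rather than bf16 or fp64 before the harness runs:

```python
# Generic mitigation sketch (not verified against issue #12304): force the
# checkpoint to fp16 at load time so every weight is in a supported dtype.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "THUDM/chatglm3-6b",        # the model named in the neighboring reports
    torch_dtype=torch.float16,  # avoid bf16/fp64 weights on XPU
    trust_remote_code=True,
)
```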
Error running the harness on an A770
#12290 opened by tao-ov - 3
Questions about performance gap between benchmark scripts and llama-bench from ipex-llm[cpp]
#12280 opened by acane77 - 36
Slow text generation on dual Arc A770s with vLLM
#12190 opened by HumerousGorgon - 5
Can't run llama model on Intel GPU on Linux platforms
#12222 opened by acane77 - 0
[NPU] Slow Token Generation with Latest NPU Driver 32.0.100.3053 on LNL 226V series
#12266 opened by climh - 1
llama.cpp generation incoherent (always <eos>); which driver version for Ubuntu 22.04.5?
#12258 opened by ultoris - 0
LLM benchmark for chatglm3-6b fails to run properly
#12208 opened by vincent-wsz - 1
Inference hangs on LNL iGPU with large input prompts.
#12158 opened by aahouzi - 2
Qwen2 deployment via Ollama fails
#12210 opened by vincent-wsz - 2
Output format improvement
#12205 opened by HoppeDeng - 4
Unsupported SPIR-V version
#12168 opened by Pablou2902 - 4
Ollama fails to load model onto A380 GPU
#12172 opened by GamerSocke