intel/intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel platforms ⚡
Python · Apache-2.0
Issues
Import error: `ModuleNotFoundError: No module named 'neural_compressor.conf'`
#1695 opened by Nicogs43 - 0
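A quick way to confirm what this issue reports, assuming the error stems from a neural-compressor version mismatch: `neural_compressor.conf` is a 1.x-era module layout that later releases removed, so checking the installed version and the module spec shows whether ITREX is importing against a newer neural-compressor than it expects.

```python
# Diagnostic sketch, not a fix: 'neural_compressor.conf' is a legacy (1.x-era)
# module; on newer neural-compressor releases the spec lookup returns None,
# which would explain the ModuleNotFoundError above.
from importlib.util import find_spec
from importlib.metadata import version

print("neural-compressor:", version("neural-compressor"))
print("neural_compressor.conf present:", find_spec("neural_compressor.conf") is not None)
```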
Install errors; please help
#1699 opened by c2rx - 1
After fine-tuning qwen2-1.5B-instruct and quantizing it with AWQ, inference with Intel Extension for Transformers on CPU fails; the same fine-tune-and-quantize workflow previously worked for qwen1.5-4B-chat, where Intel Extension for Transformers accelerated CPU inference
#1697 opened by Autism-al - 0
W4A16 LLaMA3 8B quantized model inference failed
#1694 opened by AustinJiangg - 7
intel-extension-for-transformers.ipynb
#1696 opened by ayttop - 0
Improve QLoRA docs & add a qwen2-0.5b-instruct fine-tuning example
#1692 opened by bil-ash - 1
Core dumped
#1690 opened by zwx109473 - 1
RAG example not working
#1688 opened by anayjain - 0
Segmentation fault while running RAG mode
#1681 opened by anayjain - 3
Python 3.11: Could not build wheels for cchardet, which is required to install pyproject.toml-based projects
#1469 opened by bbelky - 1
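For the cchardet build failure above, one commonly cited workaround (an assumption, not an official fix) is the community fork faust-cchardet, which publishes CPython 3.11 wheels while keeping the same import name:

```python
# After `pip install faust-cchardet` (hypothetical swap-in for cchardet, whose
# C extension predates CPython 3.11 and fails to compile there), existing code
# keeps working unchanged because the fork reuses the import name.
import cchardet

print(cchardet.detect(b"Intel Extension for Transformers"))
# e.g. {'encoding': 'ASCII', 'confidence': ...}
```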
[Feature request] Support for FlashAttention-3
#1665 opened by sleepingcat4 - 0
AutoModelForCausalLM model.generate gives wrong responses when running the same chatglm3-int4 model bin file via docker run
#1680 opened by ahlwjnj - 2
`ImportError: cannot import name 'WeightOnlyQuantizedLinear' from 'intel_extension_for_pytorch.nn.utils._quantize_convert'`
#1630 opened by junruizh2021 - 0
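This import error typically points at a version-pairing problem; a minimal check, assuming the symbol moved between intel-extension-for-pytorch releases, is to print both package versions and compare the pair against the ITREX release notes:

```python
# Version-pairing check (sketch): ITREX's weight-only-quantization path imports
# private IPEX symbols, so the two packages must come from matching releases.
from importlib.metadata import version

for pkg in ("intel-extension-for-pytorch", "intel-extension-for-transformers"):
    print(pkg, version(pkg))
```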
Evaluation parameter parsing problem
#1676 opened by 1826133674 - 3
Question about configuration
#1643 opened by menglin0320 - 2
Compatibility with other platforms (AMD, etc.)
#1649 opened by rain7996 - 2
ITREX: release a torch 2.3.x-compatible version
#1644 opened by casper-hansen - 8
Fails to load saved model: `Trying to set a tensor of shape torch.Size([1376, 4096]) in "qweight" (which has shape torch.Size([4096, 1376])), this look incorrect.`
#1407 opened by kranipa - 0
Support inference with WOQ and LoRA adapter
#1434 opened by Yuan0320 - 2
`ModuleNotFoundError: No module named 'datasets'`
#1461 opened by Aisuko - 3
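The missing-module error above usually just means the Hugging Face datasets package was never installed; it appears to be a dependency of the fine-tuning examples rather than of ITREX itself (an assumption based on the error alone). After `pip install datasets` the import resolves:

```python
# Smoke test for the fix: load a tiny public split to confirm `datasets`
# is importable and functional.
from datasets import load_dataset

ds = load_dataset("imdb", split="train[:10]")
print(len(ds))  # 10
```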
Talking bot backend for Windows PC is not working; notebook needs to be updated
#1518 opened by raj-ritu17 - 4
Cannot finish FP4 quantization: `RuntimeError: Qbits: only support Integer WOQ in PACKQ`
#1577 opened by PhzCode - 4
Is FP4 inference supported?
#1582 opened by PhzCode - 17
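Taken together, the two FP4 issues above suggest the QBits CPU kernels only pack integer weight-only quantization. Below is a minimal sketch of the integer 4-bit path that the project README documents, with the model name taken from the neighboring Llama 3 reports; treat the exact flags as assumptions for your ITREX version, not a definitive reference.

```python
from transformers import AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"  # from the issues above
tokenizer = AutoTokenizer.from_pretrained(model_name)
inputs = tokenizer("Once upon a time", return_tensors="pt").input_ids

# load_in_4bit selects integer weight-only quantization: the path that the
# "only support Integer WOQ in PACKQ" error says QBits accepts.
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)
print(tokenizer.decode(model.generate(inputs, max_new_tokens=32)[0]))
```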
Cannot run llama3 8b instruct: `AssertionError: Fail to convert pytorch model`
#1522 opened by N3RDIUM - 1
QLoRA on CPU fails; need a conda env list
#1561 opened by Lix1993 - 2
(detailed) conda install instructions?
#1550 opened by hpcpony - 5
Unable to start talkingbot frontend
#1517 opened by raj-ritu17 - 1
RAG plugin initialization failed
#1538 opened by redhairerINTEL - 3
NeuralChat /v1/askdoc/create returns 404 Not Found; failed to call this API on Ubuntu
#1533 opened by RongLei-intel - 2
pip install failure on python3.10-alpine image
#1379 opened by lrrountr - 3
NeuralChat TTS plugin unable to initialize due to missing dependency: librosa
#1490 opened by alexsin368 - 2
RAG example not working
#1464 opened by guytamir - 3
requirements.txt uses underscores instead of dashes
#1421 opened by anthony-intel - 2
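Context for this report: PEP 503 makes runs of `-`, `_`, and `.` in project names equivalent after normalization, so pip resolves either spelling to the same package; the dashed form is just the conventional one for requirements files. The canonical normalization rule:

```python
# PEP 503 name normalization (the rule verbatim from the PEP): runs of '-',
# '_', and '.' collapse to a single '-', lowercased.
import re

def normalize(name: str) -> str:
    return re.sub(r"[-_.]+", "-", name).lower()

assert normalize("intel_extension_for_transformers") == "intel-extension-for-transformers"
```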
Failed to create the serving
#1392 opened by RongLei-intel - 3
SageMaker does not support Transformers 4.34.1, which is required by ITREX
#1381 opened by eduand-alvarez