intel/neural-speed
An innovative library for efficient LLM inference via low-bit quantization
C++ · Apache-2.0
Issues
Once upon a time, a little NE_ASSERT: /root/w0/workspace/neuralspeed-wheel-build/nlp_repo/neural_speed/core/ne_layers.c:2651: ne_nelements(a) == ne0 * ne1 * ne2
#326 opened by zwx109473 - 17
Is it supported with batch size > 1?
#269 opened by QuPengfei - 0
BF16 Compute DType on AVX512 ISA
#308 opened by Alavandar08 - 1
Yi-6B model failed to evaluate
#314 opened by jedcheng - 8
Bestla kernels: understanding and benchmarking
#289 opened by Alavandar08 - 1
What's the difference from IPEX-LLM?
#290 opened by manfye - 1
Performance on Xeon Scalable
#284 opened by regmibijay - 1
Add support for phi3-vision
#268 opened by bil-ash - 2
Loading checkpoint shards takes too long
#251 opened by irjawais - 2
Is tensor parallelism supported by neural speed?
#220 opened by zhangnju - 3
AssertionError: Fail to convert pytorch model
#194 opened by anthony-intel - 3
Distributing tensors across NUMA nodes
#207 opened by shg8 - 1
Feature request: JSON mode output
#204 opened by eliranwong - 2
heap-buffer-overflow while packing weight
#167 opened by yufenglee - 13
Performance Gap between Neural Speed Matmul Operator and Llama.cpp Operator
#174 opened by aciddelgado - 8
Modifying the model's hyperparameters
#124 opened by benjamin27315k - 1
Error: Unable to install.
#257 opened by Ujjawal-K-Panchal - 1
Source build from release tar file?
#258 opened by hpcpony - 4
Add support for phi-3-mini-128k model
#238 opened by bil-ash - 1
SYCL support?
#191 opened by rahulunair - 4
Garbled characters with beam search
#215 opened by jiafuzha - 2
I wish for a simpler way to run the model
#230 opened by kolinfluence - 1
I saw how beautiful this repo is, in terms of parallelism / NUMA stuff, etc.
#231 opened by kolinfluence - 2
Issue in whisper inference from pre-converted gguf
#203 opened by bil-ash - 4
Question about Thread pool and GEMV
#221 opened by chenhongyu2048 - 5
Huge performance difference in "Transformer-like" usage and "llama.cpp-like" usage
#205 opened by Ankur-singh - 1
Running Q4_K_M gguf models: unrecognized tensor type 12
#206 opened by shg8 - 1
Baseline example not working
#193 opened by anthony-intel - 3
Neural Speed compilation failing in ORT
#188 opened by sunnyshu-intel - 3
[Feature request] Add nllb support
#99 opened by bil-ash - 2
Can't load Qwen after Qwen2 support
#161 opened by kunger97 - 7
Error loading model when use qwen gguf model
#96 opened by kunger97 - 3
Error during `pip install .`
#111 opened by dellamuradario - 2
Documentation for whisper inference
#104 opened by bil-ash - 3
Error running inference
#91 opened by RachelShalom - 2
Can't run inference on Llama2 through GGUF
#88 opened by ZJkyle - 1
Is Qwen supported?
#77 opened by kunger97 - 3
Build failure when building the executable
#74 opened by aahouzi - 1
AVX_VNNI Numeric Bug?
#32 opened by parvizmp