intel/xFasterTransformer

C++, Apache-2.0

Issues

Segmentation fault (core dumped) FIRST_TOKEN_WEIGHT_LOCATION=$1 NEXT_TOKEN_WEIGHT_LOCATION=$2 OMP_NUM_THREADS=$3 numactl -C $cpu_index -p $2 $BENCHMARK
#473 opened 3 months ago by LittleNoob2333
2
Update Broken QR Code Link for WeChat on Wiki Page
#472 opened 4 months ago by w1ida
1
oneDNN and XFT performance
#471 opened 4 months ago by LittleNoob2333
0
[bug] HBM flat QUAD mode determination method is incorrect
#446 opened 4 months ago by xuyizhan
1
about benchmark Illegal instruction
#469 opened 4 months ago by LittleNoob2333
1
qwen1.5-32b long text input issue
#411 opened 5 months ago by zhm-algo
5
[run_benchmark.sh] Few cores are running on HBM when batch-size >16 or 32
#455 opened 5 months ago by hangfu-guo
3
[request]qwen1 not supported by vllm-xft
#447 opened 5 months ago by zhm-algo
3
Crash when using CB mode with multi-rank
#440 opened 5 months ago by a3213105
0
gcc8.2 compilation error
#419 opened 5 months ago by bukejiyu
3
support vllm?
#250 opened 6 months ago by leiwen83
2
llama-2-70b memory usage
#403 opened 6 months ago by zhm-algo
2
Can we summarize the meanings of data types like bf16_fp16? For example, what are the activation and output data types, and what is the computing instruction?
#414 opened 6 months ago by heagoo
1
error quantization with AWQ + AutoGPTQ
#312 opened 6 months ago by zhm-algo
7
chatglm3 6b error
#335 opened 7 months ago by zhm-algo
1
performance issue for opt-1.3b with BS=1 BF16
#339 opened 7 months ago by bin1guo
1
[output issue] found mistakes in llama-3-70b output by bf16_int4 during benchmark
#413 opened 6 months ago by intelyoungway
1
BF16_INT4 model loading too slow
#235 opened 6 months ago by intelyoungway
1
What's the meaning of bf16_int4 datatype?
#395 opened 6 months ago by LeiZhou-97
4
llama-2-7B benchmarking error with chinese prompts
#380 opened 6 months ago by qdym188
1
The current commit (d666741) of xFT fails to build with make.
#345 opened 7 months ago by xiuying1
3
Build Error: Failure to Download and Configure xdnn_lib
#341 opened 7 months ago by Damonpkl
1
Is qwen2 supported?
#300 opened 7 months ago by wzg-zhuo
2
[bug] Encountered some problems while following the *step by step tutorial*
#328 opened 7 months ago by lum1n0us
2
[bug] Need to install the protobuf library when running the benchmark in Docker
#282 opened 8 months ago by xuyizhan
2
4 sockets qwen execute question
#269 opened 8 months ago by Storm0921
13
xft will be blocked when MPI + QWEN14B + do_sample=true
#204 opened 8 months ago by a3213105
4
Illegal instruction (core dumped)
#247 opened 8 months ago by wswsmao
5
typo in benchmark script
#255 opened 9 months ago by sssssux
1
torch==2.2.0 run error
#233 opened 9 months ago by Zjq9409
3
Error installing xfastertransformer on CentOS 7.6
#244 opened 9 months ago by bin1guo
1
xft + QWEN14B + fp16 got unexpected outputs compared with bf16_fp16.
#220 opened 9 months ago by a3213105
2
[BUG] PR #224 causes wrong output from QWEN-14B
#239 opened 9 months ago by a3213105
0
[bug] library of Intel level-zero not found
#134 opened 9 months ago by intelyoungway
3
Qwen Segmentation Fault after logN PR merged.
#229 opened 9 months ago by Duyi-Wang
2
xft + sample output results look bad
#209 opened 9 months ago by Zjq9409
5
baichuan-7b run core dump
#210 opened 10 months ago by Zjq9409
1
Error when using mpirun to run benchmark.py
#153 opened 10 months ago by yangkunx
4
Illegal instruction (core dumped)
#186 opened 10 months ago by Storm8878
8
QWEN14B generates erroneous output for multiple queries with long input tokens.
#195 opened 10 months ago by a3213105
0
[Model] QWen14B-Chat produces wrong output when the input tokens are too long
#174 opened 10 months ago by a3213105
2
[Model] Convert HF Qwen model into FP16 data type
#142 opened 10 months ago by changqi1
1
Qwen-14B-Chat conversion issue
#185 opened 10 months ago by Storm0921
7
encounter Qwen 72B stop id issue
#188 opened 10 months ago by marvin-Yu
0
Does this CPU meet the minimum requirements of xFasterTF?
#182 opened 10 months ago by Storm0921
3
AMX_int8 is not actually used when dtype="int8"
#175 opened 10 months ago by yiangyang
2
core per numa calculation error
#152 opened 10 months ago by zhm-algo
3
chatGLM2-6B crash while running 4 ranks with the datatype w8a8
#155 opened 10 months ago by shanzhou2186
0
SHM reduceAdd performance issue on HBM with 2 sockets
#154 opened a year ago by abenmao
2
[bug] Segmentation fault occurs at large batch sizes
#140 opened a year ago by aurora327
1