Issues
- 2
Segmentation fault (core dumped) FIRST_TOKEN_WEIGHT_LOCATION=$1 NEXT_TOKEN_WEIGHT_LOCATION=$2 OMP_NUM_THREADS=$3 numactl -C $cpu_index -p $2 $BENCHMARK
#473 opened by LittleNoob2333 - 1
Update Broken QR Code Link for WeChat on Wiki Page
#472 opened by w1ida - 0
oneDNN and XFT performance
#471 opened by LittleNoob2333 - 1
about benchmark Illegal instruction
#469 opened by LittleNoob2333 - 5
qwen1.5-32b long text input issue
#411 opened by zhm-algo - 3
[request]qwen1 not supported by vllm-xft
#447 opened by zhm-algo - 0
Crash when using CB mode with multi-rank
#440 opened by a3213105 - 3
gcc8.2 compilation error
#419 opened by bukejiyu - 2
support vllm?
#250 opened by leiwen83 - 2
llama-2-70b memory usage
#403 opened by zhm-algo - 1
Can we summarize the meanings of data type like bf16_fp16?, for example, what's activation data type and output data type, what's the computing instruction?
#414 opened by heagoo - 7
error quantization with AWQ + AutoGPTQ
#312 opened by zhm-algo - 1
chatglm3 6b error
#335 opened by zhm-algo - 1
performance issue for opt-1.3b with BS=1 BF16
#339 opened by bin1guo - 1
[output issue] found mistakes in llama-3-70b output by bf16_int4 during benchmark
#413 opened by intelyoungway - 1
BF16_INT4 model loading too slow
#235 opened by intelyoungway - 4
What's the meaning of bf16_int4 datatype?
#395 opened by LeiZhou-97 - 1
llama-2-7B benchmarking error with chinese prompts
#380 opened by qdym188 - 3
The current commit(d666741) of xFT make failed.
#345 opened by xiuying1 - 1
4 sockets qwen execute question
#269 opened by Storm0921 - 4
Illegal instruction (core dumped)
#247 opened by wswsmao - 1
typo in benchmark script
#255 opened by sssssux - 3
torch==2.2.0 run error
#233 opened by Zjq9409 - 1
Error install xfastertransformer on CentOS 7.6
#244 opened by bin1guo - 2
[BUG] PR #224 cause the wrong output of QWEN-14B
#239 opened by a3213105 - 3
[bug] library of Intel level-zero not found
#134 opened by intelyoungway - 2
Qwen Segmentation Fault after logN PR merged.
#229 opened by Duyi-Wang - 5
xft + sample output result look bad
#209 opened by Zjq9409 - 1
baichuan-7b run core dump
#210 opened by Zjq9409 - 4
Use mpirun to run benchmark.py get error
#153 opened by yangkunx - 8
Illegal instruction (core dumped)
#186 opened by Storm8878 - 0
QWEN14B will generate error output when multi queries with long input tokens.
#195 opened by a3213105 - 2
[Model] Convert HF Qwen model into FP16 data type
#142 opened by changqi1 - 7
Qwen-14B-Chat conversion issue
#185 opened by Storm0921 - 0
encounter Qwen 72B stop id issue
#188 opened by marvin-Yu - 3
Does this CPU meet the minimum requirements of xFasterTF?
#182 opened by Storm0921 - 2
AMX_int8 not really be used when dtype="int8"
#175 opened by yiangyang - 3
core per numa calculation error
#152 opened by zhm-algo - 0
SHM reduceAdd performance issue on HBM with 2 sockets
#154 opened by abenmao - 1