intel/neural-speed

error running inference

Closed this issue · 3 comments

Hi,
I was running
NEURAL_SPEED_VERBOSE=1 python scripts/run.py /home/rachels.dell/confluence-search/intel_neural_chat --weight_dtype int4 -p "She opened the door and see"
This ran as expected.
Then I ran
OMP_NUM_THREADS=128 numactl -m 0 -C 0-100 python scripts/inference.py --model_name mistral -m mistral_files/ne_mistral_int4.bin -c 512 -b 1024 -n 256 -t 128 --color -p "She opened the door and see"

and I get

Namespace(model_name='mistral', model=PosixPath('mistral_files/ne_mistral_int4.bin'), build_dir=PosixPath('/home/rachels.dell/neural-speed/scripts/../build'), prompt='She opened the door and see', tokenizer='THUDM/chatglm-6b', n_predict=256, threads=128, batch_size_truncate=1024, ctx_size=512, seed=-1, repeat_penalty=1.1, color=True, keep=0, shift_roped_k=False, memory_f32=False, memory_f16=False, memory_auto=False, one_click_run='False')
Please build graph first or select the correct model name.

Any idea why?

Could have a similar cause to #88 🤔

I guess I didn't build it correctly because this solved the issue:

git submodule update --init --recursive
mkdir build
cd build
cmake .. -G Ninja
ninja

managed to solve this
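
For anyone hitting the same message, here is a minimal sketch of a guard that runs the build steps above only when the build directory is missing or empty. The function names `needs_build` and `build_if_needed` are hypothetical helpers, not part of neural-speed, and the check is an assumption: a non-empty build directory does not prove the build is complete or correct, so this only catches the simplest case.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of neural-speed): succeeds (exit 0) when the
# given build directory does not exist or contains no entries.
needs_build() {
    local dir="$1"
    [ ! -d "$dir" ] || [ -z "$(ls -A "$dir" 2>/dev/null)" ]
}

# Hypothetical wrapper: repeats the build steps from this thread when the
# repository's build/ directory is missing or empty.
build_if_needed() {
    local repo="$1"
    if needs_build "$repo/build"; then
        git -C "$repo" submodule update --init --recursive
        mkdir -p "$repo/build"
        # Subshell so the caller's working directory is left unchanged.
        (cd "$repo/build" && cmake .. -G Ninja && ninja)
    fi
}
```

After sourcing the file, calling `build_if_needed` with the path to the neural-speed checkout before `scripts/inference.py` would rebuild only when needed. If the error persists with a populated build directory, removing it and rebuilding from scratch is the safer route.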