- Create
models
directory and download all the models in it - Create
datasets/botnoi-api/
directory and downloadbotnoi_api_cleaned_v2.xlsx
in it
pip install -r requirements.txt
Start bert serving server somewhere
bert-serving-start [-h] -model_dir MODEL_DIR
[-tuned_model_dir TUNED_MODEL_DIR]
[-ckpt_name CKPT_NAME] [-config_name CONFIG_NAME]
[-graph_tmp_dir GRAPH_TMP_DIR]
[-max_seq_len MAX_SEQ_LEN]
[-pooling_layer POOLING_LAYER [POOLING_LAYER ...]]
[-pooling_strategy {NONE,REDUCE_MAX,REDUCE_MEAN,REDUCE_MEAN_MAX,FIRST_TOKEN,LAST_TOKEN}]
[-mask_cls_sep] [-show_tokens_to_client]
[-port PORT] [-port_out PORT_OUT]
[-http_port HTTP_PORT]
[-http_max_connect HTTP_MAX_CONNECT] [-cors CORS]
[-num_worker NUM_WORKER]
[-max_batch_size MAX_BATCH_SIZE]
[-priority_batch_size PRIORITY_BATCH_SIZE] [-cpu]
[-xla] [-fp16]
[-gpu_memory_fraction GPU_MEMORY_FRACTION]
[-device_map DEVICE_MAP [DEVICE_MAP ...]]
[-prefetch_size PREFETCH_SIZE] [-verbose]
[-version]
Open another terminal window then run
python embed.py --model_lang [thai, multi] --ckpt [int]
embed numpy array will be save at datasets/botnoi-api/
See evaluate.ipynb