A quick question about BLEURT and Tools
First of all, congratulations on your work being accepted by TACL!
I have some questions:
- Implementation of the BLEURT metric
I directly downloaded the evaluation model from the official BLEURT repository and installed the corresponding packages. Using the example code, I evaluated a translation, as follows:
`from bleurt import score
references_list = read('wmt22.en-zh.zh')
candidates_list = read('wmt22.en-zh.zh.maps.0-seed.trans')
checkpoint = "bleurt/bleurt/test_checkpoint"
scorer = score.BleurtScorer(checkpoint)
scores = scorer.score(references=references_list, candidates=candidates_list)
average_score = sum(scores) / len(scores)
print("Average BLEURT score:", average_score)`
However, the score was only 0.57. Is this evaluation process consistent with the one in your paper? Could there be something I've overlooked that resulted in this poor score?
- Graphical Tools
Additionally, I am very curious about which tools you used to create the charts in your paper.
Thank you!
- Here is a minimal script for BLEURT evaluation. Note that it loads the BLEURT-20 checkpoint rather than the test checkpoint used in your snippet:
from bleurt import score as bleurt_score

def readlines(file_path):
    # Return one stripped segment per line; an empty path yields an empty list.
    if not file_path:
        return []
    # Assumes the data files are UTF-8 encoded (they contain Chinese text).
    with open(file_path, 'r', encoding='utf-8') as f:
        lines = f.readlines()
    return [l.strip() for l in lines]

references_list = readlines('data/raw/wmt22.en-zh.zh')
candidates_list = readlines('output/text-davinci-003/wmt22.en-zh.zh.maps.0-seed.trans')

# Path to the downloaded BLEURT-20 checkpoint directory
checkpoint = "eval_ckpt/BLEURT-20"
bleurt_model = bleurt_score.LengthBatchingBleurtScorer(checkpoint)
scores = bleurt_model.score(references=references_list, candidates=candidates_list, batch_size=2)
average_score = sum(scores) / len(scores)
print("Average BLEURT score:", average_score)
2024-01-23 11:45:06.430392: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-01-23 11:45:06.471530: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-23 11:45:06.471604: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-23 11:45:06.471645: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-01-23 11:45:06.479669: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-01-23 11:45:06.479899: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-01-23 11:45:07.332096: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-01-23 11:45:09.753923: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2211] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
100%|██████████| 1019/1019 [08:42<00:00, 1.95it/s]
Average BLEURT score: 0.7258136263286359
- I only use Keynote.
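As a side note on the BLEURT comparison above: a misaligned reference/candidate file pair can also skew the average without any error, so it may help to check the line counts before scoring. Below is a minimal sketch of that check. The helper names `read_lines` and `average_score` are hypothetical and not part of BLEURT, and `score_fn` is assumed to be any callable that takes `references=` and `candidates=` keyword arguments, which is how both scripts in this thread call the scorer.

```python
from typing import Callable, List, Sequence


def read_lines(path: str) -> List[str]:
    """Read one segment per line, stripping surrounding whitespace.

    Assumes the file is UTF-8 encoded.
    """
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f]


def average_score(
    score_fn: Callable[..., Sequence[float]],
    references: Sequence[str],
    candidates: Sequence[str],
) -> float:
    """Score aligned reference/candidate pairs and return the mean score.

    score_fn is a hypothetical stand-in for the scorer: it must accept
    `references=` and `candidates=` keyword arguments and return one
    score per pair.
    """
    # A length mismatch means the two files are not aligned line by line.
    if len(references) != len(candidates):
        raise ValueError(
            f"{len(references)} references vs {len(candidates)} candidates"
        )
    if not references:
        raise ValueError("nothing to score")
    scores = score_fn(references=list(references), candidates=list(candidates))
    return sum(scores) / len(scores)
```

With BLEURT installed, passing the `score` method of a scorer loaded from `eval_ckpt/BLEURT-20` as `score_fn` should give the same average as the script above, with the alignment check added in front.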
The problem has been solved, thank you very much for your answer!