Yale-LILY/SummEval

Unit tests fail


Hi,

I installed the package from source and ran all the unit tests. Unfortunately, every test failed except meteor, cider, bleu, and bert_score (ROUGE is also excluded, since I haven't installed it yet). The failing tests ran to completion, but their output scores differed from the reference scores.

Can you please check it out?

Hi!

Can you share the output of pip freeze with me so I can see your env and compare on my end?
Thanks!

Thank you for your answer!
Attached is the 'pip list' output (the pip freeze output contains many conda paths, so I thought this one would be better).

Package Version
------- -------
absl-py 1.2.0
accelerate 0.11.0
aiohttp 3.8.1
aiosignal 1.2.0
aniso8601 9.0.1
ansi2html 1.8.0
anyio 3.6.1
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
arrow 1.2.2
asttokens 2.0.5
async-timeout 4.0.2
attrs 21.4.0
awscli 1.22.101
Babel 2.10.3
backcall 0.2.0
backports.functools-lru-cache 1.6.4
beautifulsoup4 4.11.1
bert-score 0.3.11
blanc 0.2.7
bleach 5.0.1
blis 0.7.8
bokeh 2.4.3
boto3 1.21.46
botocore 1.24.46
Brotli 1.0.9
brotlipy 0.7.0
captum 0.5.0
catalogue 2.0.6
certifi 2022.6.15
cffi 1.15.1
charset-normalizer 2.1.0
click 7.1.2
cloudpickle 2.1.0
colorama 0.4.3
confection 0.0.3
cryptography 37.0.4
cycler 0.11.0
cymem 2.0.6
Cython 0.29.32
dataclasses 0.8
datasets 2.3.2
debugpy 1.6.0
decorator 5.1.1
defusedxml 0.7.1
dill 0.3.5.1
docutils 0.15.2
dparse 0.5.1
emoji 2.1.0
en-core-web-sm 3.4.0
entrypoints 0.4
executing 0.8.3
fastai 2.1.10
fastcore 1.5.6
fastjsonschema 2.16.1
fastprogress 1.0.2
filelock 3.6.0
Flask 2.1.3
Flask-RESTful 0.3.9
flit_core 3.7.1
fonttools 4.34.4
frozenlist 1.3.0
fsspec 2022.5.0
future 0.18.2
gin-config 0.5.0
google-pasta 0.2.0
gym 0.23.1
gym-notices 0.0.7
horovod 0.25.0
huggingface-hub 0.8.1
idna 3.3
imageio 2.16.2
importlib-metadata 4.11.4
importlib-resources 5.8.0
inflate64 0.1.4
ipykernel 6.15.1
ipython 8.4.0
ipython-genutils 0.2.0
itsdangerous 2.1.2
jedi 0.18.1
Jinja2 3.1.2
jmespath 1.0.1
joblib 1.1.0
json5 0.9.5
jsonschema 4.7.2
jupyter-client 7.3.4
jupyter_core 4.11.0
jupyter-server 1.18.1
jupyterlab 3.3.4
jupyterlab-pygments 0.2.2
jupyterlab-server 2.15.0
kiwisolver 1.4.4
langcodes 3.3.0
llvmlite 0.38.1
lxml 4.9.1
MarkupSafe 2.1.1
matplotlib 3.5.2
matplotlib-inline 0.1.3
mistune 0.8.4
moverscore 1.0.3
multidict 6.0.2
multiprocess 0.70.13
multivolumefile 0.2.3
munkres 1.1.4
murmurhash 1.0.7
names 0.3.0
nbclassic 0.4.3
nbclient 0.6.6
nbconvert 6.5.0
nbformat 5.4.0
nest-asyncio 1.5.5
networkx 2.8.7
nltk 3.7
notebook-shim 0.1.0
numba 0.55.2
numpy 1.22.4
nvgpu 0.9.0
packaging 21.3
pandas 1.4.3
pandocfilters 1.5.0
parso 0.8.3
pathy 0.6.2
patsy 0.5.2
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.0.1
pip 22.1.2
plac 0.9.6
plotly 5.6.0
ply 3.11
portalocker 2.5.1
preshed 3.0.6
prometheus-client 0.14.1
prompt-toolkit 3.0.30
protobuf 3.20.0
protobuf3-to-dict 0.1.5
psutil 5.9.1
ptyprocess 0.7.0
pure-eval 0.2.2
py7zr 0.19.0
pyarrow 8.0.0
pyasn1 0.4.8
pybcj 0.6.1
pybind11 2.9.2
pybind11-global 2.9.2
pycparser 2.21
pycryptodomex 3.15.0
pydantic 1.7.4
pyemd 0.5.1
pyfunctional 1.4.3
pygame 2.1.2
Pygments 2.12.0
pynvml 11.4.1
pyOpenSSL 22.0.0
pyparsing 3.0.9
pyppmd 0.18.3
PyQt5 5.15.7
PyQt5-sip 12.11.0
pyrsistent 0.18.1
PySocks 1.7.1
python-dateutil 2.8.2
pytorch-pretrained-bert 0.6.2
pytz 2022.1
PyYAML 5.4.1
pyzmq 23.2.0
pyzstd 0.15.2
regex 2022.7.9
requests 2.28.1
responses 0.18.0
rouge-score 0.1.2
rsa 4.7.2
s3fs 0.4.2
s3transfer 0.5.2
sacrebleu 2.2.1
sacremoses 0.0.53
safety 1.10.3
sagemaker 2.22.0
scikit-learn 1.0
scipy 1.8.1
seaborn 0.11.2
Send2Trash 1.8.0
sentencepiece 0.1.96
setuptools 63.2.0
shap 0.40.0
shellingham 1.4.0
sip 6.6.2
six 1.16.0
sklearn 0.0
slicer 0.0.7
smart-open 5.2.1
smclarify 0.2
smdebug-rulesconfig 1.0.0
sniffio 1.2.0
soupsieve 2.3.2.post1
spacy 3.4.1
spacy-legacy 3.0.9
spacy-loggers 1.0.2
srsly 2.4.4
stack-data 0.3.0
stanza 1.4.2
statsmodels 0.13.2
summ-eval 0.892
tabulate 0.8.10
tenacity 8.0.1
termcolor 1.1.0
terminado 0.15.0
texttable 1.6.4
thinc 8.1.3
threadpoolctl 3.1.0
tinycss2 1.1.1
tokenizers 0.10.3
toml 0.10.2
torch 1.12.0
torch-model-archiver 0.5.3b20220226
torch-workflow-archiver 0.2.4b20220513
torchaudio 0.12.0
torchserve 0.6.0b20220513
torchtext 0.13.0
torchvision 0.13.0
tornado 6.2
tqdm 4.63.2
traitlets 5.3.0
transformers 4.15.0
typer 0.3.2
typing 3.7.4.3
typing_extensions 4.3.0
unicodedata2 14.0.0
urllib3 1.26.10
wasabi 0.9.1
wcwidth 0.2.5
webencodings 0.5.1
websocket-client 1.3.3
Werkzeug 2.1.2
wheel 0.37.1
wmd 1.3.2
xxhash 3.0.0
yarl 1.7.2
zipp 3.8.0
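
To make the environment comparison requested above easier, here is a minimal sketch that reads the versions installed in the current environment and diffs them against a pasted listing such as the one above. The helper names (`installed_versions`, `parse_listing`, `diff_versions`) are hypothetical and not part of SummEval; the sketch assumes Python 3.8+ for `importlib.metadata` and accepts either `name version` or `name==version` lines.

```python
from importlib import metadata


def installed_versions():
    """Return {lowercased package name: version} for the current environment."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that carry no name
            versions[name.lower()] = dist.version
    return versions


def parse_listing(text):
    """Parse 'pip list' style lines ('name version') or 'name==version' lines."""
    versions = {}
    for line in text.splitlines():
        parts = line.replace("==", " ").split()
        # Skip blanks, the 'Package Version' header, and the dashed separator.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        versions[parts[0].lower()] = parts[1]
    return versions


def diff_versions(mine, theirs):
    """Return {name: (my_version, their_version)} for every entry that differs."""
    names = sorted(set(mine) | set(theirs))
    return {
        n: (mine.get(n), theirs.get(n))
        for n in names
        if mine.get(n) != theirs.get(n)
    }


if __name__ == "__main__":
    # A few lines copied from the listing above, used as the reported environment.
    reported = parse_listing("""
    torch 1.12.0
    transformers 4.15.0
    spacy 3.4.1
    """)
    local = installed_versions()
    for name, (mine, theirs) in diff_versions(local, reported).items():
        if theirs is not None:  # only show packages named in the report
            print(f"{name}: local={mine} reported={theirs}")
```

Reading versions from package metadata avoids the conda path entries that show up in `pip freeze`, and keeping only the differing packages narrows the search for whichever dependency is shifting the scores.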