vecto-ai/vecto

Can not run intrinsic evaluation

flyaway1217 opened this issue · 6 comments

When I used the command line like this:
python3 -m vecto.benchmarks.analogy /path/to/config_analogy.yaml

It raised the following error:
No module named vecto.benchmarks.analogy.__main__; 'vecto.benchmarks.analogy' is a package and cannot be directly executed
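
For context, `python -m <package>` only works when the package contains a `__main__.py`. The sketch below reproduces the same kind of message with a throwaway package; `demo_pkg` and the helper `try_run` are made-up names for illustration and have nothing to do with vecto's actual layout.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path


def try_run(package_name):
    """Run a package the way `python -m` would; return 'ok' or the error text."""
    importlib.invalidate_caches()  # make sure freshly written files are seen
    try:
        runpy.run_module(package_name, run_name="__main__")
        return "ok"
    except ImportError as exc:
        return str(exc)


with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / "demo_pkg"  # hypothetical package, not part of vecto
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    sys.path.insert(0, tmp)

    # No __main__.py yet: the package cannot be executed directly.
    without_main = try_run("demo_pkg")

    # Once __main__.py exists, the same call succeeds.
    (pkg_dir / "__main__.py").write_text("print('running demo_pkg')\n")
    with_main = try_run("demo_pkg")

    sys.path.remove(tmp)

print(without_main)
print(with_main)
```

So an error of this shape usually means the entry point lives elsewhere in the package tree, not that the installation is broken.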

When I tried to evaluate from the code, like:

path_model = "./test/data/embeddings/text/plain_no_file_header"
model = vecto.model.load_from_dir(path_model)
options = {}
options["path_dataset"] = "./test/data/benchmarks/analogy/"
options["path_results"] = "/tmp/vecto/analogy"
options["name_method"] = "3CosAdd"
vecto.benchmarks.analogy.analogy.run(model, options)

It raised an error: AttributeError: module 'vecto' has no attribute 'model'

Am I doing it right? I cannot run the evaluation with either method.

Hi, flyaway1217

Sorry, we changed our APIs recently. Please try this command:

python3 -m vecto benchmarks analogy /path/to/config_analogy.yaml

(replacing '.' with ' ')

As for the "evaluate from the code" part, please check the sample code in "./tests/benchmarks/test_analogy.py" :)

Thanks for the reply.
However, when I run:
python3 -m vecto benchmark analogy config.yaml

It raises another error:

running  analogy
['config.yaml']
usage: __main__.py [-h] [--method METHOD] [--path_out PATH_OUT]
                   embeddings dataset
__main__.py: error: the following arguments are required: dataset

I have already provided the path to the dataset in config.yaml.
Should I also pass the dataset on the command line?

Oh, indeed. An example would be

python3 -m vecto benchmark analogy path_embedding path_dataset \
    --path_out /tmp/vecto/benchmarks/ \
    --method 3CosAdd

Thanks for the help! I can run it successfully now.
However, I have some questions:

  1. It seems that path_embedding and path_dataset both have to be directories? I assume it will evaluate each embedding in path_embedding on each dataset in path_dataset, am I right?
  2. If my assumption in item 1 is true, it would be nice to ignore hidden files, for example .DS_Store on macOS. Right now I have to delete .DS_Store manually to make the code run.
  3. How long should it take to evaluate the BATS dataset? I started the evaluation about 10 minutes ago and it has not finished yet.
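
The filtering suggested in item 2 could look roughly like the sketch below. It is only an illustration of the idea, not vecto's code: `list_visible_files` is a hypothetical helper, and the file names are made up.

```python
import tempfile
from pathlib import Path


def list_visible_files(directory):
    """Return the regular files in `directory`, sorted, skipping dotfiles."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and not p.name.startswith(".")
    )


# Build a throwaway dataset directory containing a stray .DS_Store.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("capitals.txt", "plurals.txt", ".DS_Store"):
        (Path(tmp) / name).write_text("")
    visible_names = [p.name for p in list_visible_files(tmp)]

print(visible_names)
```

Skipping every name that starts with a dot also covers other hidden files, at the cost of ignoring any dataset file that is deliberately named that way.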

This is a nice tool. Thanks for providing this!

Hi, flyaway1217,

Thanks for your attention.

  1. path_dataset reads all files (datasets) in the directory. However, path_embedding evaluates only a single embedding model.
  2. Sure, indeed.
  3. I'm not quite sure, since it depends on the hardware. However, I highly recommend trying the LRCos method. It runs a lot faster than 3CosAdd.
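
For a rough sense of where the time goes with 3CosAdd: in its standard formulation, each analogy question "a is to a' as b is to ?" is answered by scoring every word in the vocabulary against the vector b - a + a', so the cost grows with the vocabulary size times the number of questions. Below is a minimal pure-Python sketch of that formulation, assuming unit-normalized vectors. The toy vectors and the function names are made up for illustration, and this is not vecto's implementation.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def three_cos_add(vectors, a, a_prime, b):
    """Answer 'a is to a_prime as b is to ?' with the word closest to b - a + a_prime."""
    unit = {word: normalize(vec) for word, vec in vectors.items()}
    target = [
        vb - va + vap
        for va, vap, vb in zip(unit[a], unit[a_prime], unit[b])
    ]
    # The three question words are excluded; every other word is scored.
    candidates = (w for w in unit if w not in (a, a_prime, b))
    return max(candidates, key=lambda w: cosine(unit[w], target))


# Toy 3-dimensional embeddings, invented for this example.
toy_vectors = {
    "man": [1.0, 0.0, 0.0],
    "woman": [1.0, 1.0, 0.0],
    "king": [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "apple": [0.0, 0.2, -1.0],
}

answer = three_cos_add(toy_vectors, "man", "woman", "king")
print(answer)
```

With a real vocabulary of hundreds of thousands of words, the scoring step inside `max` is what dominates the running time.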