src-d/vecino

Tried to use it, failed miserably

campoy opened this issue · 4 comments

Running vecino https://github.com/apache/spark fails with an AttributeError:

$ vecino https://github.com/apache/spark
WARNING:bblfsh:Failed to ensure that the Babelfish server is running.
INFO:id2vec:Reading /Users/francesc/.source{d}/id2vec/default.asdf...
INFO:id2vec:Building the token index...
INFO:similar_repos:Loaded id2vec model: {'created_at': datetime.datetime(2017, 6, 18, 17, 37, 6, 255615),
 'dependencies': [],
 'model': 'id2vec',
 'uuid': '92609e70-f79c-46b5-8419-55726e873cfc',
 'version': [1, 0, 0]}
Shape: (999424, 300)
First 10 words: ['get', 'name', 'type', 'string', 'class', 'set', 'data', 'value', 'self', 'test']
INFO:docfreq:Reading /Users/francesc/.source{d}/docfreq/default.asdf...
INFO:docfreq:Building the docfreq dictionary...
INFO:docfreq:Pruning to min 20 occurrences
INFO:similar_repos:Loaded document frequencies: {'created_at': datetime.datetime(2017, 6, 19, 9, 59, 14, 766638),
 'dependencies': [],
 'model': 'docfreq',
 'uuid': 'f64bacd4-67fb-4c64-8382-399a8e7db52a',
 'version': [1, 0, 0]}
Number of words: 416370
First 10 words: ['aaa', 'aaaa', 'aaaaa', 'aaaaaa', 'aaaaaaa', 'aaaaaaaa', 'aaaaaaaaa', 'aaaaaaaaaa', 'aaaaaaaaaaa', 'aaaaaaaaaaaa']
Number of documents: 112273
INFO:nbow:Reading /Users/francesc/.source{d}/nbow/default.asdf...
INFO:nbow:Building the repository names mapping...
INFO:similar_repos:Loaded nBOW model: {'created_at': datetime.datetime(2017, 6, 19, 9, 16, 8, 942880),
 'dependencies': [{'created_at': datetime.datetime(2017, 6, 18, 17, 37, 6, 255615),
                   'dependencies': [],
                   'model': 'id2vec',
                   'uuid': '92609e70-f79c-46b5-8419-55726e873cfc',
                   'version': [1, 0, 0]},
                  {'created_at': datetime.datetime(2017, 6, 19, 9, 59, 14, 766638),
                   'dependencies': [],
                   'model': 'docfreq',
                   'uuid': 'f64bacd4-67fb-4c64-8382-399a8e7db52a',
                   'version': [1, 0, 0]}],
 'model': 'nbow',
 'uuid': '1e3da42a-28b6-4b33-94a2-a5671f4102f4',
 'version': [1, 0, 0]}
Shape: (112273, 999424)
First 10 repos: ['ikizir/HohhaDynamicXOR', 'ditesh/node-poplib', 'Code52/MarkPadRT', 'wp-shortcake/shortcake', 'capaj/Moonridge', 'HugoGiraudel/hugogiraudel.github.com', 'crosswalk-project/crosswalk-website', 'apache/parquet-mr', 'dciccale/kimbo.js', 'processone/oneteam']
Traceback (most recent call last):
  File "/usr/local/bin/vecino", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/vecino/__main__.py", line 72, in main
    "vocabulary_max": args.vocabulary_max}
  File "/usr/local/lib/python3.6/site-packages/vecino/similar_repositories.py", line 53, in __init__
    assert self._nbow.get_dependency("id2vec")["uuid"] == self._id2vec.meta["uuid"]
AttributeError: 'NBOW' object has no attribute 'get_dependency'
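
For context, the failing assertion checks that the nBOW model was built from the same id2vec model that is currently loaded, by comparing UUIDs. The sketch below reproduces that check on plain dictionaries shaped like the metadata printed in the log above. `find_dependency` is a hypothetical helper written for illustration; it is not part of vecino, ast2vec, or modelforge, and the real fix is presumably a matching set of library versions.

```python
def find_dependency(meta, name):
    """Return the metadata dict of the dependency whose 'model' equals `name`.

    `meta` is assumed to look like the dicts printed in the log:
    {'model': ..., 'uuid': ..., 'dependencies': [ {...}, ... ]}.
    """
    for dep in meta.get("dependencies", []):
        if dep.get("model") == name:
            return dep
    raise KeyError("model %r has no dependency %r" % (meta.get("model"), name))


# Metadata copied from the log output (timestamps and versions omitted).
id2vec_meta = {
    "model": "id2vec",
    "uuid": "92609e70-f79c-46b5-8419-55726e873cfc",
    "dependencies": [],
}
nbow_meta = {
    "model": "nbow",
    "uuid": "1e3da42a-28b6-4b33-94a2-a5671f4102f4",
    "dependencies": [
        {"model": "id2vec",
         "uuid": "92609e70-f79c-46b5-8419-55726e873cfc",
         "dependencies": []},
        {"model": "docfreq",
         "uuid": "f64bacd4-67fb-4c64-8382-399a8e7db52a",
         "dependencies": []},
    ],
}

# The same consistency check as the assertion in the traceback.
same_id2vec = find_dependency(nbow_meta, "id2vec")["uuid"] == id2vec_meta["uuid"]
```

With the models from the log the UUIDs match, so the models themselves are consistent; the crash comes from the missing method rather than from mismatched model files.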

Just in case it's useful, this is what my pip3 list returns:

$ pip3 list
appnope (0.1.0)
args (0.1.0)
asdf (1.3.1)
ast2vec (0.3.6a0)
astropy (2.0.2)
attrs (17.3.0)
bblfsh (2.6.1)
bleach (1.5.0)
cachetools (2.0.1)
certifi (2017.11.5)
chardet (3.0.4)
clint (0.5.1)
cycler (0.10.0)
decorator (4.1.2)
docker (2.6.1)
docker-pycreds (0.2.1)
enum34 (1.1.6)
google-api-core (0.1.1)
google-auth (1.2.1)
google-cloud-core (0.28.0)
google-cloud-storage (1.6.0)
google-resumable-media (0.3.1)
googleapis-common-protos (1.5.3)
grpcio (1.7.0)
grpcio-tools (1.7.0)
h5py (2.7.1)
html5lib (0.9999999)
idna (2.6)
ipykernel (4.6.1)
ipython (6.2.1)
ipython-genutils (0.2.0)
jedi (0.11.0)
jsonschema (2.6.0)
jupyter-client (5.1.0)
jupyter-core (4.4.0)
Keras (2.0.9)
lz4 (0.11.1)
Markdown (2.6.9)
matplotlib (2.1.0)
modelforge (0.3.1a0)
netifaces (0.10.6)
numpy (1.13.3)
olefile (0.44)
parso (0.1.0)
pexpect (4.3.0)
pickleshare (0.7.4)
Pillow (4.3.0)
pip (9.0.1)
pluggy (0.6.0)
prompt-toolkit (1.0.15)
protobuf (3.4.0)
ptyprocess (0.5.2)
py (1.5.2)
pyasn1 (0.4.2)
pyasn1-modules (0.2.1)
Pygments (2.2.0)
pyparsing (2.2.0)
PyStemmer (1.3.0)
pytest (3.3.0)
python-dateutil (2.6.1)
pytz (2017.3)
PyYAML (3.12)
pyzmq (16.0.3)
requests (2.18.4)
rsa (3.4.2)
scipy (0.19.1)
semantic-version (2.6.0)
setuptools (36.5.0)
simplegeneric (0.8.1)
six (1.11.0)
tensorflow (1.4.0)
tensorflow-tensorboard (0.4.0rc2)
tornado (4.5.2)
traitlets (4.3.2)
urllib3 (1.22)
vecino (0.1.5a0)
wcwidth (0.1.7)
websocket-client (0.44.0)
Werkzeug (0.12.2)
wheel (0.30.0)
wmd (1.2.6)

Thanks! I hadn't pinned the dependency versions properly, so the code fell out of date. It should be working now.

It is still not working. I then realized it requires a running bblfsh server.
Is that the case? Does it need to be local and listening on port 9432, or is there a way to configure this?

Right, it needs a running server. You can point to it with --bblfsh whatever:9432

Would it be worth creating an issue to improve the error message when the server is not available?
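
One possible shape for such a check, as a minimal sketch: try a plain TCP connection to the configured endpoint before doing any work, and fail with a message that names the --bblfsh option. `parse_endpoint` and `check_bblfsh_endpoint` are hypothetical names, not existing vecino or bblfsh functions, and a successful TCP connection only shows that something is listening, not that it is actually a Babelfish server.

```python
import socket


def parse_endpoint(endpoint, default_port=9432):
    """Split "host:port" into (host, port); fall back to the default port."""
    host, sep, port = endpoint.rpartition(":")
    if not sep:
        # No colon at all: the whole string is the host name.
        return endpoint, default_port
    return host, int(port)


def check_bblfsh_endpoint(endpoint, timeout=3.0):
    """Raise a readable error if nothing accepts TCP connections at `endpoint`."""
    host, port = parse_endpoint(endpoint)
    try:
        # Open and immediately close a connection; we only test reachability.
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError as exc:
        raise RuntimeError(
            "Cannot reach a Babelfish server at %s:%d (%s). "
            "Start one, or point to a running server with --bblfsh host:port."
            % (host, port, exc)
        ) from exc
```

Calling `check_bblfsh_endpoint("whatever:9432")` at startup would turn the current warning followed by a later failure into a single early error that tells the user what to change.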