anhaidgroup/deepmatcher

Getting an error when training a model in Google Colab

Closed this issue · 6 comments

I am using Google Colab.
When I tried to process data with fastText embeddings for French, I set it up like this:

import deepmatcher as dm

train_set, validation_set = dm.data.process(
    path='drive/My Drive/recommandersystem/deepmatcher_model',
    cache='train_cache.pth',
    train='train.csv',
    validation='valid.csv',
    embeddings='fasttext.fr.bin',
    embeddings_cache_path='drive/My Drive/recommandersystem/deepmatcher_model',
    ignore_columns=['id',''],
    id_attr='_id',
    label_attr='label',
    left_prefix='ltable_',
    right_prefix='rtable_')

and I get this error message:

HTTPError Traceback (most recent call last)
in ()
11 label_attr='label',
12 left_prefix='ltable_',
---> 13 right_prefix='rtable_')

13 frames

/usr/lib/python3.6/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
648 class HTTPDefaultErrorHandler(BaseHandler):
649 def http_error_default(self, req, fp, code, msg, hdrs):
--> 650 raise HTTPError(req.full_url, code, msg, hdrs, fp)
651
652 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 403: Forbidden

How can I solve this, please?

I guess Colab does not allow downloads, and deepmatcher is probably trying to download the fastText vectors. First download the file manually and place it in your Google Drive. You can find the download link in the code.

And how can I pass it to the data processing step?

Set the path to the file through a parameter of data.process.
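A minimal sketch of how one might verify that the manually downloaded file is in place before calling dm.data.process. The helper `embeddings_ready` is hypothetical, and the file name `wiki.fr.bin` is an assumption about what deepmatcher looks for in the cache directory; check the deepmatcher source for the exact name.

```python
import os


def embeddings_ready(cache_dir, filename):
    """Return True if a non-empty file named `filename` exists in `cache_dir`."""
    path = os.path.join(cache_dir, filename)
    return os.path.isfile(path) and os.path.getsize(path) > 0


# Directory taken from the question above.
cache_dir = 'drive/My Drive/recommandersystem/deepmatcher_model'
# Assumption: the name of the embeddings file expected in the cache directory.
expected_file = 'wiki.fr.bin'

if embeddings_ready(cache_dir, expected_file):
    print('Embeddings found; pass embeddings_cache_path=cache_dir to dm.data.process.')
else:
    print('Embeddings not found in', cache_dir, '- download and copy them there first.')
```

If the check passes, pointing embeddings_cache_path at the same directory should let the library reuse the local copy instead of attempting a download.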

With the "embeddings_cache_path" parameter? Like this:

embeddings='fasttext.fr.bin', 
embeddings_cache_path='drive/My Drive/recommandersystem/deepmatcher_model', 

Did it work, @walide67?

FastText for non-English languages (e.g. embeddings='fasttext.fr.bin') should now work with the latest release (0.1.2.post1). It appears Facebook changed the URLs of their word embeddings. Please reopen if the error persists.
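To confirm that the installed package is at least the release mentioned above, a rough check along these lines may help. It is a sketch only: `parse_version` and `installed_has_fix` are hypothetical helpers, the parser handles only plain `X.Y.Z` and `X.Y.Z.postN` version strings, and `importlib.metadata` assumes Python 3.8 or newer.

```python
import re
from importlib import metadata


def parse_version(v):
    """Parse 'X.Y.Z' or 'X.Y.Z.postN' into a comparable tuple (simplified)."""
    m = re.fullmatch(r"(\d+(?:\.\d+)*)(?:\.post(\d+))?", v)
    if m is None:
        raise ValueError("unsupported version string: " + v)
    release = tuple(int(part) for part in m.group(1).split("."))
    # A missing post-release number sorts before any explicit one.
    post = int(m.group(2)) if m.group(2) else -1
    return release, post


def installed_has_fix(package="deepmatcher", required="0.1.2.post1"):
    """Return True if `package` is installed at version `required` or later."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False
    return parse_version(installed) >= parse_version(required)


print("deepmatcher has the fix:", installed_has_fix())
```

If this reports False, upgrading the package with pip and restarting the Colab runtime is the usual next step.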