Accenture/AmpliGraph

load_from_ntriples() doesn't work as expected

koustav123 opened this issue · 1 comments

I have an N-Triples file named sample_data.nt with the following contents:

<http://dbpedia.org/resource/abcd> <http://example.org#industry> <http://dbpedia.org/resource/xyz> .
<http://dbpedia.org/resource/efgh> <http://example.org#industry> <http://dbpedia.org/resource/x1y1z1> .
<http://dbpedia.org/resource/ijkl.> <http://example.org#industry> <http://dbpedia.org/resource/x2y2z2> .

Running the following code:

from ampligraph.datasets import load_from_ntriples
X = load_from_ntriples('', 'sample_data.nt','./')
X

produces this result, in which every object URI is cut off at its first '.':

array([['<http://dbpedia.org/resource/abcd>',
        '<http://example.org#industry>', '<http://dbpedia'],
       ['<http://dbpedia.org/resource/efgh>',
        '<http://example.org#industry>', '<http://dbpedia'],
       ['<http://dbpedia.org/resource/ijkl.>',
        '<http://example.org#industry>', '<http://dbpedia']], dtype=object)

This can be fixed by loading the file with the following function:

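import pandas as pd

def load_frm_nt(filename):
    # Split each line on runs of whitespace and keep only the first three
    # fields (subject, predicate, object); the trailing '.' column is dropped.
    # Note: this assumes no field contains whitespace (e.g. no literals with spaces).
    X = pd.read_csv(filename,
                    sep=r'\s+',
                    header=None,
                    names=None,
                    dtype=str,
                    usecols=[0, 1, 2])
    return X.to_numpy()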

which produces the expected result:

array([['<http://dbpedia.org/resource/abcd>',
        '<http://example.org#industry>',
        '<http://dbpedia.org/resource/xyz>'],
       ['<http://dbpedia.org/resource/efgh>',
        '<http://example.org#industry>',
        '<http://dbpedia.org/resource/x1y1z1>'],
       ['<http://dbpedia.org/resource/ijkl.>',
        '<http://example.org#industry>',
        '<http://dbpedia.org/resource/x2y2z2>']], dtype=object)
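For anyone who wants to check the expected behaviour without pandas, here is a minimal dependency-free sketch of the same idea. The names parse_ntriples_line and SAMPLE are made up for illustration and are not part of AmpliGraph. It assumes one statement per line with no trailing comment after the final '.', and it is not a full N-Triples parser.

```python
# Hypothetical helper, not part of AmpliGraph: a rough N-Triples line splitter.
SAMPLE = """\
<http://dbpedia.org/resource/abcd> <http://example.org#industry> <http://dbpedia.org/resource/xyz> .
<http://dbpedia.org/resource/efgh> <http://example.org#industry> <http://dbpedia.org/resource/x1y1z1> .
<http://dbpedia.org/resource/ijkl.> <http://example.org#industry> <http://dbpedia.org/resource/x2y2z2> .
"""


def parse_ntriples_line(line):
    """Split one statement into [subject, predicate, object], or return None."""
    line = line.strip()
    if not line or line.startswith('#'):
        return None  # skip blank lines and comment lines
    # Subject and predicate contain no whitespace, so split off only the
    # first two tokens; the remainder is the object plus the final '.'.
    subject, predicate, rest = line.split(None, 2)
    # Remove only the statement terminator, leaving dots inside URIs intact.
    if rest.endswith('.'):
        rest = rest[:-1].rstrip()
    return [subject, predicate, rest]


parsed = (parse_ntriples_line(line) for line in SAMPLE.splitlines())
triples = [t for t in parsed if t is not None]
```

Because only the first two tokens are split off, an object literal that contains spaces stays in one piece, which the whitespace-separated read_csv approach above would not handle.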

It would be great to have this fixed!

Thanks for spotting this.

A patch is now available on the develop branch and will be released soon in version 1.4.1.

In the meantime, use version 1.4-dev, which can be installed as follows:

git clone https://github.com/Accenture/AmpliGraph.git
cd AmpliGraph
git checkout develop
pip install .