abojchevski/graph2gauss

pubmed dataset loading data error

chenjohnai opened this issue · 3 comments

I am running the demo code from example.ipynb, with the dataset changed to pubmed:

g = load_dataset('data/pubmed.npz')
When I initialize the Graph2Gauss class with
g2g = Graph2Gauss(A=A, X=X, L=128, verbose=True)
the following error occurs:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-6c49a6402f64> in <module>()
----> 1 g2g = Graph2Gauss(A=A, X=X, L=128, verbose=True)

/mnt/hp_raid/john-works/weibodata/graph2gauss/g2g/model.py in __init__(self, A, X, L, K, p_val, p_test, n_hidden, max_iter, tolerance, scale, seed, verbose)
     61         train_ones, val_ones, val_zeros, test_ones, test_zeros = train_val_test_split_adjacency(
     62             A=A, p_val=p_val, p_test=p_test, seed=seed, neg_mul=1, every_node=True, connected=False,
---> 63             undirected=(A != A.T).nnz == 0)
     64
     65         # pre-compute the hops for each node for more efficient sampling

/mnt/hp_raid/john-works/weibodata/graph2gauss/g2g/utils.py in train_val_test_split_adjacency(A, p_val, p_test, seed, neg_mul, every_node, connected, undirected)
    120             hold_edges_d1 = np.column_stack((not_in_cover[d_nic > 0],
    121                                              np.row_stack(map(np.random.choice,
--> 122                                                               A[not_in_cover[d_nic > 0]].tolil().rows))))
    123
    124             if np.any(d_nic == 0):

/home/aduman/anaconda3/lib/python3.6/site-packages/numpy/core/shape_base.py in vstack(tup)
    232
    233     """
--> 234     return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
    235
    236 def hstack(tup):

ValueError: need at least one array to concatenate


tensorflow version: 1.4.0,
numpy version: 1.14.2,
scipy version: 1.0.0
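
For reference, the final ValueError in the traceback is the message NumPy raises when it is asked to stack an empty sequence, so the `map(...)` over `.rows` at line 122 of utils.py apparently produced no rows at all. The sketch below reproduces only that NumPy behaviour, independent of graph2gauss; `stack_rows` and `try_stack` are hypothetical helper names introduced for illustration, not functions from the repository.

```python
import numpy as np


def stack_rows(rows):
    """Stack row-like sequences vertically (hypothetical helper)."""
    # Materialise the iterable first: newer NumPy versions expect a
    # sequence here rather than a lazy iterator such as map(...).
    return np.vstack(list(rows))


def try_stack(rows):
    """Return (stacked_array, None) on success or (None, error_message)."""
    try:
        return stack_rows(rows), None
    except ValueError as exc:
        return None, str(exc)


# Two rows stack fine.
result, error = try_stack([[0, 1], [2, 3]])
print(result.shape)

# An empty input triggers the same ValueError as in the traceback.
empty_result, empty_error = try_stack([])
print(empty_error)
```

If this is what happens in the split, checking whether `not_in_cover[d_nic > 0]` is empty before stacking would be one place to start debugging.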

Thank you for submitting this issue. Unfortunately, I am not able to reproduce it on my end.

The issue seems to be related to the train-validation-test split of the edges/non-edges. Did you modify the p_val and p_test parameters or did you keep their default values?

Tensorflow: 1.4.1
Numpy: 1.14.2
Scipy: 1.0.0

Which OS and which version of Networkx are you using?

I tested it on Ubuntu 17.10 and Ubuntu 16.04.
Networkx: 2.0

Hi, thank you for the reply.
I found that the Networkx version was causing the error.

My previous testing was on Networkx 1.11 and Debian 8.
After I upgraded Networkx to 2.1, the error was gone.
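
For anyone hitting the same error, a quick way to compare environments is to print the installed versions of the packages mentioned in this thread. This is a minimal sketch: `installed_versions` and `major_version` are hypothetical helpers, and the "Networkx 2.x or later" check is only an assumption drawn from this thread (the error appeared on 1.11 and not on 2.0 or 2.1), not a documented requirement of graph2gauss.

```python
import importlib


def installed_versions(names):
    """Map each module name to its __version__, or None if not importable."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


def major_version(version):
    """Return the leading integer of a version string, or None if absent."""
    head = version.split(".")[0]
    return int(head) if head.isdigit() else None


if __name__ == "__main__":
    found = installed_versions(["tensorflow", "numpy", "scipy", "networkx"])
    for name, version in found.items():
        print(name, version if version is not None else "not installed")

    # Assumption based on this thread only: 1.11 failed, 2.0 and 2.1 worked.
    nx_version = found["networkx"]
    if nx_version is not None:
        major = major_version(nx_version)
        if major is not None and major < 2:
            print("Networkx", nx_version, "is older than 2.0; consider upgrading.")
```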