tomasonjo/blogs

Error at # Define random walk query


Following along. I have worked within Neo4j Desktop and the data appears correct.
I am getting an error trying to implement your code for word2vec.

```python
# Define random walk query
random_walks_query = """

MATCH (node)
CALL gds.alpha.randomWalk.stream('all', {
  start: id(node),
  steps: 15,
  walks: 5
})
YIELD nodeIds
// Return the names or the titles
RETURN [id in nodeIds |
    coalesce(gds.util.asNode(id).name,
             gds.util.asNode(id).title)] as walks

"""
# Fetch data from Neo4j
with driver.session() as session:
    walks = session.run(random_walks_query)
# Train the word2vec model
clean_walks = [row['walks'] for row in walks]
model = Word2Vec(clean_walks, sg=1, window=5, size=100)
# Inspect results
model.most_similar('olive oil')
```

I am getting:

TypeError Traceback (most recent call last)
in
20 # Train the word2vec model
21 clean_walks = [row['walks'] for row in walks]
---> 22 model = Word2Vec(clean_walks, sg=1, window=5, size=100)
23 # Inspect results
24 model.most_similar('olive oil')

TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'

This error means that some of the input to the word2vec algorithm contains null values. In this query that happens when a node in a walk has neither a `name` nor a `title` property, so `coalesce` returns null. Are you using the same dataset as I am? Try to get the random walk query to return no null values, or filter them out later in Python. Also, do you have the projected graph loaded in memory?
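As a rough illustration of the "filter them out later in Python" option, here is a minimal sketch. The helper name `drop_null_tokens`, its `min_length` parameter, and the sample walks are hypothetical and not part of the original post. It assumes each walk is a plain list of strings that may contain `None`.

```python
def drop_null_tokens(walks, min_length=2):
    """Remove None entries from each walk and discard walks that end up too short.

    `min_length` is an illustrative threshold: a walk with fewer than two
    tokens gives word2vec no context pairs to learn from.
    """
    cleaned = []
    for walk in walks:
        # Keep only tokens that actually have a value
        tokens = [token for token in walk if token is not None]
        if len(tokens) >= min_length:
            cleaned.append(tokens)
    return cleaned


# Hypothetical sample standing in for the rows returned by the random walk query
sample_walks = [
    ["olive oil", "garlic", None, "basil"],
    [None, None],
    ["tomato", "pasta"],
]
clean_walks = drop_null_tokens(sample_walks)
print(clean_walks)
```

With the code from the question, this would presumably be applied as `clean_walks = drop_null_tokens(row['walks'] for row in walks)` before the list is passed to `Word2Vec`. Note that dropping a null token joins its two neighbours in the walk, so if that matters it may be cleaner to fix the query so that every node returns a value.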