patzaw/neo2R

Error importing from dataframe

Closed this issue · 4 comments

I'm following the example to import from a dataframe and I'm getting this error:

> import_from_df(
+   graph=graph,
+   cql='MERGE (n:TestNode {name:row.name, value:toFloat(row.value)})',
+   toImport=nodes
+ )
Neo.ClientError.Statement.ExternalResourceFailed
Couldn't load the external resource at: file:/var/lib/neo4j/import/file385055f926f8
Error in cypher(graph = graph, query = cql, ...) : neo4j error

System information

Windows 10
WSL2 Ubuntu 20.04

R information

R v4.0.5
neo2R v2.1.0

Neo4j

Running Neo4j using:
docker run -d -p 7473:7473 -p 7474:7474 -p 7687:7687 --name neo4j-server neo4j-server

Note

I set all of the environment variables this way:

Sys.setenv(HOME="C:/Users/balter")

Sys.setenv(
  CONTAINER="neo4j_cont",
  NJ_VERSION="4.3.4",
  NJ_HTTP_PORT=7474,
  NJ_BOLT_PORT=7687,
  ### Authorization
  ### ---set to 'none' if you want to disable authorization
  NJ_AUTH="neo4j/n4jpw"
  ### APOC download
  ### --- should already be installed in the docker image
  # export NJ_APOC=https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/4.0.0.2/apoc-4.0.0.2-all.jar
  # export NJ_APOC=https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/3.5.0.7/apoc-3.5.0.7-all.jar
)

Sys.setenv(
  ### Change the location of the Neo4j directory
  NJ_HOME=file.path(Sys.getenv("HOME"), "neo4j_home"),
  NJ_IMPORT=file.path(Sys.getenv("HOME"), "neo4jImport"),
  NJ_PLUGINS=file.path(Sys.getenv("HOME"), "neo4j_home", "neo4jPlugins"),
  NJ_DATA=file.path(Sys.getenv("HOME"), "neo4j_home", "neo4jData")
)

Hi

I think you forgot the following option when running your docker container:

--volume $NJ_IMPORT:/var/lib/neo4j/import
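To make the missing option concrete, here is a minimal sketch in R that assembles the `docker run` arguments from the command in the original post plus the `--volume` mapping. The helper name `neo4j_docker_args` is hypothetical (it is not part of neo2R), and the image and container name `neo4j-server` are simply copied from the report; adapt them to your setup.

```r
# Hypothetical helper (not part of neo2R): build the arguments for `docker run`
# so that a host directory is mounted on the container's import directory.
neo4j_docker_args <- function(import_dir) {
  c(
    "run", "-d",
    "-p", "7473:7473", "-p", "7474:7474", "-p", "7687:7687",
    # Map the host import directory to the path Neo4j reads imports from
    "--volume", paste0(path.expand(import_dir), ":/var/lib/neo4j/import"),
    # Container name and image name as given in the original report
    "--name", "neo4j-server", "neo4j-server"
  )
}

args <- neo4j_docker_args(tempdir())
# Print the command for inspection instead of running it
cat("docker", args, "\n")
```

Once the printed command looks right, it could be run from a shell or passed to `system2("docker", args)`.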

Then, in R, when you connect to neo4j with startGraph(), you must specify the import directory ($NJ_IMPORT variable) with the importPath parameter. For example:

library(neo2R)
graph <- startGraph(
  "localhost:7474",
  username="neo4j", password="1234",
  importPath="~/neo4j_home/neo4jImport"
)

I hope it helps. Tell me if it works or if you need additional information
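Before calling import_from_df(), it can help to confirm that the directory passed as importPath exists and is writable from the R session. The sketch below uses only base R; `check_import_path` is a hypothetical helper, not a neo2R function. It only checks the R side: it cannot tell whether the container actually sees the same directory.

```r
# Hypothetical helper (not part of neo2R): TRUE if R can create a file in the
# directory intended for importPath, FALSE otherwise.
check_import_path <- function(import_path) {
  path <- path.expand(import_path)
  if (!dir.exists(path)) {
    return(FALSE)
  }
  # Try to write a small probe file, treating any warning or error as failure
  probe <- tempfile(tmpdir = path)
  ok <- tryCatch(
    {
      writeLines("probe", probe)
      file.exists(probe)
    },
    error = function(e) FALSE,
    warning = function(w) FALSE
  )
  unlink(probe)
  ok
}

# Example with a directory that is known to exist in any R session
check_import_path(tempdir())
```

If this returns FALSE for your import directory, fix the path or its permissions first. If it returns TRUE and the import still fails, the volume mapping on the Docker side is the more likely culprit.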

I still haven't gotten it to work. However, before I try any further, I have a question. Does import_from_df work automatically? Does it create nodes with all columns? Does it include the schema? I'm hoping to find a simple way to import a dataframe COMPLETELY without having to write Cypher code, if such a thing exists.

@patzaw -- I got it to work finally. One problem was that I was using R in Windows (RStudio) but docker in WSL2. I switched to using RStudio in WSL2. I volume mapped the import directory as you suggested. Now it runs smoothly.

In principle, one could automate the import without needing the cql by building the query using the column names and datatypes of those columns. I'm just starting out, so I don't see any other cql statement I would want to use on a simple import. But maybe down the road it will become apparent why I would want to write the cql rather than black box it.
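A minimal sketch of that idea, in base R: derive the MERGE statement from the column names and column types of the data frame. `build_merge_cql` is a hypothetical helper, not something neo2R provides. It assumes the rows arrive in Cypher as strings under the name `row` (as in the query above), so numeric and logical columns are wrapped in the Cypher conversion functions toInteger(), toFloat() and toBoolean(). It also assumes column names are plain identifiers that need no quoting.

```r
# Hypothetical helper (not part of neo2R): build a MERGE query that maps every
# column of `df` to a node property, casting by R column type.
build_merge_cql <- function(df, label) {
  props <- vapply(
    names(df),
    function(col) {
      value <- paste0("row.", col)
      if (is.integer(df[[col]])) {
        value <- sprintf("toInteger(%s)", value)
      } else if (is.numeric(df[[col]])) {
        value <- sprintf("toFloat(%s)", value)
      } else if (is.logical(df[[col]])) {
        value <- sprintf("toBoolean(%s)", value)
      }
      # Character and other column types are passed through unchanged
      sprintf("%s:%s", col, value)
    },
    character(1)
  )
  sprintf("MERGE (n:%s {%s})", label, paste(props, collapse = ", "))
}

nodes <- data.frame(
  name = c("a", "b"),
  value = c(1.5, 2),
  stringsAsFactors = FALSE
)
cql <- build_merge_cql(nodes, "TestNode")
cat(cql, "\n")
```

For this example the generated string is the same query as in the original report, so it could be passed as the cql argument of import_from_df(). Note that merging on every property is a design choice: for real data you would more likely merge on a key column and set the remaining properties separately.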

I'm glad you got it running finally. Thanks for keeping me informed.
Personally, I prefer writing Cypher queries since they fit the philosophy behind Neo4j and they make the model clearer in my mind. neo2R does not provide any way to import data without cql, but you can indeed imagine one.