Neo4j

Queries

  1. Which are the top 5 authors with the most citations (from other papers). Return author names and number of citations.
  2. Which are the top 5 authors with the most collaborations (with different authors). Return author names and number of collaborations.
  3. Which is the author who has written the most papers without collaborations? Return author name and number of papers.
  4. Which author published the most papers in 2001? Return author name and number of papers.
  5. Which is the journal with the most papers about “gravity” (derived only from the paper title) in 1998. Return journal and number of papers.
  6. Which are the top 5 papers with the most citations? Return paper title and number of citations.
  7. Which were the papers that use “holography” and “anti de sitter” (derived only from the paper abstract). Return authors and title.
  8. Find the shortest path between ‘C.N. Pope’ and ‘M. Schweda’ authors (use any type of edges). Return the path and the length of the path. Comment about the type of nodes and edges of the path.
  9. Run again the previous query (8) but now use only edges between authors and papers. Comment about the type of nodes and edges of the path. Compare the results with query 8.
  10. Find all authors with shortest path lengths > 25 from author ‘Edward Witten’. The shortest paths will be calculated only on edges between authors and articles. Return author name, the length and the paper titles for each path.

IMPORTING DATA

create constraint on (a:Author) assert a.name is unique; 
create constraint on (o:Article) assert o.id is unique;
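
The `create constraint on ... assert` form above is the older constraint syntax. On newer servers the same two uniqueness constraints are written with `FOR ... REQUIRE`. A minimal sketch, assuming Neo4j 4.4 or later; the constraint names `author_name_unique` and `article_id_unique` are illustrative and not part of the original setup:

```cypher
// Assumes Neo4j 4.4+; the constraint names are hypothetical.
CREATE CONSTRAINT author_name_unique IF NOT EXISTS
FOR (a:Author) REQUIRE a.name IS UNIQUE;

CREATE CONSTRAINT article_id_unique IF NOT EXISTS
FOR (o:Article) REQUIRE o.id IS UNIQUE;
```

Creating the constraints before the import also gives the later `match` lookups on `name` and `id` a backing index.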

Loading article nodes (29555 nodes):

LOAD CSV FROM "file:///ArticleNodes.csv" as row
CREATE (a:Article{id: toInteger(row[0])})
set a.title = row[1],
a.year = date(row[2]),
a.jurnal = row[3],
a.abstract = row[4]

Loading author nodes:

LOAD CSV FROM "file:///AuthorNodes.csv" as row
Merge (a:Author{name: row[1]})

Added 15420 labels, created 15420 nodes, set 15420 properties, completed after 833 ms.

Linking authors to their articles:

LOAD CSV FROM "file:///AuthorNodes.csv" as row
match(a:Article{id: toInteger(row[0])})
match(o:Author{name: row[1]})
create(o)-[:WROTE]->(a)

Loading citations (each row is a tab-separated pair of article ids; self-citations are skipped):

LOAD CSV FROM "file:///Citations.csv" as row
with row, split(row[0], "\t") as srow
match(article1:Article{id: toInteger(srow[0])})
match(article2:Article{id: toInteger(srow[1])})
where article1.id <> article2.id
create(article1)-[:REFERENCES]->(article2)
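
After the import, a quick sanity check is to count nodes per label and relationships per type, then compare the numbers with those reported by the import statements (for example 29555 articles and 15420 authors). A minimal sketch that uses nothing beyond the graph built above:

```cypher
// Node counts per label
MATCH (n)
RETURN labels(n) AS label, count(*) AS nodes;

// Relationship counts per type
MATCH ()-[r]->()
RETURN type(r) AS type, count(*) AS relationships;
```

A misspelled label or relationship type shows up here as an unexpected extra row, which is easier to spot than a query that silently returns nothing.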

one

match(a:Article)-[:REFERENCES]->(receiver:Article)
match(o:Author)-[:WROTE]->(receiver)
return o.name, count(receiver) as count
order by count desc
limit 5

Created 58340 relationships, completed after 1529 ms.

two

MATCH(author1:Author)-[:WROTE]->(a:Article)
MATCH(author2:Author)-[:WROTE]->(a)
where author1.name <> author2.name
MERGE (author1)-[:COLLABORATED]->(author2)

//then 

MATCH(a:Author)-[c:COLLABORATED]->(b:Author)
return a.name , count(c) as count
order by count desc
limit 5

author collaborations
"C.N. Pope" 50
"S. Ferrara" 46
"M. Schweda" 46
"H. Lu" 45
"C. Vafa" 45

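The same ranking can be computed without materialising COLLABORATED edges, by counting distinct co-authors directly. A sketch that uses only the labels and relationship types defined during the import:

```cypher
// For each author, count the distinct other authors who share at least one article.
MATCH (a1:Author)-[:WROTE]->(:Article)<-[:WROTE]-(a2:Author)
WHERE a1 <> a2
RETURN a1.name AS author, count(DISTINCT a2) AS collaborations
ORDER BY collaborations DESC
LIMIT 5
```

Since the MERGE creates one COLLABORATED edge per ordered pair of co-authors, both approaches should produce the same counts; this variant simply leaves the graph unchanged.
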
three

match(a:Author)-[:WROTE]->(article:Article)
with article as article,count(*) as c, a as a
where c<2
return a.name, count(a.name) as amount
order by amount desc

author name number of papers
"C.N. Pope" 127
"H. Lu" 122
"A.A. Tseytlin" 111
"Edward Witten" 98
"Shinichi Nojiri" 92

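In the query above, both `article` and `a` act as grouping keys in the `with` clause, so `c` is 1 for every author-article pair and the `c<2` filter never removes anything; the counts therefore appear to be total papers per author rather than solo papers. One way to restrict the count to single-author papers is to count authors per article first. A sketch of that variant (it is not the query that produced the table above, and its results are not shown here):

```cypher
// Collect the authors of each article, keep articles with exactly one author,
// then count those articles per author.
MATCH (a:Author)-[:WROTE]->(article:Article)
WITH article, collect(DISTINCT a) AS authors
WHERE size(authors) = 1
WITH authors[0] AS author, count(article) AS amount
RETURN author.name, amount
ORDER BY amount DESC
LIMIT 1
```
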
four

match(o:Author)-[:WROTE]->(a:Article)
where a.year = date("2001")
return o.name, count(o) as count
order by count desc limit 1

result -> "Ashok Das" 17

five

match(a:Article)
where a.title CONTAINS 'gravity' and a.year = date("1998")
return a.jurnal, count(*) as rank
order by rank desc
limit 1

result -> "Nucl.Phys." 25

six

match(:Article)-[:REFERENCES]->(a:Article)
return a.title, count(*) as rank
order by rank desc
limit 5

result:

title citations
"The Large N Limit of Superconformal Field Theories and Supergravity" 2414
"Anti De Sitter Space And Holography" 1775
"Gauge Theory Correlators from Non-Critical String Theory" 1641
"Monopole Condensation And Confinement In N=2 Supersymmetric Yang-Mills" 1299
"M Theory As A Matrix Model: A Conjecture" 1199

seven

match(o:Author)-[:WROTE]->(a:Article)
where a.abstract contains "holography" and a.abstract contains "anti de sitter"
return o.name, a.title

result -> null

match(o:Author)-[:WROTE]->(a:Article)
where a.abstract contains "holography" 
return o.name, a.title

result:

author name title
"Petr Horava" "Probable Values of the Cosmological Constant in a Holographic Theory"
"Djordje Minic" "Probable Values of the Cosmological Constant in a Holographic Theory"
"U. Moschella" "Decomposing Quantum Fields on Branes"
"R. Schaeffer" "Decomposing Quantum Fields on Branes"
"J. Bros" "Decomposing Quantum Fields on Branes"
"M. Bertola" "Decomposing Quantum Fields on Branes"
"V. Gorini" "Decomposing Quantum Fields on Branes"
"Itzhak Bars" "Two-Time Physics in Field Theory"
"J.G. Russo" "Hyperbolic Spaces in String and M-Theory"
"A. Kehagias" "Hyperbolic Spaces in String and M-Theory"

match(o:Author)-[:WROTE]->(a:Article)
where a.abstract contains "anti de sitter" 
return o.name, a.title

result -> null
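
`contains` is case-sensitive, so an abstract that spells the phrase as "Anti de Sitter" or "anti-de Sitter" does not match the lowercase literal, which may explain the empty result. A sketch of a case-insensitive variant; the hyphenated spelling is an assumption about how the abstracts might be written, not something verified against the data:

```cypher
// Lower-case the abstract once, then test both terms.
// The 'anti-de sitter' alternative is an assumed spelling variant.
MATCH (o:Author)-[:WROTE]->(a:Article)
WITH o, a, toLower(a.abstract) AS abstract
WHERE abstract CONTAINS 'holography'
  AND (abstract CONTAINS 'anti de sitter' OR abstract CONTAINS 'anti-de sitter')
RETURN a.title AS title, collect(o.name) AS authors
```

Collecting the author names returns one row per paper instead of one row per author-paper pair.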

eight

match(a:Author{name: 'C.N. Pope'})
match(b:Author{name: 'M. Schweda'})
match p=shortestPath((a)-[*]-(b))
return p, length(p) as length

result -> length = 4
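
To comment on the types of nodes and edges along the path, it helps to return the labels and relationship types explicitly rather than reading them off the graph view. A minimal sketch using the built-in `nodes()`, `relationships()`, `labels()` and `type()` functions:

```cypher
MATCH (a:Author {name: 'C.N. Pope'})
MATCH (b:Author {name: 'M. Schweda'})
MATCH p = shortestPath((a)-[*]-(b))
RETURN length(p) AS length,
       [n IN nodes(p) | labels(n)] AS nodeLabels,      // label list of each node, in path order
       [r IN relationships(p) | type(r)] AS edgeTypes  // type of each edge, in path order
```

If COLLABORATED edges from query two exist in the graph, an unrestricted path can use them as shortcuts between authors, which is worth keeping in mind when comparing with query nine.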

nine

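match(a:Author{name: 'C.N. Pope'})
match(b:Author{name: 'M. Schweda'})
match p=shortestPath((a)-[:WROTE*]-(b))
return p, length(p) as length

The single-hop pattern (a)-[:WROTE]-(b) returns no result, because a WROTE edge always joins an author to an article and never two authors directly. The variable-length pattern [:WROTE*] is what lets the search follow author-article-author chains.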