/NLM

Memory for Knowledge Graph, using Neo4j. Knowledge graph storage and query.

Primary language: Python. License: MIT.


This repo focuses on memory for NLP. Specifically, it memorizes (stores) a node or relationship in a knowledge graph (in practice, a Neo4j database instance) and recalls (queries) a node or relationship from that memory. It is not only a Python module but also an RPC service that is easy to set up.

Typical scenarios:

  • When the input is a node or relationship
    • Use partial information about a node or relationship to recall the matching node or relationship, with its full information, from memory.
    • Automatically add the node or relationship when there is nothing to recall.
    • Automatically update the properties of the node or relationship when one has been recalled.
  • When the input is a raw string or NLU output
    • Automatically extract nodes or relationships from the input.
    • Then proceed as above.
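The recall, add, and update behaviour listed above can be sketched with a small in-memory stand-in. This is only an illustration under stated assumptions: `Node` and `ToyMemory` are hypothetical names, a plain dict replaces Neo4j, and the real module may differ in detail.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a graph node: label + name + properties."""
    label: str
    name: str
    props: dict = field(default_factory=dict)


class ToyMemory:
    """In-memory illustration only; the real module talks to Neo4j."""

    def __init__(self, add_inexistence: bool = False, update_props: bool = False):
        self.add_inexistence = add_inexistence
        self.update_props = update_props
        self._store: Dict[Tuple[str, str], Node] = {}

    def __call__(self, query: Node) -> List[Node]:
        key = (query.label, query.name)
        found = self._store.get(key)
        if found is None:
            # Nothing to recall: optionally memorize the query itself.
            if self.add_inexistence:
                self._store[key] = Node(query.label, query.name, dict(query.props))
            return []
        if self.update_props:
            # Something was recalled: merge the incoming properties into it.
            found.props.update(query.props)
        # Return a copy so callers cannot mutate the stored node by accident.
        return [Node(found.label, found.name, dict(found.props))]
```

In this sketch, the first call for an unknown node returns an empty list and stores the node; a later call with extra properties returns the stored node with those properties merged in.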

The extractor is in development.

Recall is keyed on a node's label and name, or on a relationship's start, end, and kind; properties are mainly used to sort the results.
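To make "properties are used to sort the results" concrete, here is a minimal sketch of one possible ranking. The scoring rule (count of exactly matching properties) and the function name `rank_by_props` are assumptions for illustration, not the module's actual scoring.

```python
from typing import List


def rank_by_props(query_props: dict, candidates: List[dict], topn: int = 1) -> List[dict]:
    """Order candidates by how many query properties they match exactly."""

    def score(candidate: dict) -> int:
        props = candidate.get("props", {})
        return sum(1 for key, value in query_props.items() if props.get(key) == value)

    # sorted() is stable, so candidates with equal scores keep their original order.
    return sorted(candidates, key=score, reverse=True)[:topn]
```

Candidates that were already selected by label and name are only reordered here; none is excluded because of its properties, and `topn` simply truncates the ranked list.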

Chinese documentation and design notes: Natural Language Memory Module (NLM) | Yam

Setup

IMPORTANT: only Python 3.7+ is supported.

  • Step 1: Install dependencies

    # with pipenv
    $ pipenv install --dev
    # or, without pipenv
    $ python3 -m venv env
    $ source env/bin/activate
    $ pip install -r requirements.txt
  • Step 2: Set up a Neo4j database

    $ docker run --rm -it -p 7475:7474 -p 7688:7687 neo4j

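    The command maps host ports 7475 and 7688 to the container's default ports 7474 and 7687, so this instance can be used for experiments and tests without clashing with a Neo4j that already uses the defaults.

    Once the container is running, open http://localhost:7475/browser/, change the connection port to 7688, log in with the password neo4j, and then change the password to password.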

  • Step 3: Run the tests

    $ pipenv shell
    $ pytest

    This step will add 8 nodes and relationships to your Neo4j database.
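Once the container from Step 2 is running and the password has been changed, the connection can be checked from Python. The sketch below assumes a py2neo version that accepts a Bolt URI plus an `auth` tuple; `bolt_uri` and `connect` are hypothetical helper names, and the user name neo4j is the Neo4j default.

```python
def bolt_uri(host: str = "localhost", port: int = 7688) -> str:
    """Build the Bolt URI for the host port mapped in Step 2."""
    return f"bolt://{host}:{port}"


def connect(password: str = "password"):
    """Open a py2neo Graph against the test container (requires py2neo)."""
    # Imported lazily so that bolt_uri() stays usable without py2neo installed.
    from py2neo import Graph

    return Graph(bolt_uri(), auth=("neo4j", password))
```

Running a trivial query such as `connect().run("RETURN 1 AS ok").data()` should succeed only when the server is reachable and the credentials are correct.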

The documentation sources are under ./docs and can be built with Sphinx: just run make html.

Usage

Module

from py2neo.database import Graph
from nlm import NLMLayer, GraphNode, GraphRelation

mem = NLMLayer(graph=Graph(port=7688), 
               fuzzy_node=False,
               add_inexistence=False,
               update_props=False)

############ Node ############
# recall
node = GraphNode("Person", "AliceThree")
mem(node)
[GraphNode(label='Person', name='AliceThree', props={'age': 22, 'sex': 'male'})]

# add if it does not exist; the call-time `add_inexistence=True` overrides the NLMLayer config.
new = GraphNode("Person", "Bob")
mem(new, add_inexistence=True)
[]

# fuzzy recall
node = GraphNode("Person", "AliceT")
mem(node, fuzzy_node=True)
[GraphNode(label='Person', name='AliceTwo', props={'age': 21, 'occupation': 'teacher'})]

# update property
node = GraphNode("Person", "AliceThree", props={"age": 24})
mem(node, update_props=True)
[GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'})]

# topn
node = GraphNode("Person", "AliceT")
mem(node, fuzzy_node=True, topn=2)
[GraphNode(label='Person', name='AliceTwo', props={'age': 21, 'occupation': 'teacher'}),
 GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'})
]


############ Relation ############

# recall
start = GraphNode("Person", "AliceThree")
end = GraphNode("Person", "AliceOne")
relation = GraphRelation(start, end, "LOVES")
mem(relation)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 22, 'sex': 'male'}),
    end=GraphNode(label='Person', name='AliceOne', props={'occupation': 'teacher', 'age': 22, 'sex': 'female'}), 
    kind='LOVES', 
    props={'from': 2011, 'roles': 'husband'})
]

# add inexistence
start = GraphNode("Person", "AliceThree")
end = GraphNode("Person", "Bob")
relation = GraphRelation(start, end, "KNOWS")
mem(relation, add_inexistence=True)
[]

# fuzzy recall
start = GraphNode("Person", "AliceTh")
end = GraphNode("Person", "AliceO")
relation = GraphRelation(start, end, "LOVES")
mem(relation, fuzzy_node=True)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}), 
    end=GraphNode(label='Person', name='AliceOne', props={'occupation': 'teacher', 'age': 22, 'sex': 'female'}), 
    kind='LOVES', 
    props={'from': 2011, 'roles': 'husband'})
]

# two nodes, topn
start = GraphNode("Person", "AliceThree")
end = GraphNode("Person", "AliceOne")
relation = GraphRelation(start, end)
mem(relation, topn=3)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}), 
    end=GraphNode(label='Person', name='AliceOne', props={'occupation': 'teacher', 'age': 22, 'sex': 'female'}), 
    kind='WORK_WITH', 
    props={'from': 2009, 'roles': 'boss'}),
 GraphRelation(
     start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}), 
     end=GraphNode(label='Person', name='AliceOne', props={'occupation': 'teacher', 'age': 22, 'sex': 'female'}), 
     kind='LOVES', 
     props={'from': 2011, 'roles': 'husband'})
]

# update property (relationship)
start = GraphNode("Person", "AliceThree")
end = GraphNode("Person", "Bob")
relation = GraphRelation(start, end, "KNOWS", {"roles": "classmate"})
mem(relation, update_props=True)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}), 
    end=GraphNode(label='Person', name='Bob', props={}), 
    kind='KNOWS', 
    props={})
]
mem(relation)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}), 
    end=GraphNode(label='Person', name='Bob', props={}), 
    kind='KNOWS', 
    props={'roles': 'classmate'})
]

# update property (node + relationship)
start = GraphNode("Person", "AliceThree")
end = GraphNode("Person", "Bob", {"sex": "male"})
relation = GraphRelation(start, end, "KNOWS", {"roles": "friend"})
mem(relation, update_props=True)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}), 
    end=GraphNode(label='Person', name='Bob', props={'sex': 'male'}), 
    kind='KNOWS', 
    props={'roles': 'friend'})
]

start = GraphNode("Person", "AliceThree")
end = GraphNode("Person", "Bob", {"sex": "male"})
relation = GraphRelation(start, end, "STUDY_WITH", {"roles": "classmate"})
mem(relation, update_props=True)
mem(relation)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}), 
    end=GraphNode(label='Person', name='Bob', props={'sex': 'male'}), 
    kind='STUDY_WITH', 
    props={'roles': 'classmate'})
]

mem(GraphRelation(start, end), topn=2)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}), 
    end=GraphNode(label='Person', name='Bob', props={'sex': 'male'}), 
    kind='STUDY_WITH', 
    props={'roles': 'classmate'}),
 GraphRelation(
     start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}), 
     end=GraphNode(label='Person', name='Bob', props={'sex': 'male'}), 
     kind='KNOWS', 
     props={'roles': 'friend'})
]


############ RawString and NLU Output ############
# nodes or relationships will be extracted first, then handled as above.
# coming soon.


############ Graph ############
mem.labels
frozenset({'Person'})

mem.relationship_types
frozenset({'KNOWS', 'LIKES', 'LOVES', 'STUDY_WITH', 'WORK_WITH'})

mem.nodes_num
9

mem.relationships_num
10

mem.nodes
# all nodes generator

mem.relationships
# all relationships generator

mem.query("MATCH (a:Person) RETURN a.age, a.name LIMIT 5")
[{'a.age': 21, 'a.name': 'AliceTwo'},
 {'a.age': 23, 'a.name': 'AliceFour'},
 {'a.age': 22, 'a.name': 'AliceOne'},
 {'a.age': 24, 'a.name': 'AliceFive'},
 {'a.age': None, 'a.name': 'Bob'}
]

Since mem actually inherits from py2neo.Graph, every function of py2neo.Graph can be called through mem. We just make it more convenient to use, with a focus on storage and query.

In addition, when fuzzy_node is True, properties will not be updated, because the recalled node might be a fuzzy match rather than the node that the submitted properties belong to.
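The rule that fuzzy recall never updates properties can be illustrated with a toy function. Everything here is an assumption made for the example: `recall` is a hypothetical name, the store is a plain dict keyed by node name, and "fuzzy" is reduced to a prefix match, which may not be how the real matcher works.

```python
from typing import Dict, List


def recall(store: Dict[str, dict], name: str, props: dict,
           fuzzy_node: bool = False, update_props: bool = False,
           topn: int = 1) -> List[str]:
    """Return matching names; merge props only on an exact match."""
    if fuzzy_node:
        # Assumption: "fuzzy" means prefix match here; the real matcher may differ.
        hits = [candidate for candidate in store if candidate.startswith(name)]
        # Never write props onto a guessed match, even if update_props is set.
        return hits[:topn]
    if name not in store:
        return []
    if update_props:
        store[name].update(props)
    return [name]
```

With this design, a partial name such as "AliceT" can return several candidates, and none of them is modified; only an exact name with `update_props=True` changes stored properties.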

RPC Service

For the gRPC service, these parameters must be set when you start the server.

$ python server.py [OPTIONS]

Options:
	-fn fuzzy_node
	-ai add_inexistence
	-up update_props
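For readers unfamiliar with this option style, the three options could be parsed as shown below. This is a sketch, not the repository's server.py: it assumes each option is a boolean switch that defaults to off, which the option list above does not confirm, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three documented options (sketch)."""
    parser = argparse.ArgumentParser(description="NLM server options (sketch)")
    # Assumption: each option is an on/off switch that defaults to off.
    parser.add_argument("-fn", dest="fuzzy_node", action="store_true")
    parser.add_argument("-ai", dest="add_inexistence", action="store_true")
    parser.add_argument("-up", dest="update_props", action="store_true")
    return parser
```

Under these assumptions, `build_parser().parse_args(["-fn", "-up"])` yields a namespace with `fuzzy_node` and `update_props` enabled and `add_inexistence` left off.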

You can use any programming language on the client side; for more detail, please read the gRPC documentation.

There are 4 interfaces in total:

  • NodeRecall
  • RelationRecall
  • StrRecall
  • NLURecall

The last two are still in development. A Python client example (client.py) is included in the repo.

Why

The original intention was to build a memory component for a chatbot. We want the chatbot to automatically memorize the nodes and relationships discovered in dialogue, so the input is defined as the output of the NLU (understanding) layer. We also want to use that information when the chatbot responds, so the output is defined as the input of the NLG (generation) layer or NLI (inference) layer. That's it.

Batch

We have also written an example (under ./batch_example) that adds many nodes and relationships at once. The data comes from QASystemOnMedicalKG; feel free to modify the code to fit your needs.

Changelog

  • 191201 create