This library is being designed to learn explicit operations over unstructured text.
The repository is under heavy development. Here be dragons.
The project does not yet provide a stand-alone service. To use it, clone the repository and install it via pip so that
you can import it into your own code:
git clone http://github.com/semanchester/estrella
cd estrella
pip install -r requirements.txt
pip install .
Make sure the Graphene service is set up and running. Create a file "sample.txt" with this content:
The chicken crossed the road , because it could .
Then run this code:
#!/usr/bin/env python
from estrella.main import Estrella
from estrella.model.oie import ContextLabel
from estrella.serialize.readable.oie import simple_print

# Read sample.txt and run the extended pipeline on it.
main = Estrella()
main.run_pipeline("extended_pipeline", location="sample.txt")

# Follow the "facts" link from the documents and print the result.
facts = main.docs.hop(link_name="facts")
print(simple_print(facts))

# Why did the chicken cross the road?
answer = facts.filter(
    subject="The chicken", predicate="cross", object="the road"
).hop(link_name="links", constraint=lambda link: link.label == ContextLabel.Cause)
print(answer.serialize(serialize_with=simple_print))

# Turn each fact into its numeric form.
vectors = [f.numerify() for f in facts]
For documentation, refer to the source code.
In general, Pipelines are used to read text and enrich it with external information (such as word embeddings or information extraction). The default_pipeline, for example, defined in the default config config/default.conf, will read a plain text file and convert it to our internal representation. The extended_pipeline will do the same and, in addition, enrich the words with word2vec embeddings and extract information from the text.
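
To make the idea of a pipeline more concrete, here is a minimal, self-contained sketch of "read, then enrich in stages". It is not Estrella's implementation: the Document class, the stage functions and run_stages are hypothetical names invented for this illustration, and the token-length step is only a toy stand-in for a real enrichment such as word embeddings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical stand-in for an internal text representation."""
    text: str
    tokens: List[str] = field(default_factory=list)
    annotations: Dict[str, object] = field(default_factory=dict)


# A stage takes a document and returns the (enriched) document.
Stage = Callable[[Document], Document]


def tokenize(doc: Document) -> Document:
    # The sample text is already whitespace-tokenized, so split() is enough.
    doc.tokens = doc.text.split()
    return doc


def add_token_lengths(doc: Document) -> Document:
    # Toy enrichment: one number per token, standing in for an embedding.
    doc.annotations["lengths"] = [len(token) for token in doc.tokens]
    return doc


def run_stages(stages: List[Stage], text: str) -> Document:
    # Apply every stage in order to the same document.
    doc = Document(text)
    for stage in stages:
        doc = stage(doc)
    return doc


# A "default" variant only reads; an "extended" variant reads and enriches.
default_stages: List[Stage] = [tokenize]
extended_stages: List[Stage] = default_stages + [add_token_lengths]

doc = run_stages(extended_stages, "The chicken crossed the road , because it could .")
print(doc.tokens)
print(doc.annotations["lengths"])
```

In this sketch the extended variant is built by appending stages to the default one, which mirrors how extended_pipeline is described as doing everything default_pipeline does plus more.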
Views implement operations over collections of Concepts, such as Facts, and can be serialized to be interpreted by humans or further processed with other tools.
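
The following toy sketch illustrates what "operations over collections" can look like: filtering items by attribute, hopping along named links, and serializing the result. It is an assumption-laden illustration, not Estrella's code: ToyFact, ToyLink and ToyView are hypothetical classes, and the string label "CAUSE" merely stands in for a label such as ContextLabel.Cause.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class ToyFact:
    """Hypothetical fact: a subject, a predicate, an object and outgoing links."""
    subject: str
    predicate: str
    object: str
    links: List["ToyLink"] = field(default_factory=list)


@dataclass
class ToyLink:
    """Hypothetical labelled link pointing at another fact."""
    label: str
    target: ToyFact


class ToyView:
    """Hypothetical view: a thin wrapper around a list of items."""

    def __init__(self, items: Iterable[object]):
        self.items = list(items)

    def __len__(self) -> int:
        return len(self.items)

    def filter(self, **attrs: object) -> "ToyView":
        # Keep only items whose attributes equal all given values.
        return ToyView(
            item for item in self.items
            if all(getattr(item, name) == value for name, value in attrs.items())
        )

    def hop(self, link_name: str,
            constraint: Optional[Callable[[ToyLink], bool]] = None) -> "ToyView":
        # Follow the named links of every item, optionally filtering the links.
        targets = []
        for item in self.items:
            for link in getattr(item, link_name):
                if constraint is None or constraint(link):
                    targets.append(link.target)
        return ToyView(targets)

    def serialize(self, serialize_with: Callable[[object], str]) -> str:
        # One line per item, rendered by the supplied function.
        return "\n".join(serialize_with(item) for item in self.items)


def render(fact: ToyFact) -> str:
    return f"{fact.subject} {fact.predicate} {fact.object}".strip()


reason = ToyFact("it", "could", "")
crossing = ToyFact("The chicken", "crossed", "the road",
                   links=[ToyLink("CAUSE", reason)])
facts = ToyView([crossing, reason])

answer = facts.filter(subject="The chicken").hop(
    "links", constraint=lambda link: link.label == "CAUSE")
print(answer.serialize(render))
```

Because every operation returns a new view, calls can be chained in the same style as the filter and hop calls in the example above.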