OpenCog AtomSpace

The OpenCog AtomSpace is a knowledge representation (KR) database and the associated query/reasoning engine to fetch and manipulate that data, and perform reasoning on it. Data is represented in the form of graphs, and more generally, as hypergraphs; thus the AtomSpace is a kind of graph database, the query engine is a general graph re-writing system, and the rule-engine is a generalized rule-driven inferencing system. The vertices and edges of a graph, known as "Atoms", are used to represent not only "data", but also "procedures"; thus, many graphs are executable programs as well as data structures.
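To make the hypergraph idea concrete, here is a minimal, self-contained Python sketch. It is not the AtomSpace API: the `Node`, `Link` and `ToyAtomSpace` classes are hypothetical names invented for this illustration, and only the type names (`ConceptNode`, `PredicateNode`, `InheritanceLink`, `EvaluationLink`) are borrowed from Atomese. It shows two properties described above: a link may point at other links (which is what makes the structure a hypergraph rather than a plain graph), and each atom is stored only once.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Node:
    """Toy vertex: a type plus a name (hypothetical, not the real API)."""
    type: str
    name: str


@dataclass(frozen=True)
class Link:
    """Toy edge: a type plus an ordered tuple of member atoms.

    Members may themselves be Links, so links can connect links.
    """
    type: str
    outgoing: Tuple["Atom", ...]


Atom = Union[Node, Link]


class ToyAtomSpace:
    """Toy container that keeps exactly one copy of each distinct atom."""

    def __init__(self):
        self._atoms = {}

    def add(self, atom):
        # Add the members of a link first, so every atom the link
        # refers to is also present in the container.
        if isinstance(atom, Link):
            atom = Link(atom.type, tuple(self.add(a) for a in atom.outgoing))
        # setdefault returns the already-stored atom if an equal one exists.
        return self._atoms.setdefault(atom, atom)

    def incoming(self, atom):
        """Return all stored links that directly contain the given atom."""
        return [a for a in self._atoms
                if isinstance(a, Link) and atom in a.outgoing]

    def __len__(self):
        return len(self._atoms)


space = ToyAtomSpace()
cat = space.add(Node("ConceptNode", "cat"))
animal = space.add(Node("ConceptNode", "animal"))
inheritance = space.add(Link("InheritanceLink", (cat, animal)))

# A link whose second member is itself a link.
belief = space.add(
    Link("EvaluationLink", (Node("PredicateNode", "believed"), inheritance)))

print(len(space), "atoms stored")
print("links containing 'cat':", space.incoming(cat))
```

Adding `Node("ConceptNode", "cat")` a second time returns the atom that is already stored, which is the toy counterpart of atoms being unique within the database.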

There are pre-defined Atoms for many basic knowledge-representation and computer-science concepts. These include Atoms for relations, such as similarity, inheritance and subsets; for logic, such as Boolean and, or, for-all, there-exists; for Bayesian and other probabilistic relations; for intuitionist logic, such as absence and choice; for parallel (threaded) synchronous and asynchronous execution; for expressions with variables and for lambda expressions and for beta-reduction and mapping; for uniqueness constraints, state and a messaging "blackboard"; for searching and satisfiability and graph re-writing; for the specification of types and type signatures, including type polymorphism and type construction (dependent types and type variables TBD).
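The combination of variables, searching and graph re-writing can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not the AtomSpace pattern matcher: atoms are modelled as nested tuples, and the helper functions `is_var`, `match`, `substitute` and `rewrite` are hypothetical names made up for this example. Only the type names are taken from Atomese. The sketch handles a single pattern clause matched against top-level atoms; the real query engine is far more general.

```python
def is_var(term):
    """A variable is modelled as the tuple ("VariableNode", name)."""
    return isinstance(term, tuple) and len(term) == 2 and term[0] == "VariableNode"


def match(pattern, atom, bindings):
    """Return bindings extended so that pattern equals atom, or None."""
    if is_var(pattern):
        bound = bindings.get(pattern[1])
        if bound is None:
            # First time this variable is seen: bind it to the atom.
            return {**bindings, pattern[1]: atom}
        # Variable already bound: the new occurrence must agree.
        return bindings if bound == atom else None
    if isinstance(pattern, tuple) and isinstance(atom, tuple):
        if len(pattern) != len(atom):
            return None
        for sub_pattern, sub_atom in zip(pattern, atom):
            bindings = match(sub_pattern, sub_atom, bindings)
            if bindings is None:
                return None
        return bindings
    # Plain strings (type names, node names) must be identical.
    return bindings if pattern == atom else None


def substitute(template, bindings):
    """Replace every variable in template by the atom it is bound to."""
    if is_var(template):
        return bindings[template[1]]
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    return template


def rewrite(atoms, pattern, template):
    """For every atom matching pattern, build the rewritten template."""
    results = []
    for atom in atoms:
        bindings = match(pattern, atom, {})
        if bindings is not None:
            results.append(substitute(template, bindings))
    return results


X = ("VariableNode", "$x")
atoms = [
    ("InheritanceLink", ("ConceptNode", "cat"), ("ConceptNode", "animal")),
    ("InheritanceLink", ("ConceptNode", "dog"), ("ConceptNode", "animal")),
    ("InheritanceLink", ("ConceptNode", "rose"), ("ConceptNode", "plant")),
]
pattern = ("InheritanceLink", X, ("ConceptNode", "animal"))
template = ("EvaluationLink", ("PredicateNode", "is-alive"), X)

rewritten = rewrite(atoms, pattern, template)
for atom in rewritten:
    print(atom)
```

The query finds everything that inherits from "animal" and produces a new graph for each match, which is the basic step that rule-driven inference repeats.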

Because of these many and varied Atom types, constructing graphs to represent knowledge looks somewhat like "programming"; the programming language is informally referred to as "Atomese". It vaguely resembles a strange mashup of SQL (due to queryability), Prolog/Datalog (due to the logic and reasoning components), Lisp/Scheme (due to lambda expressions), Haskell/Caml (due to the type system) and rule engines (due to the graph rewriting and forward/backward chaining inference systems). This "programming language" is NOT designed for use by human programmers (it is too verbose and awkward for that); it is designed for automation and machine learning. That is, like any knowledge representation system, the data and procedures encoded in "Atomese" are meant to be accessed by other automated subsystems that manipulate, query and perform inference over the data/programs. Also, viewed as a programming language, it can be very slow, inefficient and not scalable; it was not designed with efficiency, programming tasks or scalability in mind. Rather, it was designed to allow the generalized manipulation of networks of probabilistic data by means of rules, inferences and reasoning systems. It extends the idea of probabilistic logic networks to a generalized system for automatically manipulating and managing data.
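The claim that a graph can be both data and program can be illustrated with a tiny interpreter. The following is a sketch only: the `execute` function is a hypothetical stand-in written for this example, and atoms are modelled as nested tuples rather than real AtomSpace objects. The type names `NumberNode`, `PlusLink` and `TimesLink` are borrowed from Atomese.

```python
import math


def execute(atom):
    """Evaluate a toy arithmetic graph built from nested tuples."""
    kind = atom[0]
    if kind == "NumberNode":
        # In this toy, a number node stores its value as a string name.
        return float(atom[1])
    # Any other atom is treated as a link: evaluate its members first.
    args = [execute(member) for member in atom[1:]]
    if kind == "PlusLink":
        return sum(args)
    if kind == "TimesLink":
        return math.prod(args)
    raise ValueError(f"cannot execute atom of type {kind}")


# The same tuple is a data structure (it can be stored, searched and
# rewritten) and a program (it can be run).
program = ("PlusLink",
           ("NumberNode", "2"),
           ("TimesLink", ("NumberNode", "3"), ("NumberNode", "4")))

result = execute(program)
print(result)
```

Because the program is just a tree of tuples, another automated subsystem could rewrite it before it is run, which is the use case the paragraph above describes.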

The use of the AtomSpace, and the operation and utility of Atomese, remain a topic of ongoing research and change, as various dependent subsystems are brought online. These include machine learning, natural language processing, motion control and animation, planning and constraint solving, pattern mining and data mining, question answering and common-sense systems, and emotional and behavioral psychological systems. Each of these imposes sharply conflicting requirements on the system architecture; the AtomSpace and "Atomese" are the current best-effort KR system for satisfying all these various needs in an integrated way. They are likely to change as the current shortcomings, design flaws, and performance and scalability issues are corrected.

The main project site is at http://opencog.org

The examples directory contains demonstrations of the various components of the AtomSpace, including the python and scheme bindings, the pattern matcher, the rule engine, and many of the atom types and their use for solving a variety of tasks.

Prerequisites

To build the OpenCog AtomSpace, the packages listed below are required. With a few exceptions, most Linux distributions will provide these packages. Users of Ubuntu 14.04 "Trusty Tahr" may use the dependency installer at /scripts/octool. Users of any version of Linux may use the Dockerfile to quickly build a container in which OpenCog will be built and run.

boost

C++ utilities package.
http://www.boost.org/ | apt-get install libboost-dev

cmake

Build management tool; v2.8 or higher recommended.
http://www.cmake.org/ | apt-get install cmake

cogutil

Common OpenCog C++ utilities.
http://github.com/opencog/cogutils
It uses exactly the same build procedure as this package. Be sure to sudo make install at the end.

guile

Embedded scheme REPL (version 2.0.9 or newer is required).
http://www.gnu.org/software/guile/guile.html | apt-get install guile-2.0-dev
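Before running the build, it can be handy to confirm that the command-line tools among the prerequisites are on the PATH. The sketch below is a convenience written for this document, not part of the AtomSpace build system. The `missing_tools` function and the `BUILD_TOOLS` table are hypothetical names, and the table assumes that the cmake and guile packages install executables called `cmake` and `guile`. Libraries such as boost and cogutil have no executable and cannot be checked this way; CMake itself reports them if they are missing.

```python
import shutil

# Assumed mapping: prerequisite name -> executable expected on the PATH.
BUILD_TOOLS = {
    "cmake": "cmake",
    "guile": "guile",
}


def missing_tools(tools=BUILD_TOOLS):
    """Return the sorted names of tools whose executable is not found."""
    return sorted(name for name, exe in tools.items()
                  if shutil.which(exe) is None)


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All checked build tools were found.")
```

The check only looks for the presence of an executable; it does not verify version requirements such as guile 2.0.9 or newer.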

Optional Prerequisites

The following packages are optional. If they are not installed, some optional parts of the AtomSpace will not be built. During the build, the cmake command reports exactly which parts will not be built.

cxxtest

Test framework
Optional but recommended; required for running unit tests.
http://cxxtest.sourceforge.net/ | https://launchpad.net/~opencog-dev/+archive/ppa

Cython

C bindings for Python.
Strongly recommended, as many examples and important subsystems assume python bindings.
http://cython.org | apt-get install cython

Haskell

Haskell bindings (experimental).
Optional; almost no existing code makes use of haskell.
https://www.haskell.org/

Postgres

Distributed, multi-client networked storage.
Needed for "remembering" things between shutdowns.
http://postgres.org | apt-get install postgresql postgresql-client

unixODBC

Generic SQL Database client access libraries.
Required for the distributed-processing atomspace.
http://www.unixodbc.org/ | apt-get install unixodbc-dev

ZeroMQ (version 3.2.4 or higher)

Asynchronous messaging library.
Optional, almost completely unused, mostly due to poor performance.
http://zeromq.org/intro:get-the-software | apt-get install libzmq3-dev

Google Protocol Buffers

Google's data interchange format (used by ZeroMQ).
Optional, needed only for ZMQ, above.
https://developers.google.com/protocol-buffers | apt-get install libprotobuf-dev

Building AtomSpace

Perform the following steps at the shell prompt:

    cd to project root dir
    mkdir build
    cd build
    cmake ..
    make

Libraries will be built into subdirectories within build, mirroring the structure of the source directory root.

Unit tests

To build and run the unit tests, enter the following from the ./build directory (after building the AtomSpace as above):

    make test

Install

After building, you MUST install the atomspace.

    sudo make install

Using the AtomSpace

The AtomSpace can be used in one of three ways, or a mixture of all three: by using the GNU Guile scheme interface, by using Python, or by running the OpenCog cogserver.

Guile provides the easiest interface for creating atoms, loading them into the AtomSpace, and performing various processing operations on them. For examples, see the /examples/guile and the /examples/pattern-matcher directories.

Python is more familiar than scheme (guile) to most programmers, and it offers another way of interfacing to the atomspace. See the /examples/python directory for how to use python with the AtomSpace.

The OpenCog cogserver provides a network server interface to OpenCog. It is required for running embodiment, some of the reasoning agents, and some of the natural-language processing agents. The cogserver is only available in the main OpenCog project; it is not a part of the AtomSpace.

CMake notes

Some useful CMake web sites/pages:

The main CMakeLists.txt currently sets -DNDEBUG. This disables Boost matrix/vector debugging code and safety checks, with the benefit of making it much faster. Boost sparse matrices and (dense) vectors are currently used by ECAN's ImportanceDiffusionAgent. If you use Boost ublas in other code, it may be a good idea to at least temporarily unset NDEBUG. Also, if the Boost assert.h is used, it will be necessary to unset NDEBUG. Boost ublas is intended to respond to a specific BOOST_UBLAS_NDEBUG; however, this is not available as of the current Ubuntu standard version (1.34).
