Basic Prolog interpreter implemented in Haskell

Course project in functional programming

Run

In the same directory as the .cabal file type in the console:

cabal build

Example how to load a prolog file through the program:

cabal run prolog-interpreter-hs -- app/Tests/test.pl

Note: if an error occurs during parsing, the program will alert that a parsing error has occured, however the program won't exit and the database will be filled with the already parsed part of the prolog file.

Entering queries:

After successfully compiling the prolog file, one may enter queries in the following format: ?- <atom>. where <atom> is defined under the tokens paragraph.

If the query is not parsed correctly, the program should return the empty list.

Issue: if the query ends with a space character the program may go in a state of infinite recursion. It should be fixed in the near future.

If no solutions are found or the query is tautology then the program will return the empty list. Note: yes and no answers should be added in the future.

Tokens

A prolog program has the following syntax:

<identfier> := <lowercase_letter>[<letter_or_number>]
<variable> := <uppercase_letter>[<letter_or_numer>]
<constant> := <identifier>
<term> := <constant> | <variable> | <identifier>(<term>, [<term>])
<atom> := <identifier>(<term>, [<term>])
<fact> := <atom>.
<rule> := <atom> :- <atom>[, <atom>].

Note: underscores are not supported.

Valid operators are rule operator :- , comma operator ,, dot operator ., brackets (, ). Valid statements are rules, facts and queries.

Parsing

Parsers are applied in the following order: Firstly, the file is divided into an array of symbols that go through parserA. parserA reads strings and operators. parserB removes unnecessary spaces. parserC reads terms with nesting. parserD reads atoms without nesting, which means nested atoms are treated as terms. This is not an issue when converting to intermediate representation. parserE reads facts, rules and queries.

The ast function generates abstract syntax tree from the tokenized statements. The compileFile function parses a text file and returns a Program object of AST. If the parsing is unsuccessful, then the function returns Nothing.

Unification

The intermediate representation data types start with either P or PL. The main difference between the AST and IR is that terms can be either an atomic formula (a function with arguments, or a constant) or a variable. Constants are represented as functions with no arguments.

A unifier (sometimes denoted in the code as mgu as most general unifier) is a set of equation - PLEquation (or PLSubstitution). Unification is the process of trying to find a substitution that makes two terms equal. The unification algorithm pseudocode is as follows:

Initialise the MGU to an empty unifier
Push T1 = T2 to the stack
While the stack is not empty
	Pop X = Y from the stack
	       case: X is a variable AND X does not occur in Y
        	Create unifier U such that X = Y
                Apply U to MGU
                Add U to MGU
                Apply U to stack
        case: Y is a variable AND Y does not occur in X
        	Create unifier U such that Y = X
                Apply U to MGU
                Add U to MGU
                Apply U to stack
        case: X and Y are identical constants or variables
        	do nothing
        case: X is of form p(a0,..,an) and Y is of form p(b0,..,bn)
		        For m = 0 to n
                	push am = bm to the stack
        default case:
        	Failure
Return the MGU

Note: the algorithm is borrowed from the book The Art of Prolog.

Resolution

Database represents AST converted into IR.

The resolution algorithm is as follows:

Input: Goal and program P
Output: Instance of G that is logical consequence of P

tree = Tree Empty [Tree $ Node query [] $ []]

resolve tree =
    solutions []
    newTree = tree 
    foreach (child in tree.children)
        nodes = genn child
        foreach (node in nodes)
            if node.unifier is Nothing then
                remove node from nodes
            else if node.unifier is [] then
                add node.mgu to solutions
            else continue
        newTree = add nodes to newTree
    return solutions ++ resolve newTree

Algorithm for generating children nodes (genn):

foreach pclause{term = body} in DB:
     let u = plUnify query term in
     case u of
         Nothing -> continue
         Just _ -> return Node with u and newQuery = apply u to query
goal reduction

The function mergeUnifiers has the job to solve the systems of equations generated by the resolution algorithm. The idea is to skip renaming and treat variables as local vars to a unifier. Each unifier is a system of equation on it's own. In general, unifiers are traversed from the last to the first (left to right) and substitutions are applied whenever necessary. The method returns the substituted first unifier where each equation should not have a variable present in the right side.

Possible alternative algorithm for mergeUnifiers (not tested):

# Merges unifiers into the last unifier of the stack
# Returns solution
unifier_list = (θx : θs)    # foreach θx in θs
if x == 1 then return θx else
foreach eq in θx:
    foreach θ in θs:
        foreach eq' in θ:
            if eq'.right is JustPvar and eq.var == eq'.right:
                eq'.right = eq.right
                break
        break

The function limit filters the equations that are needed to form a complete solution (however the impementation is not optimal)

Testing

Note: to-do: add HUnit.

arctfx/prolog-interpreter-hs