opencog/atomspace

UnionLink vs. UnionTruthValue - how to compute TV?

linas opened this issue · 3 comments

linas commented

Pull request #2814 exposes an issue in the current Atomese design. How might it be resolved? What are the possible future directions?

Pull request #2814 wants to change the static type-checker to allow non-evaluatable atoms to pass type-checking. This is kind of crazy, since the whole point of type-checking is to (cough) verify that expressions are legal. But what determines a "legal expression"?

Some background, first. In the early days of "Atomese", at Hanson Robotics, there was a need to (cog-evaluate! (SequentialAnd (recognize-person) (Or (smile) (blink) (say-hello)))) which means that recognize-person had to be an atom that returned a (crisp) truth-value - if it returned false, the evaluation would stop. If one of the atoms was not actually evaluatable (e.g. was a ConceptNode) it would crash. That is because there is no C++ ConceptNode::evaluate() method. So the checker was invented to catch these kinds of errors early, while the system was being booted, instead of late, when the robot was onstage, performing.

So now you see what the problem is -- if you call (cog-evaluate! (And (Concept "foo"))) or (cog-evaluate! (Not (Set))) an exception will be thrown. Adding all of these types is just sabotaging the type checker so that it allows invalid expressions. What's the point of type-checking if most of the rules are not enforced?
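To make the failure mode concrete, here is a minimal toy model of what such a check does. It is only a sketch, not the AtomSpace code: `ToyAtom`, `make_atom`, `is_connective`, `is_evaluatable` and `check_evaluatable` are hypothetical names, and the list of types treated as evaluatable is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an Atom: a type name plus an outgoing set.
// NOT the real AtomSpace classes; every name here is hypothetical.
struct ToyAtom {
    std::string type;
    std::vector<std::shared_ptr<ToyAtom>> outgoing;
};
using ToyHandle = std::shared_ptr<ToyAtom>;

ToyHandle make_atom(std::string type, std::vector<ToyHandle> out = {}) {
    auto a = std::make_shared<ToyAtom>();
    a->type = std::move(type);
    a->outgoing = std::move(out);
    return a;
}

// Connectives whose arguments must all return a crisp true/false.
bool is_connective(const ToyAtom& a) {
    return a.type == "AndLink" || a.type == "OrLink" ||
           a.type == "NotLink" || a.type == "SequentialAndLink";
}

// Assumption for this sketch: connectives and EvaluationLink are the
// only evaluatable types. ConceptNode and SetLink are not.
bool is_evaluatable(const ToyAtom& a) {
    return is_connective(a) || a.type == "EvaluationLink";
}

// True if every argument of every connective is itself evaluatable.
// Atoms that are not connectives are not inspected any further.
bool check_evaluatable(const ToyHandle& h) {
    assert(h != nullptr);
    if (!is_connective(*h)) return true;
    return std::all_of(h->outgoing.begin(), h->outgoing.end(),
        [](const ToyHandle& arg) {
            return is_evaluatable(*arg) && check_evaluatable(arg);
        });
}
```

In this toy, the equivalents of (And (Concept "foo")) and (Not (Set)) are rejected up front, while a SequentialAnd over evaluatable atoms passes. Adding ConceptNode and SetLink to `is_evaluatable` would make the check pass without making the expressions evaluatable.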

There are 3-4 different solutions:

  1. Get rid of the static type-checker, or at least, stop checking to see if an atom is evaluatable. Cons: no more safety net. Anything goes. Bugs that might have been caught early and would be easy to fix are now caught late, when they are hard to debug. "Hard to debug" is a major killer of software.

  2. Get rid of check_evaluatable and instead create a robot-code-static-typechecker module, and another called pln-static-typechecker. These would be invoked before the robot or PLN subsystem runs. Cons: there is no progressive checking. Suppose the PLN checker checks 100K atoms for correct syntax. Now suppose the user adds one atom. Next time, does PLN have to check 100K+1 atoms? That's a lot of redundant, repeated work.

  3. Invent something that would allow PLN to overload the C++ ConceptNode::evaluate() method with custom code that does what PLN needs it to do. (I've got several ideas on how to do this, described below. None of them are simple.)

  4. Remove large parts of Atomese from the atomspace, and move them to "robot-atomese" and "pln-atomese" so that what's left in the atomspace is "common Atomese that everyone uses".
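The "100K+1 atoms" concern in option 2 is about the lack of progressive checking. One way a subsystem-specific checker could avoid the repeated work is to remember what already passed. The following is a rough sketch only: `IncrementalChecker` is a hypothetical class, atoms are stood in for by integer ids, and it assumes that an atom which passed once stays valid (nothing is mutated or removed between runs).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical sketch: caches the ids of atoms that already passed,
// so that a later run only pays for atoms it has not seen before.
class IncrementalChecker {
public:
    using Check = std::function<bool(int)>;

    explicit IncrementalChecker(Check check) : _check(std::move(check)) {
        assert(_check && "a check function is required");
    }

    // Checks every id not verified on an earlier run.
    // Returns false at the first atom that fails.
    bool check_all(const std::vector<int>& atom_ids) {
        for (int id : atom_ids) {
            if (_passed.count(id)) continue;  // already verified
            ++_work_done;
            if (!_check(id)) return false;
            _passed.insert(id);
        }
        return true;
    }

    // Number of times the underlying check was actually invoked.
    std::size_t work_done() const { return _work_done; }

private:
    Check _check;
    std::unordered_set<int> _passed;
    std::size_t _work_done = 0;
};
```

With this, checking 100K atoms and then 100K+1 atoms invokes the underlying check 100K+1 times in total, not 200K+1. The cost is extra bookkeeping per subsystem, which is part of why option 2 is not free.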

linas commented

This is a tangled issue, so the following comments are wide-ranging.

Nil suggests introducing UnionLink, IntersectionLink and ComplementLink, which would know how to work with ConceptNode and SetLink and return meaningful values. Then, AndLink, OrLink, NotLink would be restricted to evaluatable atoms that return crisp true/false values.

This is a good idea for several reasons.

  • It associates a single, unambiguous TV formula to each atom type. It solves the problem where one user calls (And ...stuff) and is expecting a crisp true/false value, while another user is expecting a probabilistic result.
  • It allows the TV formula to be coded in C++, where it's fast, as opposed to (GroundedSchema "scm:compute-intersection") which is slow, because it has to bounce between C++ and scheme.
  • It allows the static type checker to continue working as designed.

linas commented

Hmm. As I attempt to think about this, I realize that perhaps UnionLink etc. is the simplest solution that makes all my vague and hard-to-describe confusions melt away and/or be deferred to some distant future.

@ngeiswei we can add UnionLink either to opencog/pln or to opencog/atomspace. It might work well if it was in opencog/atomspace, since it seems very generic. These would be C++-backed atoms, with a C++ evaluate() method that encodes whatever TV formulas you want. (i.e. there must not be dependencies on URE)

The current type-checker could be left as-is until user-land code is ported over.

There are two ways to code the formulas for the TV. One way is the brute-force, super-simple implementation:

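TruthValuePtr UnionLink::evaluate() {
   // Pseudocode: `sum` stands for an accumulator over the TVs
   // of the outgoing set; its type and the formula are left open.
   for (const Handle& h : _outgoingSet) {
      sum += h->get_tv();  // or whatever
   }
   return sum;
}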
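For a version of the brute-force idea that actually compiles, here is a self-contained toy. It is only a sketch under loud assumptions: `ToyTV`, `ToyAtom` and `ToyUnionLink` are hypothetical stand-ins, not AtomSpace classes, and the formula (fuzzy max for strength, min for confidence) is a placeholder picked for illustration, not the formula PLN would want.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical simple truth value: just two numbers.
struct ToyTV {
    double strength;
    double confidence;
};

// Hypothetical atom: a stored TV plus an outgoing set.
struct ToyAtom {
    ToyTV tv{0.0, 0.0};
    std::vector<std::shared_ptr<ToyAtom>> outgoing;
    virtual ~ToyAtom() = default;
    // By default, evaluating an atom just returns its stored TV.
    virtual ToyTV evaluate() const { return tv; }
};

// One atom type, one hard-coded formula.
struct ToyUnionLink : ToyAtom {
    ToyTV evaluate() const override {
        // Placeholder formula (an assumption, not PLN):
        // strength = max over members, confidence = min over members.
        ToyTV acc{0.0, 1.0};  // identity elements for max and min
        for (const auto& h : outgoing) {
            assert(h != nullptr);
            const ToyTV t = h->evaluate();
            acc.strength = std::max(acc.strength, t.strength);
            acc.confidence = std::min(acc.confidence, t.confidence);
        }
        return acc;
    }
};
```

The point of the design is that the formula lives in the atom type: anyone who sees a union link knows which formula applies, and the loop runs entirely in C++.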

The other is more subtle but maybe much more flexible: create a UnionTruthValue class: it does the same for-loop as above. The difference is that you can attach the UnionTruthValue anywhere, and it will "just work right" and compute the right formula. In particular, it means that we do NOT need a UnionLink, and that OrLink is enough:

TruthValuePtr OrLink::evaluate() {
    if (tv == UnionTV) return tv->evaluate();  // tv is some formula
    else return tv;  // tv is just some numbers
} 

There already is a bunch of infrastructure to allow this second form to work correctly. See the code here: https://github.com/opencog/atomspace/tree/master/opencog/atoms/flow and the matching wiki pages and example programs (there are examples for all of these.)
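As a rough illustration of the second form, here is a self-contained toy in which the formula lives in the truth value instead of in the atom type. All names (`ToyTruthValue`, `ToySimpleTV`, `ToyUnionTV`, `ToyAtom`) are hypothetical and do not correspond to the classes in the flow directory linked above; the max formula is again a placeholder assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical TV interface: a TV is anything that can report a strength.
struct ToyTruthValue {
    virtual ~ToyTruthValue() = default;
    virtual double strength() const = 0;
};
using ToyTVPtr = std::shared_ptr<const ToyTruthValue>;

// "tv is just some numbers"
struct ToySimpleTV : ToyTruthValue {
    explicit ToySimpleTV(double s) : _s(s) {}
    double strength() const override { return _s; }
    double _s;
};

// Hypothetical atom: evaluating it just asks whatever TV is attached.
struct ToyAtom {
    ToyTVPtr tv;
    double evaluate() const {
        assert(tv != nullptr);
        return tv->strength();
    }
};
using ToyHandle = std::shared_ptr<ToyAtom>;

// "tv is some formula": recomputed from the source atoms on every read.
// Placeholder formula (an assumption): max of the sources' strengths.
// Caveat: attaching this to one of its own sources would recurse forever.
struct ToyUnionTV : ToyTruthValue {
    explicit ToyUnionTV(std::vector<ToyHandle> sources)
        : _sources(std::move(sources)) {}
    double strength() const override {
        double s = 0.0;
        for (const auto& h : _sources) s = std::max(s, h->evaluate());
        return s;
    }
    std::vector<ToyHandle> _sources;
};
```

In this toy the `if (tv == UnionTV)` branch disappears into virtual dispatch: the atom never needs to know which kind of TV it holds. The same `ToyAtom` returns a fixed number with a `ToySimpleTV` attached and a live, recomputed value with a `ToyUnionTV` attached, so no extra atom type is needed.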

The nice thing about the second proposal is that it needs fewer Atom types, and it allows Atoms to have different meanings in different contexts. It also might avoid disturbing userland code. I'm open to doing it either way: either UnionLink, which is simple and dumb, or UnionTruthValue, which is more cutting-edge and experimental, but might be more powerful/flexible in the long run.

I would be fine with a simple and dumb UnionLink, because most calculations on my end are done with PLN anyway. Also, the way it would be evaluated depends on the type of processing one is after. One may want to:

  1. Calculate its TV from direct evidence
  2. Populate its members
  3. Infer its TV from indirect evidence

So it is unclear which of these 3 should be the default, and I don't need to answer that, since again I just use different PLN rules for different purposes.