Forward Chaining with the Jena Rules Language

A typical "business rule"

Javagate is the component of Real Semantics that moves data between RDF and preexisting Java classes. If you are writing classes intended for use with Real Semantics, you can set up this mapping by putting Java annotations on your code. If you're using other classes, Real Semantics needs a map that connects RDF properties to Java members.

The good news is that Real Semantics can generate this map automatically in most cases, because most Java programs use design patterns such as the Java Beans convention:

By default, we use design patterns to locate properties by looking for methods of the form:
    public <PropertyType> get<PropertyName>();
    public void set<PropertyName>(<PropertyType> a);
If we discover a matching pair of “get<PropertyName>” and “set<PropertyName>” methods that take and return the same type, then we regard these methods as defining a read-write property whose name will be “<propertyName>”. We will use the “get<PropertyName>” method to get the property value and the “set<PropertyName>” method to set the property value. The pair of methods may be located either in the same class or one may be in a base class and the other may be in a derived class.

If we find only one of these methods, then we regard it as defining either a read-only or a write-only property called “<propertyName>”.

— Section 8.3.1 of JavaBeans(TM) Specification 1.01
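To make the quoted pattern concrete, here is a minimal sketch that applies it with plain Java reflection. It is not how Real Semantics implements the mapping (that is done with rules, as described below). The names `Person` and `BeanScan` are hypothetical, and the sketch ignores the `is<PropertyName>` boolean variant, indexed properties, and overloaded setters.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example bean, used only for illustration.
class Person {
    private String name;
    private int age;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }   // getter only: a read-only property
}

// Hypothetical helper that applies the get/set naming pattern quoted above.
final class BeanScan {
    // Returns property name -> "read-write", "read-only" or "write-only".
    static Map<String, String> scan(Class<?> type) {
        Map<String, Class<?>> getters = new TreeMap<>();
        Map<String, Class<?>> setters = new TreeMap<>();
        for (Method m : type.getMethods()) {   // public methods, inherited ones included
            if (Modifier.isStatic(m.getModifiers())) continue;
            if (m.getDeclaringClass() == Object.class) continue;   // skips getClass() etc.
            String n = m.getName();
            if (n.length() < 4 || !Character.isUpperCase(n.charAt(3))) continue;
            // Lowercase the first letter of the part after "get"/"set".
            String prop = Character.toLowerCase(n.charAt(3)) + n.substring(4);
            if (n.startsWith("get") && m.getParameterCount() == 0
                    && m.getReturnType() != void.class) {
                getters.put(prop, m.getReturnType());
            } else if (n.startsWith("set") && m.getParameterCount() == 1
                    && m.getReturnType() == void.class) {
                setters.put(prop, m.getParameterTypes()[0]);
            }
        }
        Map<String, String> result = new TreeMap<>();
        for (String p : getters.keySet()) {
            // A pair counts only when getter and setter agree on the type.
            boolean paired = getters.get(p).equals(setters.get(p));
            result.put(p, paired ? "read-write" : "read-only");
        }
        for (String p : setters.keySet()) {
            result.putIfAbsent(p, "write-only");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(scan(Person.class));   // prints {age=read-only, name=read-write}
    }
}
```

Each branch of the scan corresponds to one sentence of the specification, which is the same correspondence the rules aim for.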

This kind of specification is a good match for rules technology because it describes a number of arbitrary properties. It can be broken down into a set of rules that state what the English says in a rigorous way. It would be asking a lot to turn that English into rules automatically, but you can display the rules side by side with the specification to confirm that the rules are correct.

Anatomy of a Rule

For instance, while building the RDF to Java mapping, Real Semantics scans Java classes for metadata about fields, methods and annotations. This data is inserted into an RDF graph, and a set of rules that look like the following match the patterns described in the specification above. This specific rule finds getter methods and configures property accessors to fetch data from them:

@prefix : <http://rdf.ontology2.com/javagate/>
@prefix unq: <http://rdf.ontology2.com/unqualified/>
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

[
    (?A rdf:type :Method)
    (?A :memberName ?memberName)
    (?A :returnType ?javaType)
    (?A :hasModifier :Public)
    (?A :parameterTypes rdf:nil)
    regex(?memberName, '^get([A-Z].*)$', ?propName)
    lcFirst(?propName,?lcPropName)
    uriConcat(unq:,?lcPropName,?propertyIs)
    makeTemp(?B)
->
    (?B rdf:type :Accessor)
    (?B :propertyIs ?propertyIs )
    (?B :getter ?A)
    (?B :javaType ?javaType)
]

The structure of this rule matches the structure of the specification well, so it is easy to implement variant rules such as the special case of isBooleanVariable, indexed properties, etc. Thus we can get data in and out of Java Beans as well as classes that use other conventions to name entities and fields.

Let's talk about this rule, which has the structure

[ *body* -> *head* ]

and means more or less

    IF *body* THEN *head*

First note that the rule is made of two sorts of things: (i) triples, which look like (?A rdf:type :Method), and (ii) builtins, which look like makeTemp(?B). Triples in the body match patterns in the RDF graph that are already known, while triples in the head get inserted into the graph when the inference rule is triggered. Builtins do many other things, and are similar in nature to the "predicates" used in the Prolog programming language.
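The match-then-insert behavior can be illustrated with a toy sketch in plain Java. This is not Jena and not the Jena API: triples are plain string arrays, the class `ForwardChainSketch` and its methods are hypothetical, and only a cut-down version of the rule (two body triples, two head triples) is applied in a single pass.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of one forward-chaining step; not the Jena API.
final class ForwardChainSketch {
    // A triple is just {subject, predicate, object} in this sketch.
    static String[] t(String s, String p, String o) {
        return new String[] {s, p, o};
    }

    // Returns the object of the first (s, p, ?) triple in the graph, or null.
    static String objectOf(List<String[]> graph, String s, String p) {
        for (String[] x : graph) {
            if (x[0].equals(s) && x[1].equals(p)) return x[2];
        }
        return null;
    }

    // IF   (?A rdf:type :Method) (?A :memberName ?n)
    // THEN (?B rdf:type :Accessor) (?B :getter ?A)
    static List<String[]> fire(List<String[]> graph) {
        List<String[]> inferred = new ArrayList<>();
        int fresh = 0;
        for (String[] x : graph) {
            // Body triple 1: the subject must be typed as a :Method.
            if (!x[1].equals("rdf:type") || !x[2].equals(":Method")) continue;
            // Body triple 2: the same subject must have a :memberName.
            if (objectOf(graph, x[0], ":memberName") == null) continue;
            String b = "_:b" + fresh++;                      // plays the role of makeTemp(?B)
            inferred.add(t(b, "rdf:type", ":Accessor"));     // head triples are new facts
            inferred.add(t(b, ":getter", x[0]));
        }
        return inferred;
    }

    public static void main(String[] args) {
        List<String[]> graph = new ArrayList<>();
        graph.add(t(":m1", "rdf:type", ":Method"));
        graph.add(t(":m1", ":memberName", "getName"));
        graph.add(t(":f1", "rdf:type", ":Field"));           // does not match the body
        for (String[] x : fire(graph)) {
            System.out.println(String.join(" ", x));
        }
    }
}
```

A real forward-chaining engine would add the inferred triples back to the graph and keep firing rules until nothing new appears; the sketch stops after one pass to keep the body and head roles visible.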

The first four triples in the body match four key properties of ?A, which represents a Java method. The fifth refers to rdf:nil, also known as the empty list, and makes the precise statement that the method's parameter list is empty. (This might not be familiar if you've seen legacy RDF systems that don't support ordered collections.)

Jena Rules Builtins

The snippet regex(?memberName, '^get([A-Z].*)$', ?propName) is possibly the first thing which is a little unusual. In most programming languages you would write this function like this:

    ?propName=regex(?memberName,'^get([A-Z].*)$')

and perhaps a future version of the Jena Rules Language could let you write it this way. Instead, the Jena Rules Language treats the return value as just another parameter in the parameter list, rather than as something assigned on the left. Because of this, Jena Rules programs have a graph structure instead of the typical tree structure found in most programming languages. Other than that, the builtin is utterly conventional: it matches get followed by a capital letter and deposits the captured rest of the name into the ?propName variable.
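The calling convention can be mimicked in plain Java. The sketch below is an illustration only, not Jena's implementation: the `env` map is a hypothetical stand-in for the rule engine's variable bindings, and `RegexBuiltinSketch` is an invented name. The point is that the result arrives through the last argument, while the return value only says whether the clause matched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a builtin with an output parameter; not the Jena API.
final class RegexBuiltinSketch {
    // Mimics regex(?input, pattern, ?out): returns false when the input is
    // unbound or does not match; otherwise binds capture group 1 to outVar.
    static boolean regex(Map<String, String> env, String inputVar,
                         String pattern, String outVar) {
        String input = env.get(inputVar);
        if (input == null) return false;
        Matcher m = Pattern.compile(pattern).matcher(input);
        if (!m.matches()) return false;
        env.put(outVar, m.group(1));
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();
        env.put("?memberName", "getFirstName");
        boolean matched = regex(env, "?memberName", "^get([A-Z].*)$", "?propName");
        System.out.println(matched);                 // prints true
        System.out.println(env.get("?propName"));    // prints FirstName
    }
}
```

A name such as toString or getter fails the match, so in the rule the remaining clauses would never produce an accessor for it.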

At this point (imagining that execution proceeds downward) we are done matching and now we are computing values that we will insert into the graph via the head. Actually Jena can match and execute the clauses

First, lcFirst(?propName,?lcPropName) is a user-defined function (UDF) that Real Semantics adds to the Jena Rules engine. It lowercases the first letter of ?propName and puts the resulting string into ?lcPropName. Real Semantics comes with an extended library of user-defined functions that solve practical problems like this. As much as it would be desirable to express everything in rules, often the exact logic we need can be found in a function written in Java, in which case calling it is made as easy as possible.

The last two builtins come from the standard library and finish the job: uriConcat(unq:,?lcPropName,?propertyIs) and makeTemp(?B). The first creates a URI ?propertyIs by using unq: as a namespace and ?lcPropName as a local name. The second creates a new blank node ?B, which serves as a unique name for the data record to be created in the head.
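As a worked example of these two string steps, here is a plain-Java sketch of the values involved. The class `PropertyUriSketch` and its methods are hypothetical and only mirror the behavior described in the text; they are not the Real Semantics or Jena implementations.

```java
// Hypothetical helpers mirroring lcFirst and uriConcat as described above.
final class PropertyUriSketch {
    // Namespace bound to the unq: prefix in the rule file.
    static final String UNQ = "http://rdf.ontology2.com/unqualified/";

    // Lowercases only the first character and leaves the rest untouched.
    static String lcFirst(String s) {
        if (s.isEmpty()) return s;
        return Character.toLowerCase(s.charAt(0)) + s.substring(1);
    }

    // Joins a namespace and a local name into one URI string.
    static String uriConcat(String namespace, String localName) {
        return namespace + localName;
    }

    public static void main(String[] args) {
        String propName = "FirstName";               // captured from getFirstName
        String lcPropName = lcFirst(propName);       // firstName
        System.out.println(uriConcat(UNQ, lcPropName));
        // prints http://rdf.ontology2.com/unqualified/firstName
    }
}
```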

If you look at the above and squint you might see some similarity between that rule and a SPARQL CONSTRUCT query, particularly in that the structure of the query is

CONSTRUCT *head* WHERE *body*

Something that the Jena Rule shares with the CONSTRUCT query is that you can't do any computations in the *head*. For instance, you can't use a function like lcFirst there; instead, you have to do all of your thinking in the *body* and pass the results through variables. One thing that is different about Jena Rules, however, is that you can use builtins in the head that have side effects, such as print() and hide(). Creating such a set of builtins is a natural way to express a Java API that constructs Java objects to have effects on the world.

We are contributing a patch to the Jena Project, JENA-1204, which makes it straightforward to add locally scoped UDFs to Jena, and we are working on JENA-1201 to improve the function library.
