The Epistomat: Generating Consensus About RDF Ontologies and Rules

Jo Walsh

University of Openness

Schuyler Erle

O'Reilly & Associates, Inc.

Abstract

What happens when you take an ontology-driven interface generator, and feed it the ontology for OWL itself? A simple recursive technique allows for bootstrapping a world model and programmatically generating interfaces to it. Such a model can then serve as the platform for the evolution of collective consensus about ontologies and statements, by means of a curious game of Nomic.

Keywords

Ontology driven interface, epistemological agents, bot arbitration, knowledge representation

Introduction

As semantic web applications become more pragmatic, questions about ontology merging arise with more frequency. How will a user of the system be able to refine their worldview, accept certain statements from some proposed ontologies, but not others? How will they be able to collaborate and maintain their ontological commitments, which may need to change with their world view over time?

We describe a set of application layers, built over an HTTP-based RDF/XML application framework. At the core is a small 'ontomatic': a client interface providing awareness of RDF Schemas and a subset of OWL semantics, which is used to generate several interfaces to the application. Each of these interfaces can then be used to query and manipulate the model.

Following the example of Nomic, a game of self-amendment, the ontomatic is employed as a basis upon which groups can fashion consensus about collected statements and ontologies by means of metastatement, e.g. voting, acceptance states, and commentary. We will look at how this emergent 'epistomat' can be used to model individual and group world views, bootstrap the consensus process, resist common forms of 'semantic attack', and facilitate the generation of meaningfully networked views of the Web.

Generating Interfaces with RDFS and OWL

By definition, ontologies specify the way that people interface with information. An ontology outlines the kinds of statements we can make about objects in a particular domain, directing and constraining the ways we interact conceptually with those objects. One can easily take this a step further, and imagine building user interfaces by simply mapping form fields or prompts to the kinds of statements an ontology suggests we can make about objects of a given type. Interfaces for creating and amending statements about those objects can then programmatically emerge from ontological metadata about the RDF classes in question.

From this approach, we can easily imagine an interface to any RDF model, given an OWL ontology, or rather a selection of annotated classes and properties from different ontologies, which provide a 'view' into the model. This view can then be used to automatically generate interfaces to the model in a web browser, an 'instant message' conversational interface, a GUI description language such as XUL, or other potential targets. We will refer to the type of system that provides such a user interface to an underlying RDF model as an 'ontomatic'.

the core model of the ontomatic

A basic RDF model used to bootstrap the ontomatic.

The ontomatic itself can be bootstrapped from a small 'core model': a tiny subset of RDF and RDFS semantics sufficient to describe the basic features of OWL; a kind of RDF-Tiny. An open installation with data from the wild, subject to a collaborative process, would likely want a more complete awareness of these schemas, though we may not want to see every detail in the interface, or use them all in inference augmentation attempts.

We then seed the ontomatic with knowledge of basic common ontologies - FOAF, spacenamespace, iCal, Dublin Core, ISO vocabularies for countries and world cities, as well as the McC vocabulary describing organisational networks and structures, which we are in the process of creating. We are currently using the ontomatic to manage and publish this last ontology collaboratively, using the web interface. This model will later form the basis of a 'microtheory', or 'worldview', that a person or group or bot can subscribe to, in order to filter new statements, or ask certain questions about the world.

Thus we build up aspects of ontologies that we like, or that are meaningful to us, and connect them together. The same interface allows us to edit and add subject-verb-object statements, and the drop-down lists for suggestions of new properties are populated by simple reasoning on behalf of the model - for example, if a verb 'isInCountry' has a range of iso:Country, then we should populate the list of candidate objects with the labels of all things of type iso:Country, and so on. Additional hints about interface styling or preferences can be provided by further annotation of the given model's ontologies - and, once bootstrapped with a core model, these embellishments can be added via the ontomatic itself.
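The drop-down reasoning described above can be sketched in a few lines. This is a minimal illustration rather than the ontomatic's own code: the model is a plain set of triples, the `ex:` prefix and the sample data are hypothetical, and the class of candidate values is assumed to be read from the property's rdfs:range.

```python
RDF_TYPE = "rdf:type"
RDFS_LABEL = "rdfs:label"
RDFS_RANGE = "rdfs:range"

# A toy model as a set of (subject, predicate, object) triples.
# The ex: names and the instance data are invented for illustration.
MODEL = {
    ("ex:isInCountry", RDFS_RANGE, "iso:Country"),
    ("iso:FR", RDF_TYPE, "iso:Country"),
    ("iso:FR", RDFS_LABEL, "France"),
    ("iso:JP", RDF_TYPE, "iso:Country"),
    ("iso:JP", RDFS_LABEL, "Japan"),
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:alice", RDFS_LABEL, "Alice"),
}


def objects(model, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the model."""
    return {o for s, p, o in model if s == subject and p == predicate}


def subjects(model, predicate, obj):
    """All subjects s such that (s, predicate, obj) is in the model."""
    return {s for s, p, o in model if p == predicate and o == obj}


def choices_for(model, prop):
    """Sorted (label, uri) pairs for every resource typed by the property's range."""
    options = set()
    for cls in objects(model, prop, RDFS_RANGE):
        for resource in subjects(model, RDF_TYPE, cls):
            # Fall back to the URI itself when a resource has no label.
            labels = objects(model, resource, RDFS_LABEL) or {resource}
            for label in labels:
                options.add((label, resource))
    return sorted(options)


print(choices_for(MODEL, "ex:isInCountry"))
```

A property with no declared range yields an empty list, which an interface could treat as a free-text field.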

Next, we can provide a stateful conversational interface to the ontomatic, by way of a generic 'ontobot', accessible via Internet Relay Chat or an Instant Message protocol. During process startup, the bot consults an annotated ontology at a URI; this can be any ontology as long as it contains an rdfs:label for each class and property - this is used as short-hand for the bot. In this application, the bot consults the ontology at a URI which maps to a method on the RDF/XML HTTP API; alterations made to the model can thereby be instantly reflected in the interface to it.

The bot has a simple conversational grammar, patterned insofar as is possible on a given natural language, in which different sentences are mapped to different requests to the web interface. The base bot is designed for exploring, annotating and editing the RDF model; the Nomic bot we describe below extends this capability to allow voting on, and expressions of opinion about, statements in the model. The selection of classes and properties from the many ontologies available from the ontomatic can then filter the user's view of what they can create (and possibly search for).
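To make the idea of mapping sentences to requests concrete, here is a small sketch. The sentence patterns, the request paths (`/describe`, `/assert`) and the `LABELS` table are all hypothetical; the paper does not specify the bot's grammar or the HTTP API, so this only shows the general shape of such a dispatcher.

```python
import re
from urllib.parse import urlencode

# Hypothetical sentence patterns and request paths; neither is taken from the paper.
GRAMMAR = [
    (re.compile(r"^what is (?P<subject>.+?)\??$", re.I), "GET", "/describe"),
    (re.compile(r"^(?P<subject>\S+) (?P<verb>\S+) (?P<object>.+)$"), "POST", "/assert"),
]

# Short labels the bot would have read from the annotated ontology at startup.
LABELS = {"isInCountry": "ex:isInCountry"}


def parse(sentence, labels):
    """Map one chat sentence to a (method, url) pair, or None if nothing matches."""
    text = sentence.strip()
    for pattern, method, path in GRAMMAR:
        match = pattern.match(text)
        if not match:
            continue
        params = match.groupdict()
        if "verb" in params:
            # Only verbs that appear as labels in the ontology are understood.
            if params["verb"] not in labels:
                continue
            params["verb"] = labels[params["verb"]]
        return method, path + "?" + urlencode(params)
    return None


print(parse("what is Paris?", LABELS))
print(parse("Paris isInCountry France", LABELS))
print(parse("Paris likes France", LABELS))
```

Because the verb is looked up in the label table, reloading the ontology changes what the bot can say without touching the grammar itself.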

In developing an HTML interface to the model, an essentially similar approach was taken, by connecting the web application to the RDF/XML web service on the same server. The OWL ontology is annotated with appropriate labels and comments for each class and property, and the domain of the properties determines the classes to which they apply, and thus how they are to be displayed in an editing or a constrained browsing interface.
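A sketch of how the domain of each property could select the fields of an editing form follows. The `http://example.org/` vocabulary and its annotations are invented for the example, and the field dictionaries are just one possible intermediate form from which HTML could be rendered.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DOMAIN, LABEL, COMMENT = RDFS + "domain", RDFS + "label", RDFS + "comment"
EX = "http://example.org/"  # hypothetical vocabulary, for illustration only

ONTOLOGY = [
    (EX + "isInCountry", DOMAIN, EX + "Place"),
    (EX + "isInCountry", LABEL, "country"),
    (EX + "isInCountry", COMMENT, "The country this place lies in."),
    (EX + "population", DOMAIN, EX + "Place"),
    (EX + "population", LABEL, "population"),
    (EX + "worksFor", DOMAIN, EX + "Person"),
    (EX + "worksFor", LABEL, "employer"),
]


def first(triples, subject, predicate, default=""):
    """First object found for (subject, predicate), or the default."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return default


def form_fields(triples, cls):
    """One field description per property whose rdfs:domain is the given class."""
    fields = []
    for s, p, o in triples:
        if p == DOMAIN and o == cls:
            fields.append({
                "property": s,
                "label": first(triples, s, LABEL, s),  # fall back to the URI
                "help": first(triples, s, COMMENT),
            })
    return sorted(fields, key=lambda field: field["label"])


for field in form_fields(ONTOLOGY, EX + "Place"):
    print(field["label"], "-", field["help"])
```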

A Game of Nomic

Nomic is a game with some number of players, which is played by iteratively re-writing the rules of the game. In its original conception, a game of Nomic commences with a set number of 'immutable' rules, one of which is that immutable rules cannot be changed. The version of Nomic which we play and attempt to apply to voting on ontologies is referred to as 'Pure Nomic', in which no rules are immutable, and play opens with as few rules as can possibly enable a functioning system.

To this end, we collaborated on the development of a Pure Nomic bot for IRC, which keeps track of proposed rules, and allows voting and commentary on them. The object is to propose rules or emendations to rules and get them accepted; there may be a point-scoring and 'winning' threshold (as in traditional Nomic), but our original version was more of an experiment in participatory rule-based democracy amongst a small group, and a search for rules that would reinforce the continuation of the game (as Pure Nomic games tend to stabilize and 'end' after a while).

In Nomic, one votes on rules containing descriptions of the game, or of other things in the world, and which may refer to each other in ways like "this rule supersedes rule 6 and adds this coda."

A simple graph model for a Nomic vote.

Consider, then, if we are voting not on full text rules but on statements expressed in OWL, or on inference rules expressed in a higher level language.

If we choose to vote directly on statements made in an OWL ontology, voting on Classes and their Properties, we need a way to consider them in collections, to be able to "agree with all statements with this subject" without having to vote through one by one, losing a sense of the bigger picture. We may wish to define closures based on common patterns in a model or a domain; we may wish to use the OWL schema itself to suggest entailments and implications of statements. Equally, we need to provide useful interfaces for ontology traversal and graph visualisation closely tied to this voting process.

A Voting Model for Ontologies and Rules

An RDF graph model suggesting an implementation of votes on statements using reification. The interface allows for acceptance or rejection of statements in batches, where appropriate.

Each vote within a model essentially corresponds to a statement of belief about another statement or subgraph, a confirmation or refutation of it. Our game of Nomic eventually postulated seven states in which an agent can position themselves in relation to a question: complete agreement, general agreement, partial agreement, neutrality, partial disagreement, general disagreement and complete disagreement. A combination of these states can be used to compute a threshold for acceptance or refutation of a particular statement.
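One way such a threshold might be computed is sketched below. The seven state names come from the text; the integer weights from -3 to 3, the use of a mean, and the cut-off values are assumptions made purely for illustration.

```python
from enum import IntEnum


class Stance(IntEnum):
    """The seven positions named in the text; the numeric weights are assumed."""
    COMPLETE_DISAGREEMENT = -3
    GENERAL_DISAGREEMENT = -2
    PARTIAL_DISAGREEMENT = -1
    NEUTRAL = 0
    PARTIAL_AGREEMENT = 1
    GENERAL_AGREEMENT = 2
    COMPLETE_AGREEMENT = 3


def decide(votes, accept_at=1.5, reject_at=-1.5):
    """Classify a statement from the mean stance of its voters.

    `votes` maps a voter identifier to a Stance. The thresholds are
    hypothetical defaults; in a Nomic game they would themselves be rules.
    """
    if not votes:
        return "undecided"
    mean = sum(votes.values()) / len(votes)
    if mean >= accept_at:
        return "accepted"
    if mean <= reject_at:
        return "rejected"
    return "undecided"


votes = {
    "alice": Stance.COMPLETE_AGREEMENT,
    "bob": Stance.GENERAL_AGREEMENT,
    "carol": Stance.NEUTRAL,
}
print(decide(votes))
```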

What happens if a statement is contained in one subgraph that you agree with, and in another that you don't? In this event, a participant would have to propose an intersection - or have a game heuristic that provided an intersection - as a new rule, and it would then be that player's responsibility to convince other players of its consistency. Ideally, these proposals should be expressed in some logical form with an unambiguous interpretation.

We can then filter our view of the ontology to only include those elements about which everyone in the group is in complete accord; or filter to only include statements confirmed by one person - essentially a microtheory reflecting a person's system of beliefs - or the intersection of any set of people, perhaps drawing those sets from enumerated FOAF networks. Thus the application is looking more like an epistomatic than an ontomatic.
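The filtering described above reduces to set operations once each person's confirmed statements are known. In this sketch the `CONFIRMED` table and its contents are hypothetical; in practice the sets would be derived from recorded votes, and the list of people might come from a FOAF network.

```python
# Hypothetical record of which statements each participant has confirmed.
S1 = ("ex:paris", "ex:isInCountry", "iso:FR")
S2 = ("ex:tokyo", "ex:isInCountry", "iso:JP")
S3 = ("ex:paris", "ex:population", "2000000")

CONFIRMED = {
    "alice": {S1, S2, S3},
    "bob": {S1, S2},
    "carol": {S1},
}


def microtheory(confirmed, person):
    """The statements a single person has confirmed."""
    return set(confirmed.get(person, set()))


def shared_view(confirmed, people):
    """The statements that every listed person has confirmed."""
    people = list(people)
    if not people:
        return set()
    view = microtheory(confirmed, people[0])
    for person in people[1:]:
        view &= microtheory(confirmed, person)
    return view


print(shared_view(CONFIRMED, ["alice", "bob"]))
print(shared_view(CONFIRMED, CONFIRMED))  # complete accord across the group
```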

We can also develop a voting system, based on opinions about queries as expressed in UL-1, a small RDF query and rule language developed by the authors with this application in mind. UL-1 is a second-generation RDF query language sharing some characteristics with Sesame's SeRQL, with composability as a principal design aim; it should be easy to read and understand in our Nomic application. UL-1 statements are stored in the model as strings, and can be called up and 'enacted'.

UL-1 can express assertions of OWL statements, and also express 'constructs', or sub-graphs of the model. This way, we can vote on whole sub-graphs, or make refinements of them as queries expressed in UL-1. These queries are then run for demonstration in a voting interface, providing a visualisation of the model. Results are returned as a graph, usually serialised in RDF/XML.

We have not essayed an implementation, but strongly feel a Nomic voting system would be perfect for RuleML, or a similar rule language with an RDF/XML graph representation. We can run the rule subject to vote in-memory and demonstrate the new graph returned by its production. Votes which are enacted and which have entailments immediately become applied to the model. We posit being able to describe the mechanics of the game - "This statement can only have the property 'approved' if more than 75% of votes are in agreement and none in more than slight disagreement" - which would immediately propagate into the rules in play.
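The example rule quoted above ("more than 75% of votes are in agreement and none in more than slight disagreement") can be written as a simple predicate. The integer encoding of stances from -3 to 3, and the reading of 'slight' disagreement as the mildest negative stance (-1), are assumptions of this sketch rather than anything the game prescribes.

```python
def approved(stances):
    """Apply the quoted rule to a list of integer stances in the range -3..3.

    Agreement is assumed to mean any positive stance; 'more than slight
    disagreement' is assumed to mean any stance below -1.
    """
    if not stances:
        return False
    agreeing = sum(1 for stance in stances if stance > 0)
    strongly_against = any(stance < -1 for stance in stances)
    return agreeing / len(stances) > 0.75 and not strongly_against


print(approved([3, 2, 1, 1, -1]))   # four of five agree, one slight objection
print(approved([3, 3, 3, 0]))       # exactly 75% is not "more than 75%"
print(approved([3] * 9 + [-2]))     # one strong objection blocks approval
```

If the rule were itself stored as a statement in the model, replacing this predicate would be one move in the game.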

Immediately, we might try this as an interesting way to generate consensus on a collaborative ontology - providing some discussion mechanism through IRC, Jabber networks, email lists, web discussion forums, and so on - and as a side effect, to be able to generate a query interface into that, or any other, ontology.

Ontomation and the Semantic Self-Defense

The ontomatic allows a group of users to add and manage OWL ontologies interactively, or to import them from URIs. There is a third and interesting option: that of running a 'scutter', an RDF spidering bot, collecting triples from the wild. A near-future related option would be to accept a stream of triples from an RDF aggregator providing a scuttering service; however, these RDF aggregators provide scuttered data that may well have already been subjected to a 'smushing', or some other inference augmentation process.

Naturally, we don't want to take everything our scutter comes across on trust, and immediately attempt to augment our model with it. There are many ways a reasoner or a scutter can be subjected to 'semantic attack': by misuse of commonly implemented OWL rules, e.g. false inverseFunctionalProperties to generate false sets of joined statements; or of false subClassOf and subPropertyOf assertions, to create lots of spurious attributes in the interface; or the far more trivial attack of simply asserting a lot of factual lies or gobbledegook data in RDF.
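The inverseFunctionalProperty attack is easy to demonstrate with a toy 'smusher'. The sketch below merges subjects that share a value for any property treated as inverse functional, using a small union-find. The `ex:` resources and `ex:favouriteColour` are invented; foaf:mbox is the usual legitimate example of such a property.

```python
def smush(triples, inverse_functional):
    """Group subjects that share a value for any property treated as inverse functional."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    owner = {}  # (property, value) -> first subject seen with that value
    for s, p, o in triples:
        find(s)
        if p in inverse_functional:
            key = (p, o)
            if key in owner:
                parent[find(s)] = find(owner[key])
            else:
                owner[key] = s

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


TRIPLES = [
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:alice2", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:bob", "foaf:mbox", "mailto:bob@example.org"),
    ("ex:alice", "ex:favouriteColour", "blue"),
    ("ex:bob", "ex:favouriteColour", "blue"),
]

# Honest rule set: only the mailbox identifies a person.
print(smush(TRIPLES, {"foaf:mbox"}))
# Hostile rule set: a scuttered document also declares favouriteColour inverse functional.
print(smush(TRIPLES, {"foaf:mbox", "ex:favouriteColour"}))
```

With the hostile declaration accepted, Alice and Bob collapse into a single node, which is exactly the kind of false join the text warns about.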

We have several strategies for maintaining the conceptual integrity of a model, whilst opening it up to free input 'from the wild', based on the workings of the ontomatic and the epistomat. Though developed to deal with the particular pressures of a scuttered data store, they apply equally to an internal-use RDF-based system with a reasoning component and human access to predicates which affect the operation of rules.

One important tactic is to keep a 'ground model' of every node and arc in an RDF graph encountered by the interface. In our implementation, the storage consists of a simple table containing every node, and a second table of triples alongside context, timestamp, and some search keys. Every statement we ever hear is recorded in it and never removed; no smushing or other normalising changes to the model based on the operation of OWL rules are contained in this model. Retractions are simply recorded in kind as metastatements. Our candidate statements for OWL constraints and other rules that we will accept are also expressed within this 'ground' repository. It turns out this representation proves useful for versioning, and for applications which expect frequent changes in information available from many single sources, such as a FOAF network, or a news aggregator. When we want to generate an RDF model based on an ontology, or when we want to generate an ontology based on rules about it or votes regarding it, we can trivially extract a sub-graph for the present or any time in the past, and then augment the selected sub-graph with the then-accepted inference rules.
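A minimal sketch of such an append-only ground store, using SQLite from the Python standard library, might look as follows. The two-table layout follows the description above, but the column names, the `kind` column used to mark retractions, and the integer timestamps are assumptions; search keys are omitted, and a retraction here withdraws the triple regardless of which context originally asserted it.

```python
import sqlite3

SCHEMA = """
CREATE TABLE node (id INTEGER PRIMARY KEY, value TEXT UNIQUE NOT NULL);
CREATE TABLE triple (
    id        INTEGER PRIMARY KEY,
    subject   INTEGER NOT NULL REFERENCES node(id),
    predicate INTEGER NOT NULL REFERENCES node(id),
    object    INTEGER NOT NULL REFERENCES node(id),
    context   TEXT NOT NULL,     -- where the statement was heard
    heard_at  INTEGER NOT NULL,  -- timestamp of hearing it
    kind      TEXT NOT NULL      -- 'assert' or 'retract' (assumed encoding)
);
"""


class GroundStore:
    """Append-only record of every statement heard; rows are never deleted."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.executescript(SCHEMA)

    def _node(self, value):
        self.db.execute("INSERT OR IGNORE INTO node (value) VALUES (?)", (value,))
        return self.db.execute("SELECT id FROM node WHERE value = ?", (value,)).fetchone()[0]

    def record(self, triple, context, heard_at, kind="assert"):
        s, p, o = (self._node(value) for value in triple)
        self.db.execute(
            "INSERT INTO triple (subject, predicate, object, context, heard_at, kind) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (s, p, o, context, heard_at, kind),
        )

    def row_count(self):
        return self.db.execute("SELECT COUNT(*) FROM triple").fetchone()[0]

    def snapshot(self, at):
        """The set of triples standing as asserted at time `at`."""
        rows = self.db.execute(
            "SELECT s.value, p.value, o.value, t.kind FROM triple t "
            "JOIN node s ON s.id = t.subject "
            "JOIN node p ON p.id = t.predicate "
            "JOIN node o ON o.id = t.object "
            "WHERE t.heard_at <= ? ORDER BY t.heard_at, t.id",
            (at,),
        )
        latest = {}
        for s, p, o, kind in rows:
            latest[(s, p, o)] = kind  # later rows override earlier ones
        return {triple for triple, kind in latest.items() if kind == "assert"}


store = GroundStore()
fact = ("ex:alice", "foaf:knows", "ex:bob")
store.record(fact, "http://example.org/alice.rdf", heard_at=10)
store.record(fact, "http://example.org/alice.rdf", heard_at=20, kind="retract")
print(store.snapshot(15), store.snapshot(25), store.row_count())
```

Because nothing is deleted, the same store answers both "what do we believe now" and "what did we believe then", and any inference step can be re-run over a historical snapshot.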

Running a scutter on top of the data store, or an 'infomesh' behind the ontomatic, we maintain two separate RDF models based on the ground store - one representing the ontologies from which the interface rules are constructed, and another containing the general 'world model' of the application. Naturally, all triples with predicates in the RDF, RDFS, or OWL namespaces, and all statements containing their subjects or objects should go in the ontological model, while all other statements are placed into one or more general models. It is these latter general models against which most application queries will be performed in the course of ordinary use.
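The split between the ontological model and the general world model can be sketched as a namespace test. This follows one simple reading of the rule above, namely that a triple is ontological if any of its three terms lies in the RDF, RDFS, or OWL namespaces; the `http://example.org/` data is hypothetical, and all terms are assumed to be plain strings.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
SCHEMA_NAMESPACES = (RDF, RDFS, OWL)
EX = "http://example.org/"  # hypothetical data namespace


def is_schema_term(term):
    """True if the term is a URI in one of the RDF, RDFS or OWL namespaces."""
    return term.startswith(SCHEMA_NAMESPACES)


def partition(triples):
    """Split triples into (ontological model, world model)."""
    ontology, world = [], []
    for triple in triples:
        if any(is_schema_term(term) for term in triple):
            ontology.append(triple)
        else:
            world.append(triple)
    return ontology, world


TRIPLES = [
    (EX + "City", RDFS + "subClassOf", EX + "Place"),
    (EX + "isInCountry", RDF + "type", OWL + "ObjectProperty"),
    (EX + "paris", EX + "isInCountry", EX + "france"),
    (EX + "paris", EX + "population", "2000000"),
]

ontology, world = partition(TRIPLES)
print(len(ontology), "ontological triples,", len(world), "world triples")
```

Note that under this reading every rdf:type statement, including the typing of ordinary instances, lands in the ontological model; an implementation might prefer to treat rdf:type as a special case.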

In essence, we maintain thereby a separation comparable to that of 'mutable' versus 'immutable' rules in traditional Nomic. For an internal or critical system, such as a contacts management interface, we can even go so far as to keep the ontology of the world model completely constrained, and possibly even maintain it not in the store but in an external file at a URI. Access to instance data is managed by authentication, and there is provenance for every statement and any subsequent retraction or restatement.

In this event, we would want to be able to gather assertions made in OWL and RDFS but not statements about the OWL and RDFS vocabularies themselves - an OWL-DL-style restrictive approach to what we're prepared to infer. The new OWL statements have to be subjected to some kind of approval process before they can become real assertions that we are happy to repeat. Each individual subject URI in the model has a status, as status of terms is marked in the FOAF vocabulary, with an option to just not use or publish them.

We consider the following strategies for arbitrating RDF statements, including OWL statements which, if used to augment our model, may effect interesting changes on our world.

The final step we envision involves building a partially or totally automated aggregator of public opinion, by querying distributed models via a simple HTTP-based UL-1 or other query protocol, for triples related to trust in statements, and building an ontology model out of what we can scutter thereby.

If we are to perform arbitrary acts of 'buy-in' to statements about ontologies that we find on the 'Net, or we are told by people we don't know, we need a strategy for generating consistent epistemologies within models. The need to protect ourselves from semantic attack was an important motivator for our base assumption: we are keeping one enormous 'ground' model for the ontomatic which consists of every RDF statement that has ever been asserted to us, with provenance for it, but with no local 'smushing' or inference enhancement having been performed, and use this basis to generate 'sub-models' according to different views determined by ontological commitments.

To phrase this in terms of logic programming theory, we can state that the contents of this triplestore, including the OWL meta modelling statements, and all the predicates available to any query language and reasoner we run on top of it, represent the 'Herbrand universe' of our shared world model - all the symbols, ground, variable, or functional, available to us to combine. The potential actuation of all available inference rules over the triplestore is the Herbrand base.

Of course, many or even most statements in the Herbrand base will be unprovable except by the wildest configurations. An "implementation" of the Herbrand base will be a set of ontological constraints and inference rules that, when applied, will express a consistent, if possibly not coherent, worldview. A new model, augmented and compressed, generated from this implementation at a point in time, we'll refer to specifically as a "worldview", mostly to avoid conflating the logical and the RDF-related senses of the term "model".

Our game of Nomic comes in very useful now, as a source of meta-statements from which to generate candidate worldviews. We have some facility in generating different kinds of consent, which can be expressed in the rules of the game itself - rules about what percentage of votes have to be in a certain state for a rule to be accepted or rejected, for example. We can attempt to use one or several of the Nomic bot's native ideas of consensus to bootstrap the discussion - a provisional consensus worldview. By looking at the histories of individual voters, we can extrapolate a personal epistemology given the context for a query. Similarly, we can look at how different members or sub-groups of a group will cluster together over arbitrarily chosen key issues, and use this to generate sets of statements localised to the context of a given sub-group.

The Epistemological Congress

By augmenting the ontomatic with a model of consensus and principled hesitance, we can have it generate ontologies and rule sets that represent a particular world view - that of any group of one or more individuals in the system, including treating a group consensus view in the same way as an individual, allowing for a blunt federation of this system. We can use each ontology to generate a complete 'worldview' based on the ground model, possibly selecting only the view of classes and properties that it describes, and from there derive a full inference-augmented model, according to our worldview's prescriptions.

An actual storage strategy for such a system would be dependent on a less experimental use case than our own. For example, it will likely make sense to generate and incrementally maintain an augmented model for each individual, and perhaps for the general consensus according to the current rule state of the system. Meanwhile, other cross-sections of opinion could be generated afresh in response to a request, unless that request is frequent. One might liken this to the operation of indexes and views in SQL databases.

Note, we still don't have any way of proving that each world view will be consistent, nor do we yet have a metric for establishing 'relative' consistency between them. Our aspiration is that each worldview generated be at least sound, but probably incomplete. Further questions about correctness based on a process of consensus are absolutely fascinating in and of themselves, but more philosophical than the scope of this paper allows.

Conclusions

We believe that consensus negotiation of ontologies and rules - between business partners, or public service providers, to be briefly practical - is a critical and underexplored area. Despite the potential of topic-based trust mechanisms and collaborative reasoning systems, we believe that overt human feedback about belief is indispensable, especially at this stage of the development of the semantic web.

Nomic is a way of gathering those statements which is 'fun', and indicates the potential for an 'epistomat', a small but fully self-describing system, given a sufficiently composable rule language that operates directly on the application logic. Using the ontology composed by the Nomic game, we can try out different 'perceptions' - we hesitate to say 'belief systems' - represented as differently augmented graphs inferred from the same underlying base graph.

We also offer the consensus system provided by Nomic as an extra means of protecting a knowledge base from semantic attack, either from bad instance data or bad rules. We are very curious to see what graphs and tendencies emerge as these expressions of group consensus are aggregated both within and across larger organizations, and then across the Web at large.

Acknowledgements

The authors would like to thank the members of the #bots community on irc.perl.org for their support, participation, and feedback.

References