RDF Process Profile (RPP) is a lightweight, extensible description format for processes. RPP is an XML application, conforms to the W3C's RDF Specification and is extensible via XML-namespace and/or RDF-based modularization.
Mudman in the monsoon season.
Comments should be mailed to danny@panlanka.net
This Version is the latest : http://www.citnames.com/2001/04/rpp-schema.htm
Metadata is usually defined as data about data. What is proposed here is to apply exactly the same techniques to describe processes, moving away from data-centricity towards a more general resource-centricity. RDF Process Profile is an attempt to standardize the description of data processors and to allow their processes to be represented in the same fashion as any other (web) resource. RPP is intended as a lightweight layer based on RDF(S) which will allow simple description of a process, on top of which more sophisticated layers can be built.
The processes may be available online, though RPP is intended to be appropriate for both online and offline resources. To achieve the maximum applicability of process description, processes will wherever possible be identified through reference to their algorithm(s), though in practice it is anticipated that for most processes the algorithms themselves will not be available, only an implementation of the algorithm. The goal is to provide in an RPP document descriptions of all the resources required to carry out the data processing defined in that document.
RPP definitions may be defined in terms of other RPP definitions, allowing multistage/multilayered process definition. Terms defined in other schemas may be included to extend the functionality of an RPP document into other domains. The use of a standard metadata format (RDF/XML) should enable advertising and lookup/discovery of the processes described in RPP documents. It is hoped that the RPP format will provide a suitable base layer on top of which other facilities required by online services, for example process leasing, chain of trust and security management, can be built.
Where a language like DAML allows data to be marked up for agents' consumption, RPP will describe the agents themselves so they may be fed the right stuff.
It is likely that existing vocabularies contain terms that could be used in place of those described here. It is believed that, in terms of interoperability, it will be advantageous to encapsulate process-specific metadata in a single vocabulary, such as RPP. If considered appropriate, links to other vocabularies may be added later in the form of properties such as daml:equivalentTo.
An alternate view of RPP would be that of enabling a meta query system. We have some data that needs processing, or a requirement for data that is specified in metadata. We supply this to a system containing marshalling facilities and an inference engine, with access to RPP repositories. The inferencing required is little more than matching the conventional metadata with an algorithm described in RPP; the data and algorithm could then be marshalled to an appropriate processing host and the operation carried out. There may be data returned from the query but this need not always be the case - e.g. the state of an external system may be modified.
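The matching step described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the dictionary keys (`algorithm`, `input_format`, `location`), the example.org URIs and the `find_processes` helper are all hypothetical and are not part of RPP, and a real system would work over RDF rather than plain dictionaries.

```python
# Toy sketch of the "meta query" idea: match the metadata of some data
# against process descriptions held in a repository.
# All keys, URIs and names here are hypothetical, not RPP vocabulary.

REPOSITORY = [
    {
        "algorithm": "http://example.org/algorithms/thumbnail",
        "input_format": "http://example.org/formats/png",
        "location": "http://example.org/services/thumbnail",
    },
    {
        "algorithm": "http://example.org/algorithms/wordcount",
        "input_format": "http://example.org/formats/plain-text",
        "location": "http://example.org/services/wordcount",
    },
]


def find_processes(data_metadata, repository):
    """Return the descriptions whose input format matches the data's format."""
    wanted = data_metadata["format"]
    return [desc for desc in repository if desc["input_format"] == wanted]


data_metadata = {"format": "http://example.org/formats/plain-text"}
matches = find_processes(data_metadata, REPOSITORY)
for desc in matches:
    # A real system would now marshal the data to desc["location"].
    print(desc["algorithm"], "->", desc["location"])
```

Matching on a single format URI is the simplest possible stand-in for the inference engine; anything richer (subclass reasoning, multiple inputs) would need a proper RDF toolkit.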
4.1 Algorithm
Within this document this word is primarily used in the dictionary sense of
a process or set of rules used for calculation or problem-solving, though
the range of entities described by the term here extends from very abstract
procedures, e.g. 'draw a fish', through more canonically expressed forms such
as C source code, also encompassing black box data processors.
The actual detail of the description of an algorithm is in many respects not significant, as long as there is enough information for an inference engine receiving an RPP document and a set of metadata to be able to decide (through reference to external resources as necessary) whether or not the algorithm described can be realized in a form that can carry out the required processing of the data the metadata describes.
4.2 Process*
The entities being described by an RPP definition will be referred to as
data processors or processes, with no direct relation to XML processors. Within
this document the terms process and algorithm are used interchangeably, which
is sloppy, as in this context they may not refer to the same thing.
Informal prose is suggestive. Formal specification non-lucrative...
RPP follows the conventions for the RDF/XML syntax and model described in RDFMS. Additional elements are as follows :
Classes | Properties
Pretty picture from the wonderful RDFSViz :

Here's another view of the schema (even prettier picture!) from Protege using the OntoViz plugin (many thanks to Michael Sintek).
// I'm not at all sure about the scoping - my general feeling was that it would be most useful for the attributes to have their values described by reference to external documents, though additional local (& literal?) support might be more appropriate.
This resource - name and location (not necessarily the same as location property)
Objects without which the process cannot operate.
An identifier for the algorithm of the process - may be name or reference to source code etc.
A description of a data format. Typically this will be the URI of a schema. The schema may be a DTD, XML Schema, human language description or other type.
The low-level encoding of the data to/from the processor
The location of the process described by RPP (e.g. the URL to POST to for online processing or the URL pointing to an executable binary file).
Defines characteristics of the process in its role as a consumer of data. The domain and range of the data the algorithm consumes will be defined. An RPP definition of an algorithm can contain any number of input values, the only constraint being that there is sufficient description to fulfil the requirements of the rpp:availability property.
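To give a feel for what a description built from properties like those above might look like, here is a rough Python sketch that emits a small RDF/XML fragment. It rests on assumptions throughout: the `rpp:` namespace URI, the property names `algorithm`, `location` and `format`, and the example.org URIs are invented for illustration and are not taken from the schema. Values are given by reference (rdf:resource) rather than as literals, in line with the scoping comment above.

```python
# Sketch: build a small RDF/XML description of a process.
# The rpp: namespace URI and the property names (algorithm, location,
# format) are assumptions made for illustration only.
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RPP_NS = "http://example.org/rpp#"  # hypothetical namespace URI

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("rpp", RPP_NS)


def describe_process(about, algorithm, location, input_format):
    """Return an rdf:RDF element describing one process."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF_NS}}}Description",
                         {f"{{{RDF_NS}}}about": about})
    for name, uri in (("algorithm", algorithm),
                      ("location", location),
                      ("format", input_format)):
        # Each value is given by reference (rdf:resource), not as a literal.
        ET.SubElement(desc, f"{{{RPP_NS}}}{name}",
                      {f"{{{RDF_NS}}}resource": uri})
    return root


doc = describe_process(
    about="http://example.org/processes/wordcount",
    algorithm="http://example.org/algorithms/wordcount",
    location="http://example.org/services/wordcount",
    input_format="http://example.org/formats/plain-text",
)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```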
// Big holes - mind your step
This version : http://www.isacat.net/citnames/rpp.rdfs
// I know this is lousy - in terms of syntax & content (and I'm not even sure about the font), but I thought if I put this in at least it's a start - any flames I get should help the learning process ;-)
The general idea here seems reasonable to me - not distinguishing between the metadata of program and data just seems like updating von Neumann a bit.
The aim mustn't be confused with any kind of formal notations - this isn't about making rock-solid formalisms, just providing enough info to be able to use a process.
Detractors may (hopefully) say that RPP is a gross oversimplification - the goal is to simplify down to the barest minimum needed to do the job.
The first few RPP documents will be the hardest - once a document has been built to describe e.g. Python, identifying this will be adequate for RPPs of processes that use Python (rdfs:isDefinedBy).
It'd be nice to have some clear indication of when the end result of a process is a graphical representation, but I couldn't think of a way without it seeming overly arbitrary.
The way CC/PP wraps up attributes in a 'Component' object appealed to this code junky - seems like a good sub-pattern of 'Profile'.
Given that the primary context for RPP is the web, it has not been mentioned here that it would be desirable to be able to create an RPP document for any given process - e.g. how an egg is boiled. This may be possible with RPP as defined in this document; however, this hasn't been put to the test. Hopefully the next version of this specification will take this more into account.
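One of the notes above suggests that once a document describes e.g. Python, later RPPs can simply point at it with rdfs:isDefinedBy. A minimal Python sketch of following such links might look like this. The URIs, the dictionary layout and the `resolve_definitions` helper are hypothetical; only the idea of the rdfs:isDefinedBy link comes from the note.

```python
# Sketch: reuse a shared description through an rdfs:isDefinedBy-style link.
# The URIs and dictionary keys are hypothetical.

DESCRIPTIONS = {
    "http://example.org/rpp/python": {
        "label": "Python interpreter",
    },
    "http://example.org/rpp/wordcount": {
        "label": "Word count script",
        # Points at the shared description instead of repeating it.
        "isDefinedBy": "http://example.org/rpp/python",
    },
}


def resolve_definitions(uri, descriptions):
    """Follow isDefinedBy links from uri and return the URIs visited, in order."""
    chain = []
    while uri is not None and uri not in chain:  # guard against cycles
        chain.append(uri)
        uri = descriptions.get(uri, {}).get("isDefinedBy")
    return chain


chain = resolve_definitions("http://example.org/rpp/wordcount", DESCRIPTIONS)
print(chain)
```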
Need to add an 'owner' property (for use with agents)
Need to add a pointer to documentation - rpp:rtfm ?
DTD for instances
run it by rdf-interest & xml-dev
change my name, buy a mask & leave the country (again!)
© 2001 Danny Ayers All rights reserved.