RDF
In this section, we'll look at generating template output using RDF datasources. First, however, some background.
RDF, in mathematical terms, is a labeled directed graph: a set of nodes with arrows between them, where each node and each arrow has a label. Because it is a graph, a node can have any number of arrows pointing out of it and any number pointing at it. And because it is a graph, there is no real starting point or root node, so you can start anywhere. In the picture below, node A at the top has arcs pointing to B, C and D. Similarly, C has an arc pointing to D. You could have arcs pointing elsewhere; for example, node D could have an arc pointing back to A. To navigate around, you could start at node A and follow the arrows to B, C or D. Or you could start at B, go to A, and then go to C and D, and so on. Nothing requires you to follow the arrows in the direction they point; you can just as easily go the other way (though in only one direction within a given series of iterations). The picture was generated by the W3C's RDF validator, a good place to check whether your RDF is valid.
The red text gives the labels for the arrows, which are called predicates. In this example, all the arrows have the same label; usually, this won't be the case. Templates provide a means of navigating around using only arrows with specific labels. Here is one way to serialize this graph as RDF/XML, though there are many others.
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rel="https://www.xulplanet.com/rdf/">
  <rdf:Description rdf:about="https://www.xulplanet.com/rdf/A">
    <rel:relatedItem rdf:resource="https://www.xulplanet.com/rdf/B"/>
    <rel:relatedItem rdf:resource="https://www.xulplanet.com/rdf/C"/>
    <rel:relatedItem rdf:resource="https://www.xulplanet.com/rdf/D"/>
  </rdf:Description>
  <rdf:Description rdf:about="https://www.xulplanet.com/rdf/C">
    <rel:relatedItem rdf:resource="https://www.xulplanet.com/rdf/D"/>
  </rdf:Description>
</rdf:RDF>
For a XUL template query, you first need to select a starting point in the RDF graph. Once you have selected a starting point, you use a number of statements which indicate where to go next when navigating the graph. Eventually, you will end up with a set of nodes you consider the endpoints of your query. These become the results, and content is generated for each of them. Say you start at A. You could navigate to B, C and D by following the arrows forward and generate three blocks of output. Or you could start at D and follow two arrows back (following arrows only backwards). This produces one result, A. Look at the graph to see why only one result is generated in this case.
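The two navigations described above can be sketched in a few lines of Python. This is only an illustration of walking the example graph, not how the template builder is implemented. The `TRIPLES` list and the `follow_forward` and `follow_backward` helpers are hypothetical names, and the node URIs are shortened to single letters for readability.

```python
# Hypothetical in-memory copy of the example graph as
# (subject, predicate, object) triples. Names are shortened from full URIs.
TRIPLES = [
    ("A", "relatedItem", "B"),
    ("A", "relatedItem", "C"),
    ("A", "relatedItem", "D"),
    ("C", "relatedItem", "D"),
]

def follow_forward(nodes, predicate):
    """Return the nodes reached by following `predicate` arrows out of `nodes`."""
    return sorted({o for s, p, o in TRIPLES if p == predicate and s in nodes})

def follow_backward(nodes, predicate):
    """Return the nodes whose `predicate` arrows point at any of `nodes`."""
    return sorted({s for s, p, o in TRIPLES if p == predicate and o in nodes})

# Start at A and follow one arrow forward: three endpoints.
forward_results = follow_forward({"A"}, "relatedItem")

# Start at D and follow two arrows backward: one endpoint.
one_step_back = follow_backward({"D"}, "relatedItem")
two_steps_back = follow_backward(set(one_step_back), "relatedItem")
```

One step back from D reaches both A and C. Of those two, only C has an arrow pointing at it (from A), so the second step back leaves A as the single result.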
In XUL template terminology, the starting point is called the container or reference point and the endpoint is called the member. They are so called because the most common use is to gather the list of the members, or children, of a container. But this doesn't have to be the case. Any starting point and endpoints will do.
Nodes in RDF are identified by a string value. There are two types of nodes in RDF: resources, which usually represent 'things', and literals, which are values such as the names, dates or sizes of those things. A literal's value is, for example, the name of the thing, such as 'Fred'. A resource's value is a URI, which for your own RDF data you can just make up (though if you plan to share your model with others, it should be unique, preferably a URL for a site you own, to avoid conflicts when it is mixed with other data). We'll use the URIs of the resource nodes in a template. In the image, the resource URIs are the blue labels of each node. There are no literals in this example, but we'll see some later.
Let's say we want the starting point to be A from the above example graph. We will use A's URI (https://www.xulplanet.com/rdf/A) as the reference starting point. In a XUL template, you specify the starting point using the 'ref' attribute. Here is an example:
<vbox datasources="https://www.xulplanet.com/ds/sample.rdf" ref="https://www.xulplanet.com/rdf/A" flex="1">
This is an indicator that we want to construct a XUL template using the reference point with the URI 'https://www.xulplanet.com/rdf/A'.
Query Processing
A query for an RDF datasource consists of a number of statements, placed as children of the query element. During query processing, the template builder builds up a network of information such as:
- possible results that are available
- where content should be generated
- information that indicates what to do when the RDF datasource changes
This network of information remains for the lifetime of the template, or until it is rebuilt. The template builder uses a method based on the RETE algorithm to match data. This allows for a fairly efficient means of updating results when, for instance, a new statement is added to the RDF graph. Rather than rebuild the entire template, the algorithm allows only specific parts of the network of information to be re-examined. A similar method can be used when removing RDF statements.
While the information network created by the template builder contains a number of different pieces of necessary information, for the purposes of this discussion, we will only be interested in the list of possible results. The builder begins with a single possible result, called the seed. The builder processes each of a query's statements in sequence. To do this for a particular statement, the builder iterates over the possible results found so far and either accepts each result or rejects each result. For the first statement, only the seed will be available as a possible result. At each step, new possible results may be added, or more information pertaining to an existing result may be added to the network. Naturally, a rejected result will be removed. Once all results have been examined, the builder moves on to the next statement in the query. Once all statements have been analyzed, any results which still remain go on to become matches. The matches are the endpoints and will cause content to be generated. So, to summarize:
1. Start out with one possible result, the seed
2. Iterate over the results determined so far and augment them with additional data
3. Add any new possible results
4. Remove any rejected results
5. Repeat steps 2 to 4 for each query statement
6. Once done, all remaining results become matches
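The summary above can be sketched as a small loop. This is a simplified illustration under stated assumptions, not the builder's actual RETE-based implementation: `run_query`, `related_items` and the `ARCS` table are hypothetical names, and a statement is modeled as a plain function that maps one possible result to a list of results.

```python
def run_query(seed, statements):
    """Apply each statement in order to the current list of possible results.

    A statement is modeled here as a function that takes one possible result
    (a dict of variable -> value) and returns a list of results: an empty
    list rejects it, one entry keeps or augments it, and several entries
    add new possible results.
    """
    results = [seed]                  # start with only the seed
    for statement in statements:      # repeat for each query statement
        next_results = []
        for result in results:        # iterate over the results found so far
            # augment, add, or reject depending on what the statement returns
            next_results.extend(statement(result))
        results = next_results
    return results                    # whatever remains becomes the matches


# Hypothetical arrow table for the example graph (names shortened from URIs).
ARCS = {"A": ["B", "C", "D"], "C": ["D"]}

def related_items(result):
    """Hypothetical statement: follow arrows from ?start and bind ?item."""
    return [dict(result, **{"?item": target})
            for target in ARCS.get(result["?start"], [])]

matches = run_query({"?start": "A"}, [related_items])
```

With A as the seed, the single statement replaces the seed with three possible results, one for each of B, C and D, and all three remain as matches.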
Each possible result is made up of a set of variable-value pairs. For instance, a result would look something like the following:
(?name = Fred, ?age = 5)
This result has two variables, ?name with the value 'Fred' and ?age with the value 5. Variables begin with a question mark, and values are RDF resources or literals. Here we will use strings for the values so they are easier to read. If we had two results, they might look like this:
(?name = Fred, ?age = 5) (?name = Mary, ?age = 12)
This is how we'll represent the potential results in this and the following discussions.
Later, we might have a statement which looks up the gender of each result and removes all Male results. So, our results after this might look like the following:
(?name = Mary, ?age = 12, ?gender = Female)
This statement has removed Fred from the potential results and added the ?gender variable for Mary. This is typical of how a query statement works: it adds variables to a result and filters out the results that don't match a particular value. If this were the last statement, Mary would go on to become a match to be displayed.
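The Fred and Mary example can be written out as a short sketch. The `GENDERS` table and the `add_gender_and_drop_male` function are hypothetical, introduced only to show one statement both adding a variable and rejecting a result; real template statements are written in XUL, not Python.

```python
# Hypothetical data: the gender recorded for each name.
GENDERS = {"Fred": "Male", "Mary": "Female"}

def add_gender_and_drop_male(results):
    """Bind ?gender on each result and reject those whose gender is Male."""
    kept = []
    for result in results:
        gender = GENDERS[result["?name"]]
        if gender == "Male":
            continue                      # rejected results are removed
        kept.append({**result, "?gender": gender})  # augmented with ?gender
    return kept

before = [
    {"?name": "Fred", "?age": 5},
    {"?name": "Mary", "?age": 12},
]
after = add_gender_and_drop_male(before)
```

Here `after` holds only Mary's result, now carrying the extra ?gender variable, which mirrors the single result shown above.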