RDF

RDF, the Resource Description Framework, is a technology for knowledge representation. It is designed to be understood by people and processed by computers.

An ontology is a formal description of knowledge. It lists the types of things that exist and the properties that are used to describe them.

An ontology is concerned with general knowledge about classes of things; a knowledge graph adds data about individual objects belonging to those classes.

Knowledge graphs tend to be much bigger than ontologies. For example, a book ontology is concerned with general concepts about books, not with individual instances. The ontology describes books, authors, publishers, and their relationships, while a book knowledge graph contains data on individual books, such as their titles, authors, and years of publication.

Both ontologies and knowledge graphs are represented as graphs. A graph consists of nodes and edges and is easy to view as an image, provided the graph is not very big. As the graph grows, images become less useful.

RDF, the Resource Description Framework, is commonly used as a method of specifying ontologies and knowledge graphs. A collection of statements describes the graph, each with a subject, a predicate, and an object.

Since there are three elements, such statements are also called triples. The predicate is also called the verb or property.
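To make the idea of triples concrete, here is a minimal Python sketch that stores a graph as a set of three-element tuples and looks statements up by pattern. The identifiers (book1, author1, title, author, name) and the `match` helper are hypothetical and only serve as illustration; this is not how any particular RDF library works.

```python
# A graph as a set of (subject, predicate, object) triples.
# All names and values here are made-up placeholders for illustration.
triples = {
    ("book1", "title", "An Example Book"),
    ("book1", "author", "author1"),
    ("author1", "name", "A. Writer"),
}

def match(graph, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        (t for t in graph
         if (s is None or t[0] == s)
         and (p is None or t[1] == p)
         and (o is None or t[2] == o)),
        key=repr,  # stable order, also works when objects have mixed types
    )

# Everything stated about book1 as a subject
book_facts = match(triples, s="book1")
# Every statement that uses the predicate "name"
names = match(triples, p="name")
```

Each tuple corresponds to one edge of the graph: the subject and object are nodes, and the predicate labels the edge between them.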

RDF is usually written in XML, the Extensible Markup Language; however, there are other formats for RDF data, since XML is fine for automated processing but very tedious for human readers. Note that XML is also used for many other types of data, not just RDF.

Turtle and N3

Turtle is a more readable format for RDF data. It is also a subset of N3, described in the Semantic Web Primer:

w3.org/2000/10/swap/Primer

RDF uses URIs (Uniform Resource Identifiers) to identify objects. URIs look very much like URLs. However, here that format is just used to uniquely identify objects, such as

http://example.com/people#tom

Leaving out everything before the hash sign # gives <#tom>, which identifies tom in the current document. In Turtle/N3 we can write:

<#tom> <#knows> <#jane> .
<#jane> <#age> 28 .

The meaning for the human reader is obvious.

While subject and verb are stated as URIs, the third part of the statement can also be a literal, such as a string or an integer. Note that #age acts as a property, while #knows acts as a relationship.
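The following Python sketch illustrates both points: how a relative reference such as <#tom> resolves to a full URI, and how a URI object differs from a literal. It assumes that the current document is http://example.com/people; the `URI` marker class and the helper names are hypothetical.

```python
from urllib.parse import urljoin

# Assumption: this is the URI of "the current document".
BASE = "http://example.com/people"

class URI(str):
    """Marks a string as a URI so it can be told apart from a string literal."""

def resolve(ref, base=BASE):
    """Resolve a relative reference such as '#tom' against the base URI."""
    return URI(urljoin(base, ref))

# Subject and predicate are always URIs; the object is a URI or a literal.
statements = [
    (resolve("#tom"), resolve("#knows"), resolve("#jane")),  # relationship
    (resolve("#jane"), resolve("#age"), 28),                 # property with a literal
]

def is_literal(obj):
    """In this sketch, anything that is not a URI instance counts as a literal."""
    return not isinstance(obj, URI)
```

Here `resolve("#tom")` yields http://example.com/people#tom, the full identifier shown earlier.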

Namespaces

The identifiers we used work just fine in our own document, but when we process data from different sources there may well be a name clash: the same name is used in another source in a different way. On the other hand, we do not want to always write the full URI in our statements.

Namespaces solve the problem of name clashes. In the following statement the meaning of the word 'title' is clear for fantasy fans since lotr commonly means Lord of the Rings in a fantasy context.

<#lotr> <#title> "Lord of the Rings" .

However, the next statement also uses the word 'title':

<#tom> <#title> "Managing Director" .

For the human reader the meaning of 'title' is again clear from the context. Out of context, however, the word 'title' can refer to a number of things.

To clarify our meaning of terms we can use a namespace, such as the Dublin Core, a small vocabulary to describe resources:

@prefix dc: <http://purl.org/dc/elements/1.1/> .
<#mydoc> dc:title "Some N3 Examples" .

Let's add another prefix for the FOAF vocabulary (Friend of a Friend):

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<#tom> foaf:name "Tom" .
<#jane> foaf:name "Jane" .

We are using a pre-defined vocabulary identified by the prefix foaf. There are a number of well-known vocabularies:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

The empty prefix refers to this document, which can be specified as

@prefix : <#> .
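A prefix is simply an abbreviation for the start of a URI. As a rough Python sketch, expanding a prefixed name means looking up the prefix and appending the local part. The `expand` helper is hypothetical, and the URI chosen for the empty prefix assumes that this document lives at http://example.com/people.

```python
# Prefix table mirroring the @prefix declarations in this section.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    # Assumption: the empty prefix points at this (example) document.
    "": "http://example.com/people#",
}

def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'dc:title' into a full URI."""
    prefix, _, local = name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefixes[prefix] + local

dc_title = expand("dc:title")
tom = expand(":tom")
```

Because the two 'title' terms from the earlier examples would expand to different URIs once they come from different namespaces, they can no longer clash.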

Defining an Ontology and Adding Instance Data

In addition to providing a vocabulary, an ontology usually provides a type hierarchy and restrictions for predicates. First we want to specify types, i.e. classes, for things:

:Person rdf:type rdfs:Class .

Since this is done so often, there is a special keyword in N3 that acts as a shorthand for rdf:type, simply 'a':

:Person a rdfs:Class .

Now that we have defined a class for people we can add instances to that class:

:tom a :Person .
:jane a :Person .

RDFS provides a number of vocabulary elements to specify details for classes and properties, such as hierarchy:

:Man a rdfs:Class; rdfs:subClassOf :Person .
:Woman a rdfs:Class; rdfs:subClassOf :Person .

Now, when we say that

:martha a :Woman .

it follows logically that

:martha a :Person .
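The subclass inference above can be sketched in a few lines of Python. This is a simplified illustration with triples as plain tuples of prefixed names, not a full RDFS reasoner; the `infer_types` function is hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    (":Man", SUBCLASS_OF, ":Person"),
    (":Woman", SUBCLASS_OF, ":Person"),
    (":martha", RDF_TYPE, ":Woman"),
}

def infer_types(graph):
    """Add (x rdf:type C2) whenever (x rdf:type C1) and (C1 rdfs:subClassOf C2) hold."""
    inferred = set(graph)
    changed = True
    while changed:  # repeat until nothing new appears (handles chains of classes)
        changed = False
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for c1, p2, c2 in list(inferred):
                if p2 == SUBCLASS_OF and c1 == o and (s, RDF_TYPE, c2) not in inferred:
                    inferred.add((s, RDF_TYPE, c2))
                    changed = True
    return inferred

closure = infer_types(triples)
```

After the call, `closure` contains the original three statements plus the derived statement that :martha is a :Person.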

When that logic is implemented we can automatically make such inferences. Similar inferences are possible from property restrictions:

:brother a rdf:Property .
:sister a rdf:Property .
:brother rdfs:domain :Person .
:brother rdfs:range :Man .
:sister rdfs:domain :Person .
:sister rdfs:range :Woman .

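These statements provide information on properties: rdfs:domain gives the class of a statement's subject, and rdfs:range gives the class of its object.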

Given these definitions we now make the following statement:

:martha :brother :albert .

This implies:

:albert a :Man .

This is clear to us human readers, but not to the machine. Just like the type hierarchy, the logic behind domain and range must be implemented in software for these inferences to be made. This type of software is known as a reasoner.
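As a rough illustration of what a reasoner does with domain and range, the following Python sketch derives types from the statements above. It is a toy under simplifying assumptions (triples as tuples of prefixed names, at most one domain and one range per property); the `infer_domain_range` function is hypothetical and far simpler than a real reasoner.

```python
RDF_TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

triples = {
    (":brother", DOMAIN, ":Person"),
    (":brother", RANGE, ":Man"),
    (":sister", DOMAIN, ":Person"),
    (":sister", RANGE, ":Woman"),
    (":martha", ":brother", ":albert"),
}

def infer_domain_range(graph):
    """Derive subject types from rdfs:domain and object types from rdfs:range."""
    # Simplification: keep a single domain and a single range per property.
    domains = {s: o for s, p, o in graph if p == DOMAIN}
    ranges = {s: o for s, p, o in graph if p == RANGE}
    inferred = set(graph)
    for s, p, o in graph:
        if p in domains:
            inferred.add((s, RDF_TYPE, domains[p]))  # the subject belongs to the domain
        if p in ranges:
            inferred.add((o, RDF_TYPE, ranges[p]))   # the object belongs to the range
    return inferred

result = infer_domain_range(triples)
```

From the single statement about :martha and :albert, the sketch derives that :martha is a :Person and that :albert is a :Man.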

EXERCISES: