RDF 1.2 Properties of Formal Structures

The purpose of this document is to define semantic extensions for interpreting the formal structures of propositions and literal values, as defined in [[[RDF12-CONCEPTS]]]. In addition, it provides examples of semantic interoperability patterns using OWL in conjunction with these extensions. This document assumes knowledge of the basic concepts of RDF and familiarity with reifiers, as exemplified in the [[[RDF12-PRIMER]]]. It furthermore relies on the semantic conditions defined in [[[RDF12-SEMANTICS]]]. The section on OWL-based interoperability patterns assumes an understanding of [[[OWL2-RDF-BASED-SEMANTICS]]].

Table 1: Prefixes and namespaces used in this specification
prefix	namespace IRI	definition
rdf	http://www.w3.org/1999/02/22-rdf-syntax-ns#	The RDF namespace [[RDF12-CONCEPTS]]
rdfs	http://www.w3.org/2000/01/rdf-schema#	The RDF Schema namespace [[RDF-SCHEMA]]
xsd	http://www.w3.org/2001/XMLSchema#	The XML Schema namespace [[XMLSCHEMA11-2]]
owl	http://www.w3.org/2002/07/owl#	The OWL namespace [[OWL2-OVERVIEW]]
(others)	(various)	All other namespace prefixes are used in examples only. In particular, IRIs starting with "http://example.org" represent some application-dependent IRI [[IRI]]

Properties of Formal Structures

In RDF, resources are regularly denoted by IRIs, or existentially quantified by blank nodes, and further described through statements about them, rendering their definitions more precise. This is not necessary for propositions and literal values, since their meaning is completely encoded by their corresponding terms—ground triple terms and literals—in the abstract syntax.

Propositions are fixed, invariant formal abstractions within the class extension denoted by rdfs:Proposition, having three intrinsic constituent components: the subject, predicate and object of the proposition. This structure forms a precise conceptual identity, corresponding to the triple terms, which are 3-tuples of RDF terms, acting as a compound key denoting the proposition through its structure within the class extension.

Formally, a blank node denotes some resource, so a triple term containing a blank node denotes some proposition, which is nevertheless functionally dependent on its constituents.

Literal values, while fundamentally different concepts (some defined outside of RDF), share this fixed, invariant characteristic. One difference is that a literal requires its datatype to be recognized (to be in D); if it is, the literal value is fixed in all interpretations recognizing that datatype.

Therefore, while these kinds of resources are conceptually just resources in the domain of discourse, they have a particular standing due to their formally fixed definitions. The RDF abstract syntax reflects this by allowing literals and triple terms only as objects of other triples.

Still, these invariant resources logically have properties. They are simply intrinsically defined, as an interpretation of the structural constituents of the resource within the space of their type.

In the following sections we will see how these facts can be formally entailed, and encoded using basic statements. The semantic extensions defined are based on the semantic conditions of RDFS interpretations, defined in [[[RDF12-SEMANTICS]]].

The [[[SPARQL12-QUERY]]] defines a number of functions on RDF terms. In principle (and noting that functions—including those on IRIs—operate on lexical forms), the unary functions defined there can correspond to predicates of the formal structures which these terms denote (in the logical sense, and expressed in RDF as statements). More precisely, if F is a SPARQL function and P is its associated predicate, for all RDF terms S on which F is defined, the statement S P F(S) must be true.
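The correspondence between a unary function F and its associated predicate P can be illustrated with a minimal sketch. It is not part of any specification: the `Literal` class, the `lang` stand-in for the SPARQL `LANG` function, its pairing with `rdf:language`, and the choice to treat a missing language tag as "F is not defined" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Literal:
    """A simplified RDF literal: a lexical form plus an optional language tag."""
    lexical: str
    lang: Optional[str] = None


def lang(term: Literal) -> Optional[str]:
    """Illustrative stand-in for a unary function F; None means 'not defined here'."""
    return term.lang


# Assumed pairing of predicates P with unary functions F (illustrative only).
PREDICATE_FOR: dict[str, Callable[[Literal], Optional[str]]] = {
    "rdf:language": lang,
}


def statements_for(term: Literal) -> set:
    """Return every statement (S, P, F(S)) for which F is defined on S."""
    result = set()
    for predicate, function in PREDICATE_FOR.items():
        value = function(term)
        if value is not None:
            result.add((term, predicate, value))
    return result
```

For a language-tagged term the sketch yields one statement per applicable function, and nothing for terms on which the function is treated as undefined.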

Formal Structure of Propositions

An rdfs:Proposition instance is precisely defined as a resource with the form of a triple, having exactly one subject resource, one predicate property and one object resource.

The proposition pattern:

XXX YYY <<( AAA BBB CCC )>> .

can thus be described as:

XXX YYY _:pp .
_:pp rdf:tripleSubject AAA .
_:pp rdf:triplePredicate BBB .
_:pp rdf:tripleObject CCC .

Note that the following is already entailed through RDFS semantics:

_:pp a rdfs:Proposition .

And this, provided the axioms below:

_:pp a rdf:TripleForm .
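The rewriting from the proposition pattern to its described form can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a normative algorithm: the `TripleTerm` class, the representation of other terms as plain strings, and the `_:ppN` blank node labels are all hypothetical.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class TripleTerm:
    """A triple term <<( s p o )>>; `o` may itself be a TripleTerm."""
    s: str
    p: str
    o: object


def describe_propositions(graph):
    """Return the described form: triple terms become blank nodes with component triples."""
    labels = {}        # one blank node label per distinct triple term
    fresh = count(1)
    out = set()

    def node_for(term):
        if not isinstance(term, TripleTerm):
            return term
        if term not in labels:
            bnode = labels[term] = f"_:pp{next(fresh)}"
            out.add((bnode, "rdf:tripleSubject", term.s))
            out.add((bnode, "rdf:triplePredicate", term.p))
            # Triple terms may nest in object position only.
            out.add((bnode, "rdf:tripleObject", node_for(term.o)))
        return labels[term]

    for s, p, o in graph:
        out.add((s, p, node_for(o)))
    return out
```

Because labels are keyed by the triple term itself, two occurrences of the same triple term map to the same blank node, mirroring the idea that the structure acts as a compound key.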

Extended Proposition Interpretation

An extended proposition interpretation is an RDFS interpretation I which also satisfies the following semantic condition, and all the triples in the subsequent table of axiomatic triples.

Proposition semantic condition.
If I(E) is in ICEXT(I(`rdfs:Proposition`)), and I(E) = RE(I(s), I(p), I(o)) where s is the subject, p is the predicate and o is the object component of E, then: <I(E), I(s)> is in IEXT(I(`rdf:tripleSubject`)), <I(E), I(p)> is in IEXT(I(`rdf:triplePredicate`)), and <I(E), I(o)> is in IEXT(I(`rdf:tripleObject`)).

Axiomatic triples of proposition forms.

rdf:TripleForm rdf:type rdfs:Class .

rdf:tripleSubject rdf:type rdf:Property .
rdf:tripleSubject rdfs:domain rdf:TripleForm .
rdf:tripleSubject rdfs:range rdfs:Resource .

rdf:triplePredicate rdf:type rdf:Property .
rdf:triplePredicate rdfs:domain rdf:TripleForm .
rdf:triplePredicate rdfs:range rdf:Property .

rdf:tripleObject rdf:type rdf:Property .
rdf:tripleObject rdfs:domain rdf:TripleForm .
rdf:tripleObject rdfs:range rdfs:Resource .
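How the domain and range axioms above yield type statements such as `_:pp a rdf:TripleForm` can be shown with a small sketch that applies the RDFS domain and range entailment patterns (rdfs2 and rdfs3) once. The tuple-based graph representation and the function name are assumptions for illustration; a real reasoner would apply the full RDFS rule set to a fixpoint.

```python
# Domain and range axioms copied from the axiomatic triples listed above.
AXIOMS = {
    ("rdf:tripleSubject", "rdfs:domain", "rdf:TripleForm"),
    ("rdf:tripleSubject", "rdfs:range", "rdfs:Resource"),
    ("rdf:triplePredicate", "rdfs:domain", "rdf:TripleForm"),
    ("rdf:triplePredicate", "rdfs:range", "rdf:Property"),
    ("rdf:tripleObject", "rdfs:domain", "rdf:TripleForm"),
    ("rdf:tripleObject", "rdfs:range", "rdfs:Resource"),
}


def type_entailments(graph, axioms=AXIOMS):
    """Apply the RDFS domain (rdfs2) and range (rdfs3) patterns once."""
    inferred = set()
    for s, p, o in graph:
        for prop, relation, cls in axioms:
            if prop != p:
                continue
            if relation == "rdfs:domain":
                inferred.add((s, "rdf:type", cls))   # subject is in the domain class
            elif relation == "rdfs:range":
                inferred.add((o, "rdf:type", cls))   # object is in the range class
    return inferred
```

Applied to the three component triples of `_:pp`, the sketch types `_:pp` as `rdf:TripleForm` and the predicate component as `rdf:Property`.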

Formal Structures of Literal Values

Literals are also denoted by tuple structures, specifically of an IRI denoting their datatype, and a string representing a lexical form. They are fundamentally different from propositions in that their lexical form may encode values of arbitrary complexity, depending on the datatype. The value so encoded is defined by the formal meaning of this datatype, which defines its value space. The result of decoding the lexical form according to this datatype is the literal value, having the datatype as its rdf:type, and other properties according to the formal meaning of its lexical encoding. It should be noted that such complexity is outside of the formal model, since resources denoted by literals are considered fixed points in the graph.

Some value spaces are formally specified in D-interpretations of [[RDF-SEMANTICS]], and in [[[RDF-PLAIN-LITERAL]]]. The following subsections extend these definitions by entailing statements about the literal values.

Directional Language Strings

A directional language string is a typed resource combining one string value with one rdf:language and one rdf:direction. This form acts as a key, uniquely identifying the literal value.

The directional language string pattern:

SSS PPP "..."@und--rtl .

can thus be described as:

SSS PPP _:dls .
_:dls rdf:value "..." .  # implies a structured value specialization? Use rdf:lexicalForm?
_:dls rdf:language "und"^^rdf:langTag .  # cf. dct:ISO639-1 | dct:ISO639-2
_:dls rdf:direction "rtl"^^rdf:langDir .  # or rather rdf:RTL ?

Note that the following is D-entailed:

_:dls a rdf:dirLangString .

Harmonize this pattern with rdf:CompoundLiteral possibly used for basic-encoding.

Extended Directional Language String Interpretation

An extended directional language string interpretation is an RDFS interpretation I which also satisfies the following semantic condition, and all the triples in the subsequent table of axiomatic triples.

Directional language string semantic condition.
If IL(E) is in ICEXT(I(`rdf:dirLangString`)) and IL(E) = <v, l, d> where v is the lexical form, l is the language tag in lower-case, and d is the direction of E, then: <IL(E), IL(v)> is in IEXT(I(`rdf:value`)), <IL(E), IL(l)> is in IEXT(I(`rdf:language`)), and <IL(E), IL(d)> is in IEXT(I(`rdf:direction`)); and: IL(v) is in ICEXT(I(`xsd:string`)), IL(l) is in ICEXT(I(`rdf:langTag`)), and IL(d) is in ICEXT(I(`rdf:langDir`)).

Axiomatic triples of directional language string forms.

  rdf:langString rdf:type rdfs:Datatype .
  rdf:dirLangString rdf:type rdfs:Datatype .
  rdf:langTag rdf:type rdfs:Datatype .
  rdf:langDir rdf:type rdfs:Datatype .

  rdf:language rdf:type rdf:Property .
  rdf:language rdfs:domain [owl:unionOf (rdf:langString rdf:dirLangString)] .
  rdf:language rdfs:range rdf:langTag .

  rdf:direction rdf:type rdf:Property .
  rdf:direction rdfs:domain rdf:dirLangString .
  rdf:direction rdfs:range rdf:langDir .
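The described form of a directional language string, including the lower-casing of the language tag required by the semantic condition, can be sketched as follows. The function name, the default `_:dls` node label, and the representation of literals as (lexical form, datatype) pairs are assumptions for illustration only; the sketch uses rdf:value as in the pattern above, which the inline comments there still flag as an open choice.

```python
def describe_dir_lang_string(subject, predicate, lexical, lang, direction, node="_:dls"):
    """Return the described form of `subject predicate "lexical"@lang--direction`."""
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")
    return {
        (subject, predicate, node),
        (node, "rdf:type", "rdf:dirLangString"),
        (node, "rdf:value", (lexical, "xsd:string")),
        # The semantic condition uses the language tag in lower-case.
        (node, "rdf:language", (lang.lower(), "rdf:langTag")),
        (node, "rdf:direction", (direction, "rdf:langDir")),
    }
```

Dropping the rdf:direction triple and typing the node as rdf:langString gives the analogous form for plain language strings.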

Language Strings

The language string pattern:

SSS PPP "..."@und .

can be described as:

SSS PPP _:dls .
_:dls a rdf:langString .
_:dls rdf:value "..." .  # Use rdf:lexicalForm?
_:dls rdf:language "und"^^rdf:langTag .

Extended Language String Interpretation

The definition of a semantic condition for rdf:langString is similar to the semantic condition of rdf:dirLangString, but without the rdf:direction part:

Language string semantic condition.
If IL(E) is in ICEXT(I(`rdf:langString`)) and IL(E) = <v, l> where v is the lexical form, and l is the language tag in lower-case of E, then: <IL(E), IL(v)> is in IEXT(I(`rdf:value`)), and <IL(E), IL(l)> is in IEXT(I(`rdf:language`)); and: IL(v) is in ICEXT(I(`xsd:string`)), and IL(l) is in ICEXT(I(`rdf:langTag`)).

Other Datatypes

In principle, any datatyped literal encodes a resource with intrinsic meaning, and this meaning can be formally specified.

It is therefore generally possible to define semantic extensions for values of any formally defined datatype, entailing relationships between literal values and their constituent parts.

For example, a literal xsd:date value:

SSS PPP "YYYY-MM-DD"^^xsd:date .

implies a resource with one year, one month and one day (here expressed using the [[[OWL-TIME]]]):

SSS PPP _:dt .
_:dt a xsd:date .
_:dt time:year "YYYY"^^xsd:gYear .
_:dt time:month "--MM"^^xsd:gMonth .
_:dt time:day "---DD"^^xsd:gDay .

Defining semantic conditions for such structures, and the formal properties thereof, is beyond the scope of this document (and outside the purview of RDF for any datatype defined by other specifications). A possible SPARQL-based expansion form is proposed in Appendix [[[#sparql-expansion-datetime]]].
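As an informal illustration of the decomposition shown above, the following Python sketch splits an xsd:date lexical form into its year, month and day components. It is deliberately simplified and is not a semantic condition: it accepts only four-digit, non-negative years, does not range-check the month or day, and silently drops any timezone. The function name and the (lexical form, datatype) pair representation are assumptions.

```python
import re

# Simplified lexical pattern: YYYY-MM-DD with an optional timezone that is ignored.
_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})(Z|[+-]\d{2}:\d{2})?")


def describe_date(lexical, node="_:dt"):
    """Return triples describing the components of an xsd:date lexical form."""
    match = _DATE.fullmatch(lexical)
    if match is None:
        raise ValueError(f"not a supported xsd:date lexical form: {lexical!r}")
    year, month, day, _timezone = match.groups()
    return {
        (node, "rdf:type", "xsd:date"),
        (node, "time:year", (year, "xsd:gYear")),
        (node, "time:month", (f"--{month}", "xsd:gMonth")),
        (node, "time:day", (f"---{day}", "xsd:gDay")),
    }
```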

Define a general semantic condition for rdfs:Literals entailing a rdf:lexicalForm relation to its lexical value? Cf. L2V(d)(v).

OWL-based Interoperability Patterns

Using the extended interpretations defined above results in a graph which can be further reasoned over through the semantic extension of OWL. Axioms can be defined to describe restriction subclasses of rdfs:Proposition based on, for example, a particular predicate, and to define sub-property chain axioms of relationships that hold between entities in more complex relationships. This allows for annotations of direct, more abstract statements to entail underlying, more granular forms of association classes.

The patterns below utilize owl:hasSelf to define a rolification of a class, meaning a restriction pattern for the relationship from an individual to itself, which is equivalent to a specified class. This restriction implies that an instance of such a class has this rolified relationship to itself (and vice versa). That enables the use of the rolification property in sub-property chain axioms, functioning as a self-relation chain component matching the type of the resource.
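The interplay of rolification and sub-property chains can be illustrated with a toy forward-chaining sketch. It models only one direction of the owl:hasSelf restriction (class membership implies the self-relation) and is not an OWL reasoner. All names in the usage note (`ex:Usage`, `ex:isUsage`, `ex:about`, `ex:usageAbout`) are hypothetical and do not come from any ontology.

```python
def rolify(graph, cls, self_property):
    """For each `x rdf:type cls`, add the self-relation `x self_property x`."""
    loops = {(s, self_property, s) for s, p, o in graph if p == "rdf:type" and o == cls}
    return graph | loops


def apply_chain(graph, chain, super_property):
    """Add `a super_property z` wherever the properties in `chain` lead from a to z."""
    nodes = {s for s, _, _ in graph} | {o for _, _, o in graph}
    # Start from every node paired with itself, then follow one chain step at a time.
    pairs = {(n, n) for n in nodes}
    for prop in chain:
        pairs = {(a, o) for a, b in pairs for s, p, o in graph if p == prop and s == b}
    return graph | {(a, super_property, z) for a, z in pairs}
```

With a graph containing `ex:u rdf:type ex:Usage`, `ex:u ex:about ex:e` and `ex:x ex:about ex:y`, calling `rolify(graph, "ex:Usage", "ex:isUsage")` and then `apply_chain(..., ["ex:isUsage", "ex:about"], "ex:usageAbout")` adds `ex:u ex:usageAbout ex:e` but nothing for `ex:x`, because the self-relation restricts the chain to resources of the rolified class.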

Interoperability Patterns Using Reified Propositions

The following patterns rely on OWL 2 and extended interpretation of propositions to ensure that simplified descriptions leveraging reifying triples and/or triple annotations can be interoperably interpreted as classical, more complex forms of descriptions.

Classic RDF Reification as Reifiers

Given these axioms:

this graph:

entails this graph:

PROV-O Qualifications as Reifiers

In the PROV Ontology ([[PROV-O]]) several association classes have been defined as qualified counterparts for a number of simple relationships. In this section, two examples are examined (borrowed from examples 8 and 9 of [[PROV-O]]). The following prefix is assumed in the subsequent examples:

PREFIX prov: <http://www.w3.org/ns/prov#>

These two simple relationships:

have the following companion qualified forms:

Using OWL, there is a way to entail the simple relationships from the qualified forms:

With the extended interpretation of propositions, it is possible to conversely entail the qualified forms from annotated versions of the simple relationships:

This can be done by extending the ontology itself, by defining property chains between reifiers of type prov:Usage and the corresponding subject and object of a proposition with the prov:used predicate:

and similarly for reifiers of type prov:Association of triples with the prov:wasAssociatedWith predicate:

Appendices

Relationship to Basic-encoding

The semantic conditions specified here entail structures that are based on, but formally different from (or complementary to), the interoperability patterns between RDF 1.2 Full, RDF 1.2 Basic, and RDF 1.1 defined by the algorithms of basic encoding and decoding.

The inherent structures of propositions and datatype properties can be queried, either using SPARQL 1.2, or using SPARQL 1.1 along with basic encoding under simple entailment. But it is not possible to reason over them using RDF semantics alone. This prevents the definition of OWL-based axioms relating to this inherent structure. By defining a set of semantic extensions, this document bridges that gap.

These two methods are complementary ways of working with RDF 1.2 data. Basic-encoding is for RDF 1.1 or 1.2 Basic systems that lack support for triple terms. Extended interpretations of formal structures are for systems leveraging semantic extensions based on RDF 1.2, such as OWL 2, in order to be able to reason over the intrinsic properties of propositions and literal values.

Refer to the specification of basic-encoding. In this document, Appendix [[[#section-basicenc-sparql]]] defines SPARQL-based versions of basic encoding and decoding for propositions and directional language strings.

Conditions for Basic-encoding

Basic-encoding and subsequent decoding are safe only under simple entailment.

Since literal values and propositions are resources, RDF entailment implies they have an rdf:type (of rdfs:Literal and rdfs:Proposition, respectively). Any encoding and decoding must take place on a graph where no entailments have been manifested.

While the RDF abstract syntax prevents this by disallowing the special encoding terms denoting these kinds of resources as subjects, the same is not possible if facts about them are encoded as triples with IRIs or blank nodes as subject identifiers. To avoid the direct logical description of these resources, basic encoding uses underlying base types for describing propositions and directional language strings, simply encoding their forms.

Note that due to the rules of basic-encoding, it is impossible to basic-decode a graph representing the formal entailments about propositions (since their type is rdfs:Proposition). This is by design, to avoid basic-encoding data using anything beyond simple entailment. Entailed facts should not be put back into an RDF Source. They are simply true, following from the known axioms and conditions of a given entailment regime.

Upgrading to Structural Interpretations

Since applications may start with a dependency on deployed structures, and rely on basic-encoding, having a formal way of instead entailing these facts ensures a seamless upgrade path, without requiring redefined domain logic or queries rewritten for new structures.

Similarly, a basic-encoded graph will work mostly the same when reasoned over using OWL. One crucial difference is that there will be no triple terms in a basic-encoded graph, meaning that there is no owl:sameAs relationship to such terms from the encoded propositions. This does not affect the meaning, but it matters for implementations that also rely on the graph structure itself.

Basic-encoding using SPARQL

The following subsections define SPARQL updates for basic-encoding and -decoding. They have to be repeated until the triple count is stable: iterate while the triple count increases (for encoding) or decreases (for decoding), and halt when the triple count equals that of the previous iteration.
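The repeat-until-stable loop described above can be sketched independently of any SPARQL engine. In this illustration the `update` argument is a hypothetical callable standing in for one execution of an encoding or decoding update, and `max_rounds` is an added safety limit that is not part of the procedure described here.

```python
def run_until_stable(graph, update, max_rounds=1000):
    """Apply `update` repeatedly until the triple count equals the previous one."""
    current = set(graph)
    for _ in range(max_rounds):
        following = update(current)
        # Halt when this round left the triple count unchanged.
        if len(following) == len(current):
            return following
        current = following
    raise RuntimeError("triple count did not stabilise within max_rounds")
```

The same driver serves both directions, since it only compares successive counts and does not care whether they grow (encoding) or shrink (decoding).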

Since datatyped literals and language-tagged strings have been part of RDF since version 1.0, there is no need to basic-encode such structures.
