Capturing Metadata and Relationships with RDF
The RDF Model for Application Graphs
RDF provides a machine-readable framework for specifying descriptions of resources; in digital library terminology, these descriptions are metadata. This metadata can be used to organize and manage information.
An RDF statement is an arc with a subject that denotes a resource, a predicate that describes characteristics of the resource, and an object, which can be either a URI or a literal value.
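As a minimal sketch of the statement structure described above, the following Python models a statement as a subject, a predicate, and an object that is either a URI or a literal. The class names `URI`, `Literal`, and `Statement` and the `example.org` identifiers are assumptions made for illustration; they are not taken from any RDF library.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    """A resource identifier (hypothetical helper, not a library type)."""
    value: str


@dataclass(frozen=True)
class Literal:
    """A plain literal value, stored here as a string."""
    value: str


@dataclass(frozen=True)
class Statement:
    subject: URI              # the resource being described
    predicate: URI            # the characteristic being stated
    obj: Union[URI, Literal]  # another resource or a literal value


# Illustrative identifiers under example.org; all of them are assumptions.
doc = URI("http://example.org/doc/1")
statements = [
    Statement(doc, URI("http://example.org/terms/title"), Literal("Annual Report")),
    Statement(doc, URI("http://example.org/terms/creator"),
              URI("http://example.org/person/alice")),
]

for s in statements:
    kind = "resource" if isinstance(s.obj, URI) else "literal"
    print(f"{s.subject.value} --{s.predicate.value}--> {s.obj.value} ({kind})")
```

The first statement ends in a literal and the second in another resource, which is the distinction the object position allows.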
The Graph Model
When you design your application graph model, you want it to be expressive of your domain and easy to query on behalf of your most important use cases. You also want to make it easy to evolve the model and queries on a case-by-case basis, as new use cases emerge and your knowledge of the domain grows.
A central component of this goal is ensuring that your application graph data model reflects the underlying semantics and structures of your queries. Generally, this involves using graph primitives (vertices, edges, labels and properties in the case of a property graph, and subject-predicate-object triples in the case of RDF) to represent the graph patterns expressed in your queries.
This often results in a hub-and-spoke structure, where a central vertex representing a fact or event is connected to several neighbouring vertices that provide context about that fact or event. For example, a Purchase hub vertex would be connected to the User who made the purchase and the several Product items purchased.
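The Purchase example above can be sketched with plain tuples, assuming a small in-memory list of triples. The predicate names `madeBy` and `includes`, the `example.org` identifiers, and the helper functions `match` and `products_bought_by` are hypothetical; a real application would typically run an equivalent query against a graph store.

```python
# Each triple is (subject, predicate, object). All names are illustrative.
EX = "http://example.org/"

triples = [
    (EX + "purchase/1001", EX + "madeBy", EX + "user/alice"),
    (EX + "purchase/1001", EX + "includes", EX + "product/kettle"),
    (EX + "purchase/1001", EX + "includes", EX + "product/teapot"),
    (EX + "purchase/1002", EX + "madeBy", EX + "user/bob"),
    (EX + "purchase/1002", EX + "includes", EX + "product/kettle"),
]


def match(pattern, data):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in data
        if all(p is None or p == v for p, v in zip(pattern, t))
    ]


def products_bought_by(user, data):
    """Walk from a user to its Purchase hubs, then out to the Product spokes."""
    purchases = [s for s, _, _ in match((None, EX + "madeBy", user), data)]
    products = []
    for purchase in purchases:
        products += [o for _, _, o in match((purchase, EX + "includes", None), data)]
    return products


alice_products = products_bought_by(EX + "user/alice", triples)
print(alice_products)
```

Because the Purchase vertex sits at the hub, the query reaches both the User and the Product context in one hop from it, which is the property the hub-and-spoke shape is meant to provide.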
The Semantic Model
The goal of a semantic model is to capture meaning and relationships within a data structure. This contrasts with a traditional data model that is based upon tables, rows and columns.
In a business context, a semantic layer acts as a map between an organization’s underlying data and the business terms its users work with. Typically, the semantic layer enables users to use a common business language to answer questions and solve problems.
Semantic models can be very simple or complex, depending on the needs of an organization and its users. For example, a basic online thesaurus portal can simply organize synonyms and related terms into groups.
A more advanced semantic model can also connect classes and their properties to form an ontology. This can then be used to identify and manage relationships between objects. For example, an ontology could be used to automatically link a staff member’s age and date of birth to their occupation, thereby helping recruitment teams to identify relevant applicants and hire the right people.
The Data Model
The RDF data model is based on a simple, flexible, and extensible information architecture. Its XML serialization syntax (RDF/XML) is used to store and communicate metadata, and it imposes structural constraints on XML to support the consistent representation of semantics.
Its basic structure is a directed graph of triples. Each triple consists of a subject, a predicate, and an object. The subject is identified by a Uniform Resource Identifier (URI), the predicate by a URI that names a property type, and the object by either a URI or a literal value.
The data model also provides a mechanism to restrict the meaning of a literal, so that it can be used for different purposes in different domains. It supports a number of existing human-readable and machine-readable vocabularies, and it allows metadata to be defined in multiple languages. Finally, it introduces a set of statement-level annotations, which can represent recurrent events, data provenance, and other attribute-rich data more efficiently than traditional key-value pairs.
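One way to picture these features is a literal that carries an optional datatype or language tag, plus a side table of annotations keyed by statement. The sketch below assumes that the "mechanism to restrict the meaning of a literal" refers to datatypes such as those in the XML Schema namespace, and it uses a plain Python dictionary as a stand-in for statement-level annotations. The `Literal` class, the `example.org` identifiers, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# XML Schema datatype namespace, commonly used for typed literals.
XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal with an optional datatype URI or language tag."""
    lexical: str
    datatype: Optional[str] = None
    lang: Optional[str] = None

    def to_ntriples(self) -> str:
        # Minimal escaping only (backslash and double quote); sketch, not a full writer.
        text = self.lexical.replace("\\", "\\\\").replace('"', '\\"')
        if self.lang:
            return f'"{text}"@{self.lang}'
        if self.datatype:
            return f'"{text}"^^<{self.datatype}>'
        return f'"{text}"'


age = Literal("42", datatype=XSD + "integer")  # typed literal
label_en = Literal("Tea", lang="en")           # language-tagged literals
label_de = Literal("Tee", lang="de")

# A statement plus annotations about it, kept in a dictionary keyed by the
# statement itself. This is an illustrative stand-in, not RDF syntax.
statement = ("http://example.org/person/alice", "http://example.org/terms/age", age)
annotations = {
    statement: {"source": "http://example.org/dataset/hr"},
}

for lit in (age, label_en, label_de):
    print(lit.to_ntriples())
print(annotations[statement]["source"])
```

The datatype tells a consumer to read "42" as an integer rather than as text, and the language tags let the same label be stated once per language.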
The Usage Model
RDF describes data as a labeled, directed graph: nodes and arcs carry labels, and each arc points in one direction only, from subject to object. The graph can be written down in several syntaxes, including one based on XML. The identifiers used to refer to each item of metadata in the graph are called Uniform Resource Identifiers (URIs).
A resource in an RDF model is described by three pieces of information, called a triple: a subject, a predicate and an object. The subject is what the description is about, the predicate names a characteristic of the subject, and the object is the value or resource that the subject is related to through that predicate.
An RDF description of a person may include, for example, the person’s age and birthday. Properties like these are typically taken from a standard vocabulary for describing people and their relationships, which is defined by a community of people who want to describe such metadata.
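The description of a person can be sketched as a handful of triples that share one subject. The text does not name the vocabulary, so this example assumes FOAF (Friend of a Friend), one community vocabulary that defines `age` and `birthday` properties. The person identifier, the sample values, and the helper functions `to_ntriples` and `describe` are hypothetical.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

alice = "http://example.org/person/alice"  # illustrative identifier

# Each entry is (subject, predicate, object, object_is_uri).
triples = [
    (alice, RDF_TYPE, FOAF + "Person", True),
    (alice, FOAF + "name", "Alice Example", False),
    (alice, FOAF + "age", "34", False),
    (alice, FOAF + "birthday", "05-17", False),
]


def to_ntriples(subject, predicate, obj, obj_is_uri):
    """Render one triple as an N-Triples style line (no escaping; sketch only)."""
    rendered = f"<{obj}>" if obj_is_uri else f'"{obj}"'
    return f"<{subject}> <{predicate}> {rendered} ."


def describe(subject, data):
    """Collect the predicate/object pairs for one subject.

    A repeated predicate would overwrite the earlier value; that is
    acceptable for this sketch but not for general RDF data.
    """
    return {p: o for s, p, o, _ in data if s == subject}


for t in triples:
    print(to_ntriples(*t))

description = describe(alice, triples)
print(description[FOAF + "age"], description[FOAF + "birthday"])
```

Because the predicates come from a shared vocabulary, any consumer that knows that vocabulary can interpret the age and birthday values without application-specific agreements.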
…