Galaxy Consulting Blog: Dublin Core Metadata

Friday, July 31, 2015

Dublin Core Metadata Applications - Web Ontology Language (OWL)

In my last post, I described one of the most used applications of Dublin Core Metadata - RDF. In today's post, I will describe the second most used application of Dublin Core Metadata - Web Ontology Language (OWL).

The Web Ontology Language (OWL) is a family of knowledge representation languages for authoring ontologies. An ontology is a formal way to describe taxonomies and classification networks, essentially defining the structure of knowledge for various domains: the nouns represent classes of objects and the verbs represent relations between the objects.

An ontology defines the terms used to describe and represent an area of knowledge. Ontologies are used by people, databases, and applications that need to share domain information (a domain is just a specific subject area or area of knowledge, like medicine, tool manufacturing, real estate, automobile repair, financial management, etc.). Ontologies include computer-usable definitions of basic concepts in the domain and the relationships among them. They encode knowledge in a domain and also knowledge that spans domains. In this way, they make that knowledge reusable.

Ontologies resemble class hierarchies. They are meant to represent information on the Internet and are expected to evolve almost constantly. Ontologies are typically very flexible because they come from all sorts of data sources.

The OWL languages are characterized by formal semantics. They are built upon the Resource Description Framework (RDF), a W3C standard that I described in my previous post.

The data described by an ontology in the OWL family is interpreted as a set of "individuals" and a set of "property assertions" which relate these individuals to each other. An ontology consists of a set of axioms which place constraints on sets of individuals (called "classes") and the types of relationships permitted between them. These axioms provide semantics by allowing systems to infer additional information based on the data explicitly provided.

OWL ontologies can import other ontologies, adding information from the imported ontology to the current ontology.

For example: an ontology describing families might include axioms stating that a "hasMother" property is only present between two individuals when "hasParent" is also present, and individuals of class "HasTypeOBlood" are never related via "hasParent" to members of "HasTypeABBlood" class. If it is stated that the individual Harriet is related via "hasMother" to the individual Sue, and that Harriet is a member of the "HasTypeOBlood" class, then it can be inferred that Sue is not a member of "HasTypeABBlood".
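
The inference in this example can be traced by hand in a few lines of plain Python. This is only a sketch under stated assumptions: the dictionaries and the helper functions `expand_properties` and `excluded_classes` are hypothetical stand-ins for what an OWL reasoner does in a far more general way, and they are not part of any OWL library.

```python
# Hypothetical, hand-rolled illustration of the family example.

# Axiom 1: every hasMother assertion implies a hasParent assertion.
SUBPROPERTY_OF = {"hasMother": "hasParent"}

# Axiom 2: members of HasTypeOBlood are never related via hasParent
# to members of HasTypeABBlood.
FORBIDDEN = {("HasTypeOBlood", "hasParent"): "HasTypeABBlood"}

# Explicitly stated facts.
property_assertions = {("Harriet", "hasMother", "Sue")}
class_assertions = {("Harriet", "HasTypeOBlood")}


def expand_properties(assertions):
    """Add the assertions implied by the subproperty axiom (one level only)."""
    inferred = set(assertions)
    for subject, prop, obj in assertions:
        parent_prop = SUBPROPERTY_OF.get(prop)
        if parent_prop is not None:
            inferred.add((subject, parent_prop, obj))
    return inferred


def excluded_classes(individual):
    """Return the classes the individual provably cannot belong to."""
    excluded = set()
    for subject, prop, obj in expand_properties(property_assertions):
        if obj != individual:
            continue
        for member, cls in class_assertions:
            if member == subject and (cls, prop) in FORBIDDEN:
                excluded.add(FORBIDDEN[(cls, prop)])
    return excluded


print(excluded_classes("Sue"))
```

The first step turns the stated "hasMother" fact into a "hasParent" fact, and only then does the blood type axiom apply to Sue.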

The W3C-endorsed OWL specification includes the definition of three variants of OWL, with different levels of expressiveness. These are OWL Lite, OWL DL and OWL Full.

OWL Lite

OWL Lite was originally intended to support those users primarily needing a classification hierarchy and simple constraints. It is not widely used.

OWL DL

OWL DL includes all OWL language constructs, but they can be used only under certain restrictions (for example, number restrictions may not be placed upon properties which are declared to be transitive). OWL DL is so named due to its correspondence with description logic, a field of research that has studied the logics that form the formal foundation of OWL.

OWL Full

OWL Full is based on a different semantics from OWL Lite or OWL DL, and was designed to preserve some compatibility with RDF Schema. For example, in OWL Full a class can be treated simultaneously as a collection of individuals and as an individual in its own right; this is not permitted in OWL DL. OWL Full allows an ontology to augment the meaning of the pre-defined (RDF or OWL) vocabulary.

OWL Full is intended to be compatible with RDF Schema (RDFS), and to be capable of augmenting the meanings of existing Resource Description Framework (RDF) vocabulary. Its semantics provide the meaning of RDF and RDFS vocabulary, so the meaning of OWL Full ontologies is defined by extension of the RDFS meaning, and OWL Full is a semantic extension of RDF.

Every OWL ontology must be identified by a URI. For example: Ontology(). The languages in the OWL family use the open world assumption: if a statement cannot be proven to be true with current knowledge, we cannot conclude that the statement is false.
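
To make the open world assumption concrete, here is a minimal sketch that contrasts it with a closed world lookup. The `Answer` enum, the two fact sets and both `ask_*` functions are hypothetical names made up for this illustration; they do not come from any OWL tool.

```python
from enum import Enum


class Answer(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


# Hypothetical knowledge: statements known to hold and statements known not to hold.
known_true = {("Harriet", "hasMother", "Sue")}
known_false = {("Harriet", "hasMother", "Harriet")}


def ask_closed_world(statement):
    """Closed world: anything not known to be true is treated as false."""
    return Answer.TRUE if statement in known_true else Answer.FALSE


def ask_open_world(statement):
    """Open world: a missing statement is unknown, not false."""
    if statement in known_true:
        return Answer.TRUE
    if statement in known_false:
        return Answer.FALSE
    return Answer.UNKNOWN


query = ("Sue", "hasMother", "Mary")
print(ask_closed_world(query))
print(ask_open_world(query))
```

A typical database query behaves like `ask_closed_world`, while an OWL reasoner behaves more like `ask_open_world`: nothing is recorded about Sue's mother, so the question stays open.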

Languages in the OWL family are capable of creating classes and properties, defining instances, and specifying operations on them.

Instances

An instance is an object. It corresponds to a description logic individual.

Classes

A class is a collection of objects. It corresponds to a description logic (DL) concept. A class may contain individuals, instances of the class. A class may have any number of instances. An instance may belong to none, one or more classes. A class may be a subclass of another, inheriting characteristics from its parent superclass.

Classes and their members can be defined in OWL either by extension or by intension. An individual can be explicitly assigned to a class by a class assertion; for example, we can add a statement that Queen Elizabeth is a human (an instance of the human class). Alternatively, membership can follow from a class expression, such as a ClassExpression statement that every instance of the human class who has the value female for a given property (for example, a sex property) is an instance of the woman class.
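
The difference between the two styles of definition can be sketched in Python. Everything here is an assumption made for illustration: the property name `sex`, the second individual and the function `woman` are hypothetical, and a real ontology would express this with OWL axioms, not code.

```python
# Hypothetical individuals and their property values.
individuals = {
    "QueenElizabeth": {"sex": "female"},
    "PrinceCharles": {"sex": "male"},
}

# Definition by extension: the members are listed explicitly,
# which corresponds to one class assertion per individual.
human = {"QueenElizabeth", "PrinceCharles"}


def woman():
    """Definition by intension: every human whose sex value is female."""
    return {name for name in human if individuals[name].get("sex") == "female"}


print(woman())
```

Adding a new female human to `human` and `individuals` makes her a member of `woman()` automatically, which is the practical benefit of defining a class by a condition.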

Properties

A property is a directed binary relation that specifies class characteristics. It corresponds to a description logic role. Properties are attributes of instances; they sometimes act as data values and sometimes link to other instances. Properties may have logical characteristics such as being transitive, symmetric, inverse and functional. Properties may also have domains and ranges.
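
Treating a property as a set of (subject, object) pairs makes these logical characteristics easy to see. The sketch below is an illustration only: the function names, the properties `has_ancestor` and `has_sibling`, and the individuals Mary, Tom and Ann are hypothetical. Note that `is_functional` is a simple check, whereas an OWL reasoner would use a functional property to draw conclusions.

```python
def inverse(pairs):
    """Inverse property: swap subject and object in every pair."""
    return {(obj, subj) for subj, obj in pairs}


def symmetric_closure(pairs):
    """Symmetric property: whenever (a, b) holds, (b, a) holds too."""
    return set(pairs) | inverse(pairs)


def transitive_closure(pairs):
    """Transitive property: (a, b) and (b, c) imply (a, c), repeated to a fixpoint."""
    closure = set(pairs)
    while True:
        implied = {(a, d) for a, b in closure for c, d in closure if b == c}
        if implied <= closure:
            return closure
        closure |= implied


def is_functional(pairs):
    """Functional property: each subject has at most one object."""
    subjects = [subj for subj, _ in set(pairs)]
    return len(subjects) == len(set(subjects))


has_ancestor = {("Harriet", "Sue"), ("Sue", "Mary")}
has_sibling = {("Tom", "Harriet")}

print(sorted(transitive_closure(has_ancestor)))
print(sorted(symmetric_closure(has_sibling)))
print(sorted(inverse(has_ancestor)))
print(is_functional({("Harriet", "Sue"), ("Harriet", "Ann")}))
```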

Datatype Properties

Datatype properties are relations between instances of classes and RDF literals or XML Schema datatypes. For example, modelName (String datatype) is a property of the Manufacturer class. Datatype properties are formulated using the owl:DatatypeProperty type.

Object Properties

Object properties are relations between instances of two classes. For example, ownedBy may be an object type property of the Vehicle class and may have a range which is the class Person. They are formulated using owl:ObjectProperty.

Operators

Languages in the OWL family support various operations on classes such as union, intersection and complement. They also allow class enumeration, cardinality, and disjointness.
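
If classes are pictured as sets of individuals, these operators map closely onto Python's set operations. This is a rough sketch with hypothetical individuals and class names. One simplification is worth stating loudly: the complement is taken against a fixed `universe`, which is a closed world shortcut that OWL itself does not make.

```python
# Hypothetical individuals; "universe" stands in for all known individuals.
universe = {"Harriet", "Sue", "Tom", "Rex"}
person = {"Harriet", "Sue", "Tom"}
female = {"Harriet", "Sue"}
dog = {"Rex"}

woman = person & female                           # intersection
person_or_dog = person | dog                      # union
not_female = universe - female                    # complement (closed world shortcut)
blood_type = frozenset({"A", "B", "AB", "O"})     # enumeration: members listed one by one
person_and_dog_disjoint = person.isdisjoint(dog)  # disjointness

# Cardinality: check that nobody has more than `limit` values for a property.
has_mother = {"Harriet": {"Sue"}, "Tom": {"Sue"}}


def satisfies_max_cardinality(prop, limit):
    return all(len(values) <= limit for values in prop.values())


print(sorted(woman))
print(sorted(not_female))
print(person_and_dog_disjoint)
print(satisfies_max_cardinality(has_mother, 1))
```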

Metaclasses

Metaclasses are classes of classes. They are allowed in OWL Full or with a feature called class/instance punning.

Syntax

The OWL family of languages supports a variety of syntaxes. It is useful to distinguish high level syntaxes aimed at specification from exchange syntaxes more suitable for general use.

High Level

These are close to the ontology structure of languages in the OWL family.

OWL Abstract Syntax

This high level syntax is used to specify the OWL ontology structure and semantics.

The OWL abstract syntax presents an ontology as a sequence of annotations, axioms and facts. Annotations carry machine and human oriented metadata. Information about the classes, properties and individuals that compose the ontology is contained in axioms and facts only. Each class, property and individual is either anonymous or identified by a URI reference. Facts state data either about an individual or about a pair of individual identifiers (that the objects identified are distinct or the same). Axioms specify the characteristics of classes and properties.

OWL2 Functional Syntax

This syntax closely follows the structure of an OWL2 ontology. It is used by OWL2 to specify semantics, mappings to exchange syntaxes and profiles.

OWL2 XML Syntax

OWL2 specifies an XML serialization that closely models the structure of an OWL2 ontology.

Manchester Syntax

The Manchester Syntax is a compact, human readable syntax with a style close to frame languages. Variations are available for OWL and OWL2. Not all OWL and OWL2 ontologies can be expressed in this syntax.

OWL is playing an important role in an increasing number and range of applications, and is the focus of research into tools, reasoning techniques, formal foundations and language extensions.

Wednesday, July 8, 2015

Dublin Core Metadata Applications - RDF

The Dublin Core Schema is a small set of vocabulary terms that can be used to describe different resources.

Dublin Core Metadata may be used for multiple purposes, from simple resource description, to combining metadata vocabularies of different metadata standards, to providing inter-operability for metadata vocabularies in the Linked data cloud and Semantic web implementations.

The most used applications of Dublin Core Metadata are RDF and OWL. I will describe OWL in my next post.

RDF stands for Resource Description Framework. It is a standard model for data interchange on the Web. RDF has features that facilitate data merging even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed.

RDF extends the linking structure of the Web to use URIs to name the relationship between things as well as the two ends of the link (this is usually referred to as a “triple”). Using this simple model, it allows structured and semi-structured data to be mixed, exposed, and shared across different applications.

This linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, represented by the graph nodes. This graph view is the easiest possible mental model for RDF and is often used in easy-to-understand visual explanations.
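
The triple model and the graph view can be sketched with nothing more than tuples and a dictionary. The example.org URIs and the helper `build_graph` are hypothetical; the two predicate URIs use the Dublin Core elements and FOAF namespaces. This is an illustration of the data model, not an RDF library.

```python
from collections import defaultdict

# Each triple is (subject, predicate, object). The predicate URI names the link.
triples = [
    ("http://example.org/doc1",
     "http://purl.org/dc/elements/1.1/creator",
     "http://example.org/people/alice"),
    ("http://example.org/doc1",
     "http://purl.org/dc/elements/1.1/title",
     "Dublin Core Metadata"),
    ("http://example.org/people/alice",
     "http://xmlns.com/foaf/0.1/name",
     "Alice"),
]


def build_graph(triples):
    """Build a directed, labeled graph: node -> list of (edge label, target)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


graph = build_graph(triples)
for predicate, target in graph["http://example.org/doc1"]:
    print(predicate, "->", target)
```

Because every statement is a self-contained triple, merging two data sets amounts to taking the union of their triples, which is part of why RDF copes well with differing schemas.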

RDF Schema, or RDFS, is a set of classes with certain properties that uses the RDF extensible knowledge representation data model. It provides basic elements for the description of ontologies, otherwise called RDF vocabularies, intended to structure RDF resources. These resources can be saved in a triplestore and retrieved with the query language SPARQL.
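
To give a feel for how a triplestore answers a query, here is a toy in-memory store with wildcard pattern matching. The `TripleStore` class and the `ex:` resources are hypothetical and written only for this illustration; real triplestores and SPARQL engines are far more capable.

```python
class TripleStore:
    """A toy in-memory triplestore (illustration only)."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return the triples that fit the pattern; None acts as a wildcard."""
        return sorted(
            triple for triple in self._triples
            if (subject is None or triple[0] == subject)
            and (predicate is None or triple[1] == predicate)
            and (obj is None or triple[2] == obj)
        )


store = TripleStore()
store.add("ex:John", "rdf:type", "foaf:Person")
store.add("ex:Jane", "rdf:type", "foaf:Person")
store.add("ex:John", "ex:employer", "ex:Acme")

# Roughly comparable to the SPARQL query:
#   SELECT ?s WHERE { ?s rdf:type foaf:Person }
people = [s for s, _, _ in store.match(predicate="rdf:type", obj="foaf:Person")]
print(people)
```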

The first version of RDFS was published by the World Wide Web Consortium (W3C) in April 1998, and the final W3C recommendation was released in February 2004. Many RDFS components are included in the more expressive Web Ontology Language (OWL).

Main RDFS constructs

RDFS constructs are the RDFS classes, associated properties, and utility properties built on the limited vocabulary of RDF.

Classes

rdfs:Resource is the class of everything. All things described by RDF are resources.

rdfs:Class declares a resource as a class for other resources.

A typical example of a Class is "Person" in the Friend of a Friend (FOAF) vocabulary. An instance of "Person" is a resource that is linked to the class "Person" using the type property, such as in the following formal expression of the natural language sentence: "John is a Person".

example: John rdf:type foaf:Person

The other classes described by the RDF and RDFS specifications are:

rdfs:Literal – literal values such as strings and integers. Property values such as textual strings are examples of literals. Literals may be plain or typed.
rdfs:Datatype – the class of datatypes. rdfs:Datatype is both an instance of and a subclass of rdfs:Class. Each instance of rdfs:Datatype is a subclass of rdfs:Literal.
rdf:XMLLiteral – the class of XML literal values. rdf:XMLLiteral is an instance of rdfs:Datatype (and thus a subclass of rdfs:Literal).
rdf:Property – the class of properties.

Properties

Properties are instances of the class Property and describe a relation between subject resources and object resources.

For example, the following declarations are used to express that the property "employer" relates a subject, which is of type "Person", to an object, which is of type "Organization":

ex:employer rdfs:domain foaf:Person

ex:employer rdfs:range foaf:Organization
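
What these two declarations buy us is inference: once a resource is used with "employer", its type follows. The sketch below applies the domain and range rules by hand. The resources `ex:John` and `ex:Acme` and the function `infer_types` are hypothetical, and the prefixed names are kept as plain strings for brevity.

```python
# Schema statements from the example above.
schema = {
    ("ex:employer", "rdfs:domain", "foaf:Person"),
    ("ex:employer", "rdfs:range", "foaf:Organization"),
}

# A single hypothetical data statement that uses the property.
data = {("ex:John", "ex:employer", "ex:Acme")}


def infer_types(schema, data):
    """Derive rdf:type statements from rdfs:domain and rdfs:range declarations."""
    inferred = set()
    for subject, predicate, obj in data:
        for prop, relation, cls in schema:
            if prop != predicate:
                continue
            if relation == "rdfs:domain":
                # The subject of the property belongs to the domain class.
                inferred.add((subject, "rdf:type", cls))
            elif relation == "rdfs:range":
                # The object of the property belongs to the range class.
                inferred.add((obj, "rdf:type", cls))
    return inferred


for triple in sorted(infer_types(schema, data)):
    print(triple)
```

Note that domain and range do not reject data here. They add knowledge: nothing said John was a person until the "employer" statement was made.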

Hierarchies of classes support inheritance of a property domain and range from a class to its sub-classes. The following properties are also defined:

rdfs:subPropertyOf is an instance of rdf:Property that is used to state that all resources related by one property are also related by another.
rdfs:label is an instance of rdf:Property that may be used to provide a human-readable version of a resource's name.
rdfs:comment is an instance of rdf:Property that may be used to provide a human-readable description of a resource.
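
A small sketch shows how a sub-property declaration propagates statements and how a label can be used for display. The properties `ex:hasMother`, `ex:hasParent` and `ex:hasRelative`, the label text and both helper functions are hypothetical examples, not part of RDFS itself.

```python
# Hypothetical sub-property declarations: (sub-property, super-property).
SUB_PROPERTY_OF = {
    ("ex:hasMother", "ex:hasParent"),
    ("ex:hasParent", "ex:hasRelative"),
}

# Hypothetical human-readable labels for some resources.
LABELS = {"ex:hasMother": "has mother"}


def entail_subproperties(triples):
    """Repeat the sub-property rule until no new statements appear."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for subject, predicate, obj in list(inferred):
            for sub, sup in SUB_PROPERTY_OF:
                if predicate == sub and (subject, sup, obj) not in inferred:
                    inferred.add((subject, sup, obj))
                    changed = True
    return inferred


def display_name(resource):
    """Prefer the label when one exists, otherwise fall back to the identifier."""
    return LABELS.get(resource, resource)


result = entail_subproperties({("ex:Harriet", "ex:hasMother", "ex:Sue")})
for subject, predicate, obj in sorted(result):
    print(subject, display_name(predicate), obj)
```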

Utility properties

rdfs:seeAlso is an instance of rdf:Property that is used to indicate a resource that might provide additional information about the subject resource.

rdfs:isDefinedBy is an instance of rdf:Property that is used to indicate a resource defining the subject resource. This property may be used to indicate an RDF vocabulary in which a resource is described.

Wednesday, August 13, 2014

Dublin Core Metadata

The word "metadata" means "data about data". Metadata describes a context for objects of interest such as document files, images, audio and video files. It can also be called resource description. As a tradition, resource description dates back to the earliest archives and library catalogs. The modern "metadata" field that gave rise to Dublin Core and other recent standards emerged with the Web revolution of the mid-1990s.

The Dublin Core Schema is a small set of vocabulary terms that can be used to describe different resources.

"Dublin" refers to Dublin, Ohio, USA where the schema originated during the 1995 invitational OCLC/NCSA Metadata Workshop hosted by the Online Computer Library Center (OCLC), a library consortium based in Dublin, and the National Center for Supercomputing Applications (NCSA). "Core" refers to the metadata terms as broad and generic being usable for describing a wide range of resources. The semantics of Dublin Core were established and are maintained by an international, cross-disciplinary group of professionals from librarianship, computer science, text encoding, museums, and other related fields of scholarship and practice.

The Dublin Core Metadata Initiative (DCMI) provides an open forum for the development of inter-operable online metadata standards for a broad range of purposes and of business models. DCMI's activities include consensus-driven working groups, global conferences and workshops, standards liaison, and educational efforts to promote widespread acceptance of metadata standards and practices.

In 2008, DCMI separated from OCLC and incorporated as an independent entity. Any and all changes that are made to the Dublin Core standard are reviewed by a DCMI Usage Board within the context of a DCMI Namespace Policy. This policy describes how terms are assigned and also sets limits on the number of editorial changes allowed to the labels, definitions, and usage comments.

Levels of the Standard

The Dublin Core standard originally included two levels: Simple and Qualified. Simple Dublin Core comprised 15 elements; Qualified Dublin Core included three additional elements (Audience, Provenance and RightsHolder), as well as a group of element refinements (also called qualifiers) that could refine the semantics of the elements in ways that may be useful in resource discovery. Since 2012 the two have been incorporated into the DCMI Metadata Terms as a single set of terms using the Resource Description Framework (RDF).

The original Dublin Core Metadata Element Set, which is the Simple level, consists of 15 metadata elements:

Title

Creator

Subject

Description

Publisher

Contributor

Date

Type

Format

Identifier

Source

Language

Relation

Coverage

Rights

Each Dublin Core element is optional and may be repeated. The DCMI has established standard ways to refine elements and encourages the use of encoding and vocabulary schemes. There is no prescribed order in Dublin Core for presenting or using the elements. The Dublin Core became the ISO 15836 standard in 2003 and is used as a base-level data element set for the description of learning resources in ISO/IEC 19788-2.
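
Since every element is optional and repeatable, a Simple Dublin Core record fits naturally into a mapping from element name to a list of values. The sketch below validates such a record against the 15 elements and renders it as HTML meta tags using the common "DC." naming convention. The sample record, the author names and the functions `validate` and `to_meta_tags` are hypothetical.

```python
from html import escape

# The 15 elements of the Simple level, in lowercase.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def validate(record):
    """Reject element names outside the 15-element set; none are required."""
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Simple Dublin Core elements: {sorted(unknown)}")


def to_meta_tags(record):
    """Render a record as HTML <meta> tags, one tag per value."""
    validate(record)
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, values in record.items():
        for value in values:
            content = escape(value, quote=True)
            lines.append(f'<meta name="DC.{element}" content="{content}">')
    return lines


# Hypothetical record: "creator" is repeated, most elements are omitted.
record = {
    "title": ["Dublin Core Metadata"],
    "creator": ["A. Author", "B. Author"],
    "date": ["2014-08-13"],
}

print("\n".join(to_meta_tags(record)))
```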

Qualified Dublin Core

Subsequent to the specification of the original 15 elements, an ongoing process began to develop terms extending or refining the Dublin Core Metadata Element Set (DCMES), and additional terms were identified. Element refinements make the meaning of an element narrower or more specific. A refined element shares the meaning of the unqualified element, but with a more restricted scope.

In addition to element refinements, Qualified Dublin Core includes a set of recommended encoding schemes, designed to aid in the interpretation of an element value. These schemes include controlled vocabularies and formal notations or parsing rules.
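
Because a refined element shares the meaning of its unqualified element, a qualified record can be reduced to a simple one by mapping each refinement back to the element it refines. The sketch below shows that reduction and a format-only check in the spirit of a date encoding scheme. The mapping covers only a handful of refinements; the sample record, the function names and the simplified date pattern are assumptions made for illustration.

```python
import re

# A few element refinements and the elements they refine (not a complete list).
REFINES = {
    "created": "date",
    "issued": "date",
    "modified": "date",
    "abstract": "description",
    "tableOfContents": "description",
    "alternative": "title",
}

# Simplified date pattern: YYYY, YYYY-MM or YYYY-MM-DD.
# It checks the shape only, not whether the month or day is in range.
DATE_PATTERN = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?")


def to_simple(record):
    """Map every refinement back to its unqualified element."""
    simple = {}
    for term, values in record.items():
        base = REFINES.get(term, term)
        simple.setdefault(base, []).extend(values)
    return simple


def looks_like_date(value):
    """Return True if the value has one of the accepted date shapes."""
    return DATE_PATTERN.fullmatch(value) is not None


# Hypothetical qualified record.
qualified = {
    "title": ["Dublin Core Metadata"],
    "alternative": ["DC Metadata"],
    "created": ["2014-08-13"],
    "abstract": ["An overview of the Dublin Core schema."],
}

print(to_simple(qualified))
print(looks_like_date("2014-08-13"), looks_like_date("August 13, 2014"))
```

The reduced record loses precision (a creation date becomes just a date) but stays correct, which is what lets applications that only understand the 15 elements still use qualified metadata.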

Syntax

Syntax choices for Dublin Core metadata depend on a number of variables, and "one size fits all" forms rarely apply. When considering an appropriate syntax, it is important to note that Dublin Core concepts and semantics are designed to be syntax independent and are equally applicable in a variety of contexts, as long as the metadata is in a form suitable for interpretation both by machines and by human beings.

The Dublin Core Abstract Model provides a reference model against which particular Dublin Core encoding guidelines can be compared, independent of any particular encoding syntax. Such a reference model allows users to gain a better understanding of descriptions they are trying to encode and facilitates the development of better mappings and translations between different syntaxes.

I will describe some applications of Dublin Core in my future posts.
