Galaxy Consulting Blog: Ontology

Sunday, March 13, 2016

What is Ontology?

Ontology is a formal naming and definition of the types, properties, and interrelationships of the entities that exist in a particular domain of information. Ontologies are created to limit complexity and to organize information. Ontologies are considered one of the pillars of the Semantic Web.

The term ontology has its origin in philosophy and has been applied in many different ways. The word element onto- comes from the Greek "being", "that which is". The meaning within information management is a model for describing information that consists of a set of types, properties, and relationship types. Ontologies share many structural similarities, regardless of the language in which they are expressed. Most ontologies describe individuals (instances), classes (concepts), attributes, and relations.

The most common ontology visualization techniques are indented tree and graph.
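
As a minimal sketch of the indented-tree view, the following Python snippet renders a small class hierarchy with one level of indentation per depth. The class names and the `render_tree` helper are hypothetical, chosen only for illustration.

```python
# Hypothetical class hierarchy: each class maps to its direct subclasses.
hierarchy = {
    "Thing": ["Animal", "Vehicle"],
    "Animal": ["Dog", "Cat"],
    "Vehicle": ["Car"],
}

def render_tree(root, children, depth=0):
    """Return the hierarchy under `root` as a list of indented lines."""
    lines = ["  " * depth + root]
    for child in children.get(root, []):
        lines.extend(render_tree(child, children, depth + 1))
    return lines

print("\n".join(render_tree("Thing", hierarchy)))
```

A graph view would draw the same parent-child pairs as nodes and edges instead of indentation.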

Ontology Components

Common components of ontologies include:

Individuals: instances or objects (the basic or "ground level" objects).
Classes: sets, collections, concepts, classes in programming, types of objects, or kinds of things.
Attributes: aspects, properties, features, characteristics, or parameters that objects and classes can have.
Relations: ways in which classes and individuals can be related to one another.
Function terms: complex structures formed from certain relations that can be used in place of an individual term in a statement.
Restrictions: formally stated descriptions of what must be true in order for some assertion to be accepted as input.
Rules: statements in the form of an if-then (antecedent-consequent) sentence that describe the logical inferences that can be drawn from an assertion in a particular form.
Axioms: assertions (including rules) in a logical form that together comprise the overall theory that the ontology describes in its domain of application.
Events: the changing of attributes or relations.
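
To make a few of these components concrete, here is a rough Python sketch that models classes, individuals, attributes, and relations as plain data structures. It is not an ontology language; the names `OntologyClass`, `Individual`, and `is_a`, along with the card data, are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class OntologyClass:
    """A class (concept), optionally with a parent class."""
    name: str
    parent: Optional["OntologyClass"] = None

@dataclass
class Individual:
    """An instance with a class, attributes, and relations to other individuals."""
    name: str
    cls: OntologyClass
    attributes: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)  # (relation name, target name)

# Hypothetical classes.
card = OntologyClass("Card")
playing_card = OntologyClass("PlayingCard", parent=card)
deck = OntologyClass("Deck")

# A hypothetical individual with two attributes and one relation.
ace = Individual("AceOfSpades", playing_card,
                 attributes={"rank": "Ace", "suit": "Spades"})
ace.relations.append(("belongsTo", "StandardDeck"))

def is_a(individual, target):
    """Walk up the parent chain to check class membership."""
    current = individual.cls
    while current is not None:
        if current is target:
            return True
        current = current.parent
    return False

print(is_a(ace, card))
```

Restrictions, rules, and axioms would sit on top of a structure like this as checks or inference functions.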

Ontologies are commonly encoded using ontology languages.

Ontology Types

Domain Ontology

A domain ontology (or domain-specific ontology) represents concepts that belong to a particular domain. It provides the particular meanings of terms as they apply to that domain. For example, the word "card" has many different meanings. An ontology about the domain of "poker" would model the "playing card" meaning of the word.

Since domain ontologies represent concepts in very specific and often eclectic ways, they are often incompatible. As systems that rely on domain ontologies expand, the need arises to merge domain ontologies into a more general representation. Different ontologies in the same domain arise due to different languages, different intended usage of the ontologies, and different perceptions of the domain (based on cultural background, education, ideology, etc.).

Upper Ontology

An upper ontology (or foundation ontology) is a model of the common objects that are generally applicable across a wide range of domain ontologies. It usually employs a core glossary that contains the terms and associated object descriptions as they are used in various relevant domain sets.

Several standardized upper ontologies are available for use, such as Dublin Core.

Hybrid Ontology

A hybrid ontology is a combination of an upper ontology and a domain ontology.

Ontology Languages

Ontology languages are formal languages used to construct ontologies. They allow the encoding of knowledge about specific domains and often include reasoning rules that support the processing of that knowledge. The most commonly used ontology languages are Web Ontology Language (OWL), Resource Description Framework (RDF), RDF Schema (RDFS), and Ontology Inference Layer (OIL).

Ontology Editors

Ontology editors are applications designed to assist in the creation or manipulation of ontologies. They often express ontologies in one of many ontology languages. Some provide export to other ontology languages.

Among the most relevant criteria for choosing an ontology editor are the degree to which the editor abstracts from the actual ontology representation language used for persistence and the visual navigation possibilities within the knowledge model. Other important features are built-in inference engines, information extraction facilities, and support for meta-ontologies such as OWL-S and Dublin Core. Another is the ability to import and export foreign knowledge representation languages for ontology matching. Ontologies are developed for a specific purpose and application.

Ontology Learning

Ontology learning is the automatic or semi-automatic creation of ontologies, including extracting a domain's terms from natural language text. Because building ontologies manually is a labor-intensive and time-consuming process, there is a need to automate it. Information extraction and text mining methods have been explored to automatically link ontologies to documents.

Galaxy Consulting has 16 years of experience working with ontologies. Please contact us for a free consultation.

Friday, July 31, 2015

Dublin Core Metadata Applications - Web Ontology Language (OWL)

In my last post, I described one of the most used applications of Dublin Core Metadata - RDF. In today's post, I will describe the second most used application of Dublin Core Metadata - Web Ontology Language (OWL).

The Web Ontology Language (OWL) is a family of knowledge representation languages for authoring ontologies. An ontology is a formal way to describe taxonomy and classification networks, essentially defining the structure of knowledge for various domains: the nouns represent classes of objects and the verbs represent relations between the objects.

An ontology defines the terms used to describe and represent an area of knowledge. Ontologies are used by people, databases, and applications that need to share domain information (a domain is just a specific subject area or area of knowledge, like medicine, tool manufacturing, real estate, automobile repair, financial management, etc.). Ontologies include computer-usable definitions of basic concepts in the domain and the relationships among them. They encode knowledge in a domain and also knowledge that spans domains. In this way, they make that knowledge reusable.

Ontologies resemble class hierarchies. They are meant to represent information on the Internet and are expected to evolve almost constantly. Ontologies are typically very flexible because they draw on all sorts of data sources.

The OWL languages are characterized by formal semantics. They are built upon a W3C XML standard for RDF objects. I described RDF in my previous post.

The data described by an ontology in the OWL family is interpreted as a set of "individuals" and a set of "property assertions" which relate these individuals to each other. An ontology consists of a set of axioms which place constraints on sets of individuals (called "classes") and the types of relationships permitted between them. These axioms provide semantics by allowing systems to infer additional information based on the data explicitly provided.

OWL ontologies can import other ontologies, adding information from the imported ontology to the current ontology.

For example: an ontology describing families might include axioms stating that a "hasMother" property is only present between two individuals when "hasParent" is also present, and individuals of class "HasTypeOBlood" are never related via "hasParent" to members of "HasTypeABBlood" class. If it is stated that the individual Harriet is related via "hasMother" to the individual Sue, and that Harriet is a member of the "HasTypeOBlood" class, then it can be inferred that Sue is not a member of "HasTypeABBlood".
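
The inference in this example can be sketched in a few lines of Python. This is not an OWL reasoner: the two axioms are hand-coded as dictionaries, and the names `SUBPROPERTY`, `FORBIDDEN_PARENT_CLASS`, and `infer` are hypothetical.

```python
# Facts stated explicitly, as (subject, property, object) triples.
facts = {("Harriet", "hasMother", "Sue")}
classes = {"Harriet": {"HasTypeOBlood"}}

# Axiom 1: wherever hasMother holds, hasParent also holds.
SUBPROPERTY = {"hasMother": "hasParent"}
# Axiom 2: a HasTypeOBlood individual never has a HasTypeABBlood parent.
FORBIDDEN_PARENT_CLASS = {"HasTypeOBlood": "HasTypeABBlood"}

def infer(facts, classes):
    """Return (entailed triples, classes each individual cannot belong to)."""
    entailed = set(facts)
    for s, p, o in facts:
        if p in SUBPROPERTY:
            entailed.add((s, SUBPROPERTY[p], o))
    excluded = {}
    for s, p, o in entailed:
        if p == "hasParent":
            for cls in classes.get(s, ()):
                if cls in FORBIDDEN_PARENT_CLASS:
                    excluded.setdefault(o, set()).add(FORBIDDEN_PARENT_CLASS[cls])
    return entailed, excluded

entailed, excluded = infer(facts, classes)
print(excluded)
```

Neither the `hasParent` triple nor the statement about Sue's blood type was asserted directly; both follow from the axioms.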

The W3C-endorsed OWL specification includes the definition of three variants of OWL, with different levels of expressiveness. These are OWL Lite, OWL DL, and OWL Full.

OWL Lite

OWL Lite was originally intended to support those users primarily needing a classification hierarchy and simple constraints. It is not widely used.

OWL DL

OWL DL includes all OWL language constructs, but they can be used only under certain restrictions (for example, number restrictions may not be placed upon properties which are declared to be transitive). OWL DL is so named due to its correspondence with description logic, a field of research that has studied the logics that form the formal foundation of OWL.

OWL Full

OWL Full is based on a different semantics from OWL Lite or OWL DL, and was designed to preserve some compatibility with RDF Schema. For example, in OWL Full a class can be treated simultaneously as a collection of individuals and as an individual in its own right; this is not permitted in OWL DL. OWL Full allows an ontology to augment the meaning of the pre-defined (RDF or OWL) vocabulary.

OWL Full is intended to be compatible with RDF Schema (RDFS), and to be capable of augmenting the meanings of existing Resource Description Framework (RDF) vocabulary. The RDFS interpretation provides the meaning of the RDF and RDFS vocabulary, so the meaning of OWL Full ontologies is defined by extension of the RDFS meaning, and OWL Full is a semantic extension of RDF.

Every OWL ontology must be identified by a URI. For example: Ontology(). The languages in the OWL family use the open world assumption. Under the open world assumption, if a statement cannot be proven to be true with current knowledge, we cannot draw the conclusion that the statement is false.
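
The difference between the open and closed world assumptions can be shown with a small Python sketch. It is a simplification that only looks up explicitly stated facts and ignores inference and negative assertions; the function names and the `hasFather` query are hypothetical.

```python
# Everything we currently know, as (subject, property, object) triples.
known_facts = {("Harriet", "hasMother", "Sue")}

def closed_world(query):
    """Closed world: anything not stated is treated as false."""
    return query in known_facts

def open_world(query):
    """Open world: anything not stated is unknown (None), not false."""
    return True if query in known_facts else None

query = ("Harriet", "hasFather", "Tom")
print(closed_world(query))  # False: the fact is absent, so it is assumed false
print(open_world(query))    # None: the fact is absent, so nothing is concluded
```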

Languages in the OWL family can define classes, properties, and instances, as well as operations on them.

Instances

An instance is an object. It corresponds to a description logic individual.

Classes

A class is a collection of objects. It corresponds to a description logic (DL) concept. A class may contain individuals, instances of the class. A class may have any number of instances. An instance may belong to none, one or more classes. A class may be a subclass of another, inheriting characteristics from its parent superclass.

Classes and their members can be defined in OWL either by extension or by intension. An individual can be explicitly assigned a class by a class assertion; for example, we can add a statement that Queen Elizabeth is an instance of human. Alternatively, membership can follow from a class expression; for example, a statement that every instance of the human class that has the value female for a given property is an instance of the woman class.
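
A rough Python analogy for the two styles of definition follows. The individuals, the `gender` property name, and the `is_woman` helper are assumptions made for illustration; they are not OWL syntax.

```python
# By extension: class membership is listed explicitly for each individual.
individuals = {
    "Elizabeth": {"types": {"Human"}, "gender": "female"},
    "Philip": {"types": {"Human"}, "gender": "male"},
}

# By intension: membership follows from a condition on the individual.
# The property name "gender" is a hypothetical choice.
def is_woman(props):
    return "Human" in props["types"] and props.get("gender") == "female"

women = {name for name, props in individuals.items() if is_woman(props)}
print(women)
```

Nobody asserted that Elizabeth is a woman; her membership is derived from the class expression.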

Properties

A property is a directed binary relation that specifies class characteristics. It corresponds to a description logic role. Properties are attributes of instances; they sometimes hold data values and sometimes link to other instances. Properties may have logical characteristics such as being transitive, symmetric, inverse, or functional. Properties may also have domains and ranges.
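
What it means for a property to be symmetric or transitive can be sketched by treating the property as a set of (subject, object) pairs and computing the pairs that follow. The property names and individuals below are hypothetical.

```python
def symmetric_closure(pairs):
    """If (a, b) holds for a symmetric property, (b, a) holds too."""
    return set(pairs) | {(b, a) for a, b in pairs}

def transitive_closure(pairs):
    """If (a, b) and (b, c) hold for a transitive property, so does (a, c)."""
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        if not new:
            return closure
        closure |= new

ancestor_of = {("Ann", "Beth"), ("Beth", "Cara")}  # treated as transitive
married_to = {("Ann", "Bob")}                      # treated as symmetric

print(transitive_closure(ancestor_of))
print(symmetric_closure(married_to))
```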

Datatype Properties

Datatype properties are relations between instances of classes and RDF literals or XML schema datatypes. For example, modelName (String datatype) is a property of the Manufacturer class. They are formulated using the owl:DatatypeProperty type.

Object Properties

Object properties are relations between instances of two classes. For example, ownedBy may be an object type property of the Vehicle class and may have a range which is the class Person. They are formulated using owl:ObjectProperty.

Operators

Languages in the OWL family support various operations on classes such as union, intersection and complement. They also allow class enumeration, cardinality, and disjointness.
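
These operators behave much like set operations. The sketch below treats classes as finite Python sets of known individuals, which is a closed-world simplification rather than how an OWL reasoner works; all names and data are hypothetical.

```python
# Known individuals and three hypothetical classes.
universe = {"Ann", "Bob", "Rex", "Tom"}
person = {"Ann", "Bob"}
pet = {"Rex", "Tom"}
employee = {"Bob"}

person_or_pet = person | pet           # union
employed_person = person & employee    # intersection
not_person = universe - person         # complement, relative to known individuals
are_disjoint = person.isdisjoint(pet)  # disjointness

# Cardinality: pets with exactly one known owner.
owners = {"Rex": {"Ann"}, "Tom": {"Ann", "Bob"}}
single_owner_pets = {p for p, o in owners.items() if len(o) == 1}

print(employed_person, not_person, are_disjoint, single_owner_pets)
```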

Metaclasses

Metaclasses are classes of classes. They are allowed in OWL Full or with a feature called class/instance punning.

Syntax

The OWL family of languages supports a variety of syntaxes. It is useful to distinguish high level syntaxes aimed at specification from exchange syntaxes more suitable for general use.

High Level

These are close to the ontology structure of languages in the OWL family.

OWL Abstract Syntax

This high level syntax is used to specify the OWL ontology structure and semantics.

The OWL abstract syntax presents an ontology as a sequence of annotations, axioms and facts. Annotations carry machine and human oriented metadata. Information about the classes, properties and individuals that compose the ontology is contained in axioms and facts only. Each class, property and individual is either anonymous or identified by a URI reference. Facts state data either about an individual or about a pair of individual identifiers (that the objects identified are distinct or the same). Axioms specify the characteristics of classes and properties.

OWL2 Functional Syntax

This syntax closely follows the structure of an OWL2 ontology. It is used by OWL2 to specify semantics, mappings to exchange syntaxes, and profiles.

OWL2 XML Syntax

OWL2 specifies an XML serialization that closely models the structure of an OWL2 ontology.

Manchester Syntax

The Manchester Syntax is a compact, human readable syntax with a style close to frame languages. Variations are available for OWL and OWL2. Not all OWL and OWL2 ontologies can be expressed in this syntax.

OWL is playing an important role in an increasing number and range of applications, and is the focus of research into tools, reasoning techniques, formal foundations and language extensions.

Wednesday, July 8, 2015

Dublin Core Metadata Applications - RDF

The Dublin Core Schema is a small set of vocabulary terms that can be used to describe different resources.

Dublin Core Metadata may be used for multiple purposes, from simple resource description, to combining metadata vocabularies of different metadata standards, to providing inter-operability for metadata vocabularies in the Linked data cloud and Semantic web implementations.

The most used applications of Dublin Core Metadata are RDF and OWL. I will describe OWL in my next post.

RDF stands for Resource Description Framework. It is a standard model for data interchange on the Web. RDF has features that facilitate data merging even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed.

RDF extends the linking structure of the Web to use URIs to name the relationship between things as well as the two ends of the link (this is usually referred to as a “triple”). Using this simple model, it allows structured and semi-structured data to be mixed, exposed, and shared across different applications.

This linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, represented by the graph nodes. This graph view is the easiest possible mental model for RDF and is often used in easy-to-understand visual explanations.
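
As a minimal sketch of this model, the Python snippet below stores a few triples and indexes them as a directed, labeled graph. Real RDF names resources with full URIs; the short prefixed strings and the `ex:` resources here are placeholders, and `objects` is a hypothetical helper.

```python
from collections import defaultdict

# Each triple is (subject, predicate, object).
triples = [
    ("ex:John", "rdf:type", "foaf:Person"),
    ("ex:John", "foaf:knows", "ex:Mary"),
    ("ex:Mary", "rdf:type", "foaf:Person"),
]

# Index the triples as a graph: node -> list of (edge label, target node).
graph = defaultdict(list)
for s, p, o in triples:
    graph[s].append((p, o))

def objects(subject, predicate):
    """Follow every edge labeled `predicate` leaving `subject`."""
    return [o for p, o in graph.get(subject, []) if p == predicate]

print(objects("ex:John", "foaf:knows"))
```

Merging data from another source amounts to adding more triples to the same list, which is why RDF copes well with differing schemas.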

RDF Schema (RDFS) is a set of classes and properties built on the RDF extensible knowledge representation data model. It provides basic elements for describing ontologies, otherwise called RDF vocabularies, which are intended to structure RDF resources. These resources can be stored in a triplestore and queried with the SPARQL query language.

The first version of RDFS was published by the World Wide Web Consortium (W3C) in April 1998, and the final W3C recommendation was released in February 2004. Many RDFS components are included in the more expressive Web Ontology Language (OWL).

Main RDFS constructs

RDFS constructs are the RDFS classes, associated properties, and utility properties built on the limited vocabulary of RDF.

Classes

Resource is the class of everything. All things described by RDF are resources.

Class declares a resource as a class for other resources.

A typical example of a Class is "Person" in the Friend of a Friend (FOAF) vocabulary. An instance of "Person" is a resource that is linked to the class "Person" using the type property, such as in the following formal expression of the natural language sentence: "John is a Person".

example: John rdf:type foaf:Person

The other classes described by the RDF and RDFS specifications are:

Literal – literal values such as strings and integers. Property values such as textual strings are examples of literals. Literals may be plain or typed.
Datatype – the class of datatypes. Datatype is both an instance of and a subclass of Class. Each instance of Datatype is a subclass of Literal.
XMLLiteral – the class of XML literal values. XMLLiteral is an instance of Datatype (and thus a subclass of Literal).
Property – the class of properties.

Properties

Properties are instances of the class Property and describe a relation between subject resources and object resources.

For example, the following declarations are used to express that the property "employer" relates a subject, which is of type "Person", to an object, which is of type "Organization":

ex:employer rdfs:domain foaf:Person

ex:employer rdfs:range foaf:Organization
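
One consequence of these declarations is that types can be inferred from usage: any subject of "employer" is a Person and any object is an Organization. A minimal Python sketch of that inference follows, assuming a hypothetical data triple about `ex:John` and `ex:Acme` and a hypothetical `infer_types` helper.

```python
# Domain and range declarations, keyed by property.
schema = {
    "ex:employer": {"domain": "foaf:Person", "range": "foaf:Organization"},
}

# A hypothetical data triple that uses the property.
data = [("ex:John", "ex:employer", "ex:Acme")]

def infer_types(data, schema):
    """Derive rdf:type triples from rdfs:domain and rdfs:range declarations."""
    inferred = set()
    for s, p, o in data:
        decl = schema.get(p, {})
        if "domain" in decl:
            inferred.add((s, "rdf:type", decl["domain"]))
        if "range" in decl:
            inferred.add((o, "rdf:type", decl["range"]))
    return inferred

for triple in sorted(infer_types(data, schema)):
    print(triple)
```

Note that domain and range act as inference rules here, not as validation constraints that reject data.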

Hierarchies of classes support inheritance of a property's domain and range from a class to its subclasses. Other properties defined by RDFS include:

subPropertyOf is an instance of Property that is used to state that all resources related by one property are also related by another.
Label is an instance of Property that may be used to provide a human-readable version of a resource's name.
Comment is an instance of Property that may be used to provide a human-readable description of a resource.
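
The effect of subPropertyOf can be sketched as follows: every triple stated with a sub-property is also true for each of its super-properties. The property names and the `expand` helper are hypothetical, and the sketch assumes the sub-property chain has no cycles.

```python
# Hypothetical declarations: sub-property -> direct super-property.
sub_property_of = {
    "ex:fullTimeEmployer": "ex:employer",
    "ex:employer": "ex:affiliation",
}

def expand(triples):
    """Add the triples implied by walking up the sub-property chain."""
    result = set(triples)
    for s, p, o in triples:
        while p in sub_property_of:
            p = sub_property_of[p]
            result.add((s, p, o))
    return result

stated = {("ex:John", "ex:fullTimeEmployer", "ex:Acme")}
for triple in sorted(expand(stated)):
    print(triple)
```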

Utility properties

seeAlso is an instance of Property that is used to indicate a resource that might provide additional information about the subject resource.

isDefinedBy is an instance of Property that is used to indicate a resource defining the subject resource. This property may be used to indicate an RDF vocabulary in which a resource is described.
