Wednesday, July 8, 2015

Dublin Core Metadata Applications - RDF

The Dublin Core Schema is a small set of vocabulary terms that can be used to describe different resources.

Dublin Core Metadata may be used for multiple purposes, from simple resource description, to combining metadata vocabularies of different metadata standards, to providing interoperability for metadata vocabularies in the Linked Data cloud and Semantic Web implementations.

The most common applications of Dublin Core Metadata are RDF and OWL. I will describe OWL in my next post.

RDF stands for Resource Description Framework. It is a standard model for data interchange on the Web. RDF has features that facilitate data merging even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed.

RDF extends the linking structure of the Web to use URIs to name the relationship between things as well as the two ends of the link (this is usually referred to as a “triple”). Using this simple model, it allows structured and semi-structured data to be mixed, exposed, and shared across different applications.

This linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, represented by the graph nodes. This graph view is the easiest possible mental model for RDF and is often used in easy-to-understand visual explanations.
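
The graph view can be sketched in a few lines of Python. This is only an illustration of the triple model, not an RDF library: the `ex:` resources and the `outgoing_edges` helper are made up for the example, and prefixed names stand in for full URIs.

```python
from collections import defaultdict

# Each triple is (subject, predicate, object). Prefixed names such as
# "ex:John" are illustrative stand-ins for full URIs.
triples = [
    ("ex:John", "rdf:type", "foaf:Person"),
    ("ex:John", "foaf:knows", "ex:Mary"),
    ("ex:Mary", "rdf:type", "foaf:Person"),
]


def outgoing_edges(triples):
    """Build a directed, labeled graph: node -> list of (edge label, target node)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = outgoing_edges(triples)
for predicate, obj in graph["ex:John"]:
    print(f"ex:John --{predicate}--> {obj}")
```

Each subject becomes a node, each predicate an edge label, and each object the node at the other end of the edge.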

RDF Schema (RDFS) is a set of classes and properties built on the RDF extensible knowledge representation data model. It provides basic elements for describing ontologies, otherwise called RDF vocabularies, which are intended to structure RDF resources. These resources can be stored in a triplestore and queried with the query language SPARQL.

The first version of RDFS was published by the World Wide Web Consortium (W3C) in April 1998, and the final W3C recommendation was released in February 2004. Many RDFS components are included in the more expressive Web Ontology Language (OWL).

Main RDFS constructs

RDFS constructs are the RDFS classes, associated properties, and utility properties built on the limited vocabulary of RDF.

Classes

Resource is the class of everything. All things described by RDF are resources.
Class declares a resource as a class for other resources.

A typical example of a Class is "Person" in the Friend of a Friend (FOAF) vocabulary. An instance of "Person" is a resource that is linked to the class "Person" using the type property, such as in the following formal expression of the natural language sentence: "John is a Person".

example: John rdf:type foaf:Person

The other classes described by the RDF and RDFS specifications are:
  • Literal – literal values such as strings and integers. Property values such as textual strings are examples of literals. Literals may be plain or typed.
  • Datatype – the class of datatypes. Datatype is both an instance of and a subclass of Class. Each instance of Datatype is a subclass of Literal.
  • XMLLiteral – the class of XML literal values. XMLLiteral is an instance of Datatype (and thus a subclass of Literal).
  • Property – the class of properties.
Properties

Properties are instances of the class Property and describe a relation between subject resources and object resources.

For example, the following declarations are used to express that the property "employer" relates a subject, which is of type "Person", to an object, which is of type "Organization":

ex:employer rdfs:domain foaf:Person

ex:employer rdfs:range foaf:Organization
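
What these two declarations let a reasoner conclude can be sketched in plain Python. This is a simplified illustration and not a full RDFS reasoner: it assumes at most one domain and one range per property, and `ex:John`, `ex:Acme` and the `infer_types` helper are hypothetical.

```python
def infer_types(triples):
    """Apply the RDFS domain and range rules once and return the inferred type triples."""
    domains = {s: o for s, p, o in triples if p == "rdfs:domain"}
    ranges = {s: o for s, p, o in triples if p == "rdfs:range"}
    inferred = set()
    for s, p, o in triples:
        if p in domains:  # the subject of p belongs to p's domain class
            inferred.add((s, "rdf:type", domains[p]))
        if p in ranges:   # the object of p belongs to p's range class
            inferred.add((o, "rdf:type", ranges[p]))
    return inferred


triples = [
    ("ex:employer", "rdfs:domain", "foaf:Person"),
    ("ex:employer", "rdfs:range", "foaf:Organization"),
    ("ex:John", "ex:employer", "ex:Acme"),  # hypothetical statement
]

inferred_types = infer_types(triples)
for triple in sorted(inferred_types):
    print(triple)
```

From the single statement about John's employer, the sketch concludes that John is a Person and that Acme is an Organization, even though neither type was stated directly.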

Hierarchies of classes support inheritance of a property domain and range from a class to its sub-classes. Other properties defined by RDFS include:
  • subPropertyOf is an instance of Property that is used to state that all resources related by one property are also related by another.
  • label is an instance of Property that may be used to provide a human-readable version of a resource's name.
  • comment is an instance of Property that may be used to provide a human-readable description of a resource.
Utility properties

seeAlso is an instance of Property that is used to indicate a resource that might provide additional information about the subject resource.

isDefinedBy is an instance of Property that is used to indicate a resource defining the subject resource. This property may be used to indicate an RDF vocabulary in which a resource is described.

Tuesday, June 30, 2015

Search Applications - Concept Searching

Concept Searching Limited is a software company which specializes in information retrieval software. It has products for Enterprise search, Taxonomy Management and Statistical classification.

Concept Searching Technology Platform

The Concept Searching Technology Platform is based on the company's Smart Content Framework™ for information governance, and incorporates best practices for developing an enterprise framework to mitigate risk, automate processes, manage information, protect privacy, and address compliance issues. Underlying the framework is the technology to:
  • Automatically generate semantic metadata using Compound Term Processing.
  • Auto-classify content from diverse repositories.
  • Easily develop, deploy, and manage taxonomies.
The framework is being used to enable intelligent, metadata-enabled solutions that improve search, records management, enterprise metadata management, text analytics, migration, enterprise social networking, and data security.

Features
  • Compound terms are extracted when content is indexed from internal or external content sources, enabling the delivery of greater precision of relevant content at the top of search results.
  • Relevance ranking displays extracts from the documents based on the query.
  • Search refinement delivers to the end user highly correlated concepts that may be used to refine the search.
  • Taxonomy browse capabilities are standard.
  • Documents can be classified into one or more taxonomy nodes, enhancing the precision of documents returned.
  • In addition to static summaries, Dynamic Summarization, a modified weighting system, can be applied that will identify in real-time short extracts that are most relevant to the user’s query.
  • Related Topics will return results based on the conceptual meaning of the search terms used, using the ability to generate compound terms in a search. For example, ‘triple’ is a single word term but ‘triple heart bypass’ is a compound term that provides a more granular meaning.
  • Based on previous queries, or on extracts retrieved, end users can use the text to perform additional searches to retrieve more granular results.
  • The product is based on an open architecture with all APIs based on XML and Web Services. Transparent access to system internals including the statistical profile of terms is standard.
  • Highly scalable.
  • High performance, with classification occurring in real time.
  • Easily customized to achieve your organization’s objectives.
Base Components in the Concept Searching Technology Framework

Conceptual Search Platform

conceptSearch is Concept Searching’s enterprise search product and a key component in the Concept Searching Technology Platform. It is a unique, language-independent technology and is the first content retrieval solution to integrate relevance ranking based on the Bayesian Inference Probabilistic Model and concept identification based on Shannon’s Information Theory.

Unlike other enterprise search engines that require significant customization with marginal results, conceptSearch is delivered with an out-of-the-box application that demonstrates a simple search interface and indexing facilities for internal content, web sites, file systems, and XML documents. Application developers experience a minimal learning curve and the organization can look forward to a rapid return on investment.

Because of the innovative technology, conceptSearch delivers both high precision and high recall. Precision and recall are the two key performance measures for information retrieval. Precision is the retrieval of only those items that are relevant to the query. Recall is the retrieval of all items that are relevant to the query. Yet most information retrieval technologies are less than 22% accurate for both precision and recall. The ideal goal is to have these two measures balanced. Compound term processing has the ability to increase precision with no loss of recall.
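
To make the two measures concrete, here is a small Python sketch using the standard definitions: precision is the share of retrieved items that are relevant, and recall is the share of relevant items that were retrieved. The document IDs are invented, and the code says nothing about how conceptSearch itself is evaluated.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved and relevant document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 documents returned, 5 documents actually relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d2", "d5", "d6", "d7"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four returned documents are relevant (precision 0.5), and two of the five relevant documents were found (recall 0.4).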

conceptSearch is particularly important for organizations that need sophisticated search and retrieval solutions. By weighting multi-word phrases, instead of single words, or words in proximity, the retrieval experience is more accurate and relevant. The ability for the search engine to identify concepts enables organizations to improve the search experience for a variety of business requirements.

Search Engine Integration

This functionality is provided via the Concept Searching Technology platform to integrate with any search engine. The platform can perform on-the-fly classification, with search engines calling the classify API. Search engine support includes SharePoint, the former FAST products, Office 365 Search, Solr, Google Search Appliance, Autonomy, and IBM Vivisimo. If the FAST Pipeline Stage is required, this is sold as a separate product.

conceptClassifier

conceptClassifier is a leading-edge rules based categorization module providing control of rules-based descriptors unique to an organization. conceptClassifier delivers a categorization descriptor table, which is easy to implement and maintain, through which all rules and terms can be defined and managed. This approach eliminates the error-prone results of ‘training’ algorithms typically found in other text retrieval solutions and enables human intervention to effectively tune classification results.

Functionality is provided via the Concept Searching Technology platform, to classify documents based upon concepts and multi-word terms that form a concept. Automatic and/or manual classification is included. Knowledge workers with the appropriate security rights can also classify content in real time. Content can be classified from diverse repositories including SharePoint, Office 365, file shares, Exchange public folders, and websites. All content can be classified on the fly and classified to one or more taxonomies.

conceptTaxonomyManager

This is an advanced, enterprise-class, easy-to-use taxonomy development and management tool, still unique in the industry. Developed on the premise that a taxonomy solution should be used by business professionals, and not the IT team or librarians, the end result is a highly interactive and powerful tool that has been proven to reduce taxonomy development time by up to 80% (client source data).

conceptTaxonomyManager is simple to use, has an intuitive user interface designed for Subject Matter Experts, and does not require IT or Information Scientist expertise to build, maintain and validate taxonomies for the enterprise. conceptTaxonomyManager has the capability to automatically group unstructured content together based on an understanding of the concepts and ideas that share mutual attributes while separating dissimilar concepts.

This approach is instrumental in delivering relevant information via the taxonomy structure as well as using the semantic metadata in enterprise search to reduce time spent finding information, increase relevancy and accuracy of the search results, and enable the re-use and re-purposing of content. Using one or more taxonomies, unstructured content can be leveraged to improve any application that uses metadata. This flexibility extends to records management, information security, migration, text analytics, and collaboration.

Intelligent Migration

Using the Concept Searching Technology platform an intelligent approach to migration can be achieved. As content is migrated it is analyzed for organizationally defined descriptors and vocabularies, which will automatically classify the content to taxonomies, or in the SharePoint environment, the SharePoint Term Store, and automatically apply organizationally defined workflows to process the content to the appropriate repository for review and disposition.

conceptSQL

This product provides the ability to define a document structure based on information held in a Microsoft SQL Server. A document can include any number of text and metadata fields and can span multiple tables if required. conceptSQL supports SQL 2005, 2008, and 2012. A powerful but easy to use configuration tool is supplied eliminating the need for any programming. Templates are provided for out of the box support for Documentum, Hummingbird, and Worksite/Interwoven DMS.

SharePoint Feature Set

The SharePoint Feature Set includes the following components: farm solution with feature sets, Term Store integration, taxonomy tree control for editing, refinement panel integration, event handlers for notification of changes, management of classification status column, web service advanced functionality (implement system update or preserve GUIDs), and automated site column creation.

Intelligent Records Management

The ability to intelligently identify, tag, and route documents of record to either a staging library and/or a records management solution is a key component in driving and managing an effective information governance strategy. Taxonomy management, automatic declaration of documents of record, auto-classification, and semantic metadata generation are provided via the Concept Searching Technology platform and conceptTaxonomyWorkflow.

Data Privacy

Fully customizable to identify unique or industry standard descriptors, content is automatically meta-tagged and classified to the appropriate node(s) in the taxonomy based upon the presence of the descriptors, phrases, or keywords from within the content. Once tagged and classified the content can be managed in accordance with regulatory or government guidelines.

The identification of potential information security exposures includes the proactive identification and protection of unknown privacy exposures before they occur, as well as real-time monitoring of organizationally defined vocabulary and descriptors in content as it is created or ingested. Taxonomy, classification, and metadata generation are provided via the Concept Searching Technology platform and conceptTaxonomyWorkflow.

eDiscovery and Litigation Support

Taxonomy, classification, and metadata generation are provided via the Concept Searching Technology platform. This is highly useful when relevance, identification of related concepts, and vocabulary normalization are required to reduce time and improve the quality of search results.

Text Analytics

Taxonomy, classification, and metadata generation are provided via the Concept Searching Technology platform. A third party business intelligence or reporting tool is required to view the data in the desired format. This is useful to cleanse the data sources before using text analytics to remove content noise, irrelevant content, and identify any unknown privacy exposures or records that were never processed.

Social Networking

Taxonomy, classification, and metadata generation are provided via the Concept Searching Technology platform. Integration with social networking tools can be accomplished if the tools are available in .NET or via SharePoint functionality. This is useful to provide structure to social networking applications and provide significantly more granularity in relevant information being retrieved.

Business Process Workflow

conceptTaxonomyWorkflow serves as a strategic tool managing migration activities and content type application across multiple SharePoint and non-SharePoint farms and is platform agnostic. This add-on component delivers value specifically in migration, data privacy, and records management, or in any application or business process that requires workflow capabilities.

conceptTaxonomyWorkflow is required to apply action on a document, optionally automatically apply a content type and route to the appropriate repository for disposition.

Wednesday, June 24, 2015

Thesaurus Principles

A thesaurus is necessary for effective information retrieval. A major purpose of a thesaurus is to match the terms brought to the system by an enquirer with the terms used by the indexer.

Whenever there are alternative names for a type of item, we have to choose one to use for indexing, and provide an entry under each of the others saying what the preferred term is. The goal of the thesaurus, and the index which is built by allocating thesaurus terms to objects, is to provide useful access points by which that record can be retrieved.

For example, if we index all full-length ladies' garments as dresses, then someone who searches for frocks must be told that they should look for dresses instead.

This is no problem if the two words are really synonyms, and even if they do differ slightly in meaning it may still be preferable to choose one and index everything under that. I do not know the difference between dresses and frocks but I am fairly sure that someone searching a modern clothing collection who was interested in the one would also want to see what had been indexed under the other. We would do this by linking the terms with the relationships USE and USE FOR, like this:

Dresses
USE FOR
Frocks

Frocks
USE
Dresses

This may be shown in a printed list, or it may be held in a computer system, which can make the substitution automatically. If an indexer assigns the term Frocks, the computer will change it to Dresses, and if someone searches for Frocks the computer will search for Dresses instead, so that the same items will be retrieved whichever term is used.
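
The automatic substitution described here is easy to sketch in Python. This is a minimal illustration rather than the behaviour of any particular collection management system: the record IDs and the `assign` and `search` helpers are hypothetical.

```python
# USE references: non-preferred term -> preferred term (terms from the example above).
use = {"Frocks": "Dresses"}

# Hypothetical index: preferred term -> IDs of catalog records.
index = {"Dresses": ["rec-1", "rec-2"]}


def preferred(term):
    """Replace a non-preferred term with its preferred equivalent."""
    return use.get(term, term)


def assign(record_id, term):
    """Index a record under the preferred form of the term the indexer chose."""
    index.setdefault(preferred(term), []).append(record_id)


def search(term):
    """Search under the preferred form of the term the enquirer typed."""
    return index.get(preferred(term), [])


assign("rec-3", "Frocks")  # stored under Dresses
print(search("Frocks"))
print(search("Dresses"))
```

Both searches return the same records, because the substitution happens at indexing time and again at search time.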

USE and USE FOR relationships are also used between synonyms or pairs of terms which are so nearly the same that they do not need to be distinguished in the context of a particular collection. For example:

Nuclear energy
USE
Nuclear power

Nuclear power
USE FOR
Nuclear energy

Hierarchical Relationships

If we have a hundred jackets, a list under a single term will be too long to look through easily, and we should use the more specific terms. In that case, we have to make sure that a user will know what terms there are. We do this by writing a list of them under the general heading. For example:

Jackets
NT (Narrower Terms)
Dinner Jackets
Flying Jackets
Sports Jackets

In the thesaurus, BT (Broader Terms)/NT relationships can be used for parts and wholes in only four special cases: parts of the body, places, disciplines and hierarchical social structures.

Good computer software should allow you to search for "Jackets and all its narrower terms" as a single operation, so that it will not be necessary to type in all the possibilities if you want to do a generic search.
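
A generic search of this kind amounts to walking the NT relationships downwards. The Python sketch below assumes the thesaurus is held as a simple dictionary; the `expand` helper and the extra level under Sports Jackets are invented for illustration.

```python
# NT relationships: term -> list of narrower terms. "Jackets" comes from the
# example above; the level under "Sports Jackets" is invented for illustration.
narrower = {
    "Jackets": ["Dinner Jackets", "Flying Jackets", "Sports Jackets"],
    "Sports Jackets": ["Blazers"],
}


def expand(term):
    """Return the term plus all of its narrower terms, at every level."""
    found = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in found:  # skip terms already reached by another path
            found.append(current)
            stack.extend(narrower.get(current, []))
    return found


print(sorted(expand("Jackets")))
```

The expanded list can then be passed to the search engine as a single OR query, so the user only has to type "Jackets".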

Related Terms

Related terms may be of several kinds:

1. Objects and the discipline in which they are studied, such as Animals and Zoology.
2. Processes and their products, such as Weaving and Cloth.
3. Tools and the processes in which they are used, such as Paint brushes and Painting.

It is also possible to use the Related Term relationship between terms which are of the same kind, not hierarchically related, but where someone looking for one ought also to consider searching under the other, e.g. Beds RT Bedding; Quilts RT Feathers; Floors RT Floor coverings.

Definitions and Scope Notes

Record information which is common to all objects to which a term might be applicable. Where there is any doubt about the meaning of a term, or the types of objects which it is to represent, attach a scope note. For example:

Fruit
SN
distinguish from Fruits as an anatomical term
BT
Foods

Preserves
SN
includes jams

Neonates
SN
covers children up to the age of about 4 weeks; includes premature infants

Form of Thesaurus

A list based on these relationships can be arranged in various ways; alphabetical and hierarchical sequences are usually required, and thesaurus software is generally designed to give both forms of output from a single input.

Poly-hierarchies

A term can have several broader terms, if it belongs to several broader categories. The thesaurus is then said to be poly-hierarchical. Cardigans, for example, are simultaneously Knitwear and Jackets, and should be retrieved whenever either of these categories is being searched for.

With a poly-hierarchical thesaurus it would take more space to repeat full hierarchies under each of several broader terms in a printed version, but this can be overcome by using references, as Root does. There is no difficulty in displaying poly-hierarchies in a computerized version of a thesaurus.
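
In a computerized thesaurus a poly-hierarchy just means that a term may list more than one broader term. The Python sketch below shows Cardigans being retrieved under both of its parents. The catalog records, the top term "Clothing" and the helper names are assumptions made for the example.

```python
# BT relationships: term -> list of broader terms. Cardigans has two parents,
# as in the example above; "Clothing" is an assumed top term.
broader = {
    "Cardigans": ["Knitwear", "Jackets"],
    "Knitwear": ["Clothing"],
    "Jackets": ["Clothing"],
}

# Hypothetical catalog: record ID -> the single specific term assigned to it.
catalog = {"rec-1": "Cardigans", "rec-2": "Jackets", "rec-3": "Knitwear"}


def ancestors(term):
    """Return every broader term reachable from the given term."""
    seen = set()
    pending = list(broader.get(term, []))
    while pending:
        current = pending.pop()
        if current not in seen:
            seen.add(current)
            pending.extend(broader.get(current, []))
    return seen


def generic_search(category):
    """Find records indexed under the category or under anything narrower."""
    return sorted(
        record for record, term in catalog.items()
        if term == category or category in ancestors(term)
    )


print(generic_search("Knitwear"))
print(generic_search("Jackets"))
```

The cardigan record turns up in both searches, which is exactly the behaviour a poly-hierarchical thesaurus is meant to give.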

Singular or Plural

Thesaurus creation standards prescribe the use of plural forms of nouns.

Use of Thesaurus

A thesaurus is an essential tool which must be at hand when indexing a collection of objects, whether by writing catalog cards by hand or by entering details directly into a computer. The general principles to be followed are:

1. Consider whether a searcher will be able to retrieve the item by a combination of the terms you allocate.
2. Use as many terms as are needed to provide required access points.
3. If you allocate a specific term, do not also allocate that term's broader terms.
4. Make sure that you include terms to express what the object is, irrespective of what it might have been used for.
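
Rule 3 is one that software can check for the indexer. The Python sketch below drops any allocated term that is already implied by a more specific one. It assumes the hierarchy contains no loops; "Clothing", "Wool" and the helper names are made up for the example.

```python
# BT relationships: term -> broader terms. "Clothing" is an assumed top term.
broader = {
    "Dinner Jackets": ["Jackets"],
    "Jackets": ["Clothing"],
}


def all_broader(term):
    """Collect the broader terms of a term at every level of the hierarchy."""
    collected = set()
    for parent in broader.get(term, []):
        collected.add(parent)
        collected |= all_broader(parent)
    return collected


def drop_redundant(allocated):
    """Keep only the terms that are not implied by a more specific allocated term."""
    implied = set()
    for term in allocated:
        implied |= all_broader(term)
    return [term for term in allocated if term not in implied]


print(drop_redundant(["Dinner Jackets", "Jackets", "Wool"]))
```

"Jackets" is removed because "Dinner Jackets" already implies it, while "Wool" stays because nothing else in the list covers it.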

If you have a computerized thesaurus, with good software, this can give you a lot of direct help. Ideally it should provide pop-up windows displaying thesaurus terms which you can choose from and then "paste" directly into the catalog record without re-typing. It should be possible to browse around the thesaurus, following its chain of relationships or displaying tree structures, without having to exit the current catalog record, and non-preferred terms should automatically be replaced by their preferred equivalents.

You should be able to "force" new terms onto the thesaurus, flagged for review later by the thesaurus editor. When editing thesaurus relationships, reciprocals should be maintained automatically, and it should not be possible to create inconsistent structures.
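
Maintaining reciprocals and refusing inconsistent structures can be sketched as follows. This is a toy Python class, not the design of any real thesaurus package: the `Thesaurus` class and its method names are hypothetical, and the only inconsistency it checks for is a loop in the BT/NT hierarchy.

```python
class Thesaurus:
    """Toy thesaurus that keeps BT/NT reciprocals in step (a sketch, not a real product)."""

    def __init__(self):
        self.broader = {}   # term -> set of broader terms
        self.narrower = {}  # term -> set of narrower terms

    def _all_broader(self, term):
        """Return every broader term reachable from the given term."""
        seen, pending = set(), [term]
        while pending:
            for parent in self.broader.get(pending.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    pending.append(parent)
        return seen

    def add_broader(self, term, parent):
        """Record 'term BT parent' and the reciprocal 'parent NT term'."""
        if term == parent or term in self._all_broader(parent):
            raise ValueError(f"{parent!r} cannot be broader than {term!r}: loop")
        self.broader.setdefault(term, set()).add(parent)
        self.narrower.setdefault(parent, set()).add(term)


thesaurus = Thesaurus()
thesaurus.add_broader("Dinner Jackets", "Jackets")
print(thesaurus.narrower["Jackets"])  # reciprocal created automatically

try:
    thesaurus.add_broader("Jackets", "Dinner Jackets")
except ValueError as error:
    print("rejected:", error)
```

The editor only ever states one side of a relationship, and the attempt to make Jackets a narrower term of Dinner Jackets is refused because it would close a loop.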

Thesaurus Maintenance

New terms can be suggested, and temporary terms can be "forced" into the thesaurus by users. Someone has to review these terms regularly and either accept them and build them into the thesaurus structure, or else decide that they are not appropriate for use as indexing terms.

In that case they should generally be retained as non-preferred terms with USE references to the preferred terms, so that users who seek them will not be frustrated. An encouraging thought is that once the initial work of setting up the thesaurus has been done, the number of new terms to be assessed each week should decrease.

When to Use a Thesaurus?

A thesaurus is particularly appropriate for fields which have a hierarchical structure, such as names of objects, subjects, places, materials and disciplines, and it might also be used for styles and periods. A thesaurus would not normally be used for names of people and organisations, but a similar tool, called an authority file, is usually used for these. The difference is that while an authority file has preferred and non-preferred relationships, it does not have hierarchies.

Authority files and thesauri are two examples of a generalized data structure which can allow the indication of any type of relationship between two entries, and modern computer software should allow different types of relationship to be included if needed.

Other Subject Retrieval Techniques

A thesaurus is an essential component for reliable information retrieval, but it can usefully be complemented by two other types of subject retrieval mechanism.

Classification Schemes

While a thesaurus inherently contains a classification of terms in its hierarchical relationships, it is intended for specific retrieval, and it is often useful to have another way of grouping objects. It is also often necessary to be able to classify a list of objects arranged by subject in a way which differs from the alphabetical order of thesaurus terms. Each subject group may be expressed as a compound phrase, and given a classification number or code to make sorting possible.

Free Text

It is highly desirable to be able to search for specific words or phrases which occur in object descriptions. These may identify individual items by unique words such as trade names which do not occur often enough to justify inclusion in the thesaurus. A computer system may "invert" some or all fields of the record, i.e. make all the words in them available for searching through a free-text index, or it may be possible to scan records by reading them sequentially while looking for particular words. The latter process is fairly slow, but is a useful way of refining a search once an initial group has been selected by using thesaurus terms.
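
Both approaches can be sketched in Python: an inverted index built once over a text field, and a sequential scan over a small group of records that has already been selected. The records, the trade name "Aerolux" and the helper names are all invented for the example.

```python
import re
from collections import defaultdict

# Hypothetical catalog records: record ID -> free-text description field.
records = {
    "rec-1": "Black dinner jacket with silk lapels",
    "rec-2": "Leather flying jacket, Aerolux brand",
    "rec-3": "Silk evening dress",
}


def words(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def invert(records):
    """Build a free-text index: word -> set of record IDs containing it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in words(text):
            index[word].add(record_id)
    return index


def scan(records, candidate_ids, word):
    """Sequentially read a pre-selected group of records, looking for a word."""
    return [rid for rid in candidate_ids if word in words(records[rid])]


index = invert(records)
print(sorted(index["silk"]))
print(scan(records, ["rec-1", "rec-2"], "aerolux"))
```

The index lookup is a single dictionary access however large the collection is, while the scan reads every candidate record, which is why it suits refining a group already narrowed down by thesaurus terms.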

Thursday, May 21, 2015

Importance of Taxonomy to Drupal

Drupal is a powerful content management system (CMS), comparable to competitors like WordPress and Joomla. It is typically installed on a web server, unlike local WYSIWYG (What You See Is What You Get) programs like Adobe Dreamweaver (now part of Creative Cloud) and Microsoft FrontPage.

Drupal is an open source platform, meaning that publicly contributed extensions are available to extend the functionality of the CMS. Taxonomy is part of Drupal Core and is integral to what web developers and programmers can do with the software. Taxonomy is a system of categorization, and Drupal can use taxonomy for a number of different purposes within its framework by using various techniques and tools available for the platform. Here, we will examine the basics of taxonomy in Drupal (what it means, how it’s used, etc.) and the various types of tasks that can be accomplished by taking advantage of taxonomy within the software.

What does taxonomy refer to in Drupal, specifically?

In Drupal, taxonomy is the core module that is used to determine how to categorize or classify content on the website being built with the CMS. It is also a critical element to the website’s information architecture, on both the back and front ends.

Taxonomies in Drupal are organized into vocabularies, which help the CMS determine which items belong with which types of content. Vocabularies, in turn, consist of terms, and the list of terms defines the contents of the vocabulary. Terms can be part of a hierarchy or simply a flat compilation of tags. Tags group nodes (elements in Drupal sites that contain content, e.g. articles and basic pages) together. These groupings can then be referenced by search on the website.
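
The relationship between vocabularies, terms and nodes can be pictured with a small conceptual model. The sketch below is written in Python purely for illustration. It is not Drupal's PHP API, and every class, field and helper name in it is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    name: str
    parent: "Term | None" = None  # set when the vocabulary is hierarchical


@dataclass
class Vocabulary:
    name: str
    terms: dict = field(default_factory=dict)  # term name -> Term

    def add_term(self, name, parent=None):
        """Add a term, optionally under an existing parent term."""
        term = Term(name, self.terms.get(parent))
        self.terms[name] = term
        return term


@dataclass
class Node:
    title: str
    tags: set = field(default_factory=set)  # names of terms attached to the node


sections = Vocabulary("Sections")
sections.add_term("Arts")
sections.add_term("Music", parent="Arts")

nodes = [
    Node("Gallery opening downtown", {"Arts"}),
    Node("Local band releases album", {"Music"}),
    Node("City council meets", set()),
]


def nodes_tagged(nodes, term_name):
    """Return the titles of nodes tagged with the given term."""
    return [node.title for node in nodes if term_name in node.tags]


print(nodes_tagged(nodes, "Music"))
```

A vocabulary holds the terms, a term may point to a parent, and a node simply carries the names of the terms it is tagged with, which is what makes grouping and lookup by term possible.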

Sites built in Drupal can have an unlimited number of vocabularies, so complex sites can be built using the framework. The number of terms is unlimited as well. The vocabularies and terms associated with your website can serve a number of purposes, particularly for displaying content and managing content assets. They can also be important for reference.

Displaying content and manipulating taxonomies

Drupal users are able to quickly and easily modify how content is displayed by manipulating taxonomy data with modules such as the Views module. The Views module controls how nodes are displayed within a block, panel or page. At the most basic level, Views enables developers to display, on certain pages only, a list of articles tagged with particular keyword phrases from the site’s taxonomy.

For example, on Slanted Magazine Southern Minnesota Arts & Culture’s website, the navigation bar at the top of the site includes several categories of basic pages that are the site’s publishing sections (News, Tech, Arts, Entertainment, Music, etc.). When a section tab is clicked the link brings you to that basic page where a list of articles with teaser text appears. Those article collection displays were built using the Views module that applied filters to display content only tagged with certain phrases such as “tech” or “Music”.

Taxonomy and permissions or visibility

Taxonomy and metadata can also drive the site content visibility and permissions settings, as needed for diverse business needs. The goals of the organization will determine how best to use these settings and taxonomy can play a vital role in how information within the organization is shared (public, confidential, semi-confidential, etc.) with various parties.

There may be nodes or specific content that only certain members within the organization should be allowed to edit. By using the permissions in the administration page within Drupal, developers are able to precisely assign permissions and roles for registered users of the site. This allows powerful flexibility because developers can assign roles and permissions based on the taxonomy data that has been put together in the Drupal site.

Also, there may be a need for the developer to modify content that the public is able to view. Using the core module taxonomy in conjunction with permissions is a great way to achieve this goal as well. Again, it will be determined by the specific goals of the organization, so important decisions about the usability and navigation of the site will need to be worked out (or at least should be) far in advance of building out these elements of the site. A great outline and wireframes can go a long way when developing a top notch website using the Drupal CMS framework.

Improving search through taxonomy

Search will no doubt be improved through the use of taxonomy within the CMS. Content that is tagged or classified using vocabularies and terms within the framework can be indexed by the Drupal Search module. Additionally, the taxonomy will make your site more marketable because commercial search engines like Google and Bing will be able to crawl the website more effectively and make determinations about the site’s content, architecture, design and file organization.

Using taxonomy as part of the Drupal system is a key element to designing a great website on the platform and making the information work smarter for organizations. That is ultimately the purpose of any type of taxonomy. The system and its modules are quite easy to learn to use as well, and multiple ways of handling the data are possible. Also, since the software is open source, there is a great opportunity to learn from a community of developers and users. There is also a wide variety of extensions available to enhance features of the CMS and its output.

Monday, May 4, 2015

A Practical Guide to Content Strategy in Six Steps

A critical question you must ask yourself: what is your content strategy? Further, what do you plan to do with content assets you have and how do you take full advantage of that data?

There are many types of content, of course, and each group of assets may have a different strategy entirely. Let’s look at how you can identify that content, organize it and execute a strategy to handle it.

Step One: Identify Your Content

Let’s first start by identifying your content assets. What content do you have? How and why is it currently being used? Start by asking these kinds of questions to assess the content assets so you can later evaluate and organize that information into groups used in taxonomy (categorization of your content) and so forth.

Identifying your content is an important first step because, obviously, you have to know what you are working with before you can actually develop a plan to organize and use that available data to your advantage as an organization. Try to create some type of outline as you work through this.

For instance, you will likely want to look at all of your marketing content, employee policy content, customer and financial data and business operational data all separately. Find where all of this content lives (in the cloud, data center, computer hard drives, network drives, social media, email, wikis, etc.). This will help you move into the next crucial step of the content strategy process, which involves organizing all of your content and grouping it into categorical context.

Step Two: Organize, label, categorize

So now that you have identified all of the content within your organization’s hard (such as those in a file cabinet) and soft files (such as those in the cloud or stored on a computer), you can begin the critical steps of organizing, labeling and categorizing your content. This process involves creating an outline, hierarchy or taxonomical system for your content assets.

You will first want to start with a plan that outlines your organization’s goals for the content, with your overall mission in mind, so you will be able to develop a useful system of organization and taxonomy. Group your content assets within these groups and subgroups to create cohesion and transparency. One of the goals of your content strategy should be to make data easier to access for those with the proper access privileges. Each layer may have different privileges or added layers within. It is kind of like baking a complicated cake, using data for our ingredients.

Step Three: Develop targeted plans for each layer

Because you have these different layers of content, it only makes sense that you must plan a slightly or even widely different approach to each of those layers. For instance, your strategy for delivering employee policy and conduct information surely would not use the same approach as delivering customer marketing material to the public. They must be implemented with the user in mind.

Part of this is about identifying the user or audience, but much of that process should already have been taken care of during the organization phase. These layers of taxonomy (content that is tagged or categorized for use in a particular context or definition of terms or navigation) can become increasingly complex and overwhelming, even for the most seasoned content managers, so be vigilant and stay focused on the overall strategy.

There are two good ways to do this. One is to make sure that you audit your content for consistency, accuracy, relevance (outdated information should be archived), mechanics, usability and design. The next is to conduct usability testing through each phase of the content management overhaul.
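The relevance part of such an audit can be sketched in a few lines of Python. This is only an illustration: the `ContentItem` fields, the sample inventory and the one-year review threshold are all assumptions made for the example, not a recommended policy.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ContentItem:
    """One entry in a hypothetical content inventory."""
    title: str
    category: str
    last_reviewed: date


def find_outdated(items, today, max_age_days=365):
    """Return items whose last review is older than max_age_days."""
    return [item for item in items
            if (today - item.last_reviewed).days > max_age_days]


# Made-up sample data for illustration only
inventory = [
    ContentItem("Employee handbook", "policy", date(2013, 1, 15)),
    ContentItem("Spring campaign brochure", "marketing", date(2015, 3, 1)),
]

for item in find_outdated(inventory, today=date(2015, 5, 4)):
    print(f"Review or archive: {item.title} ({item.category})")
```

A check like this only covers the age of the content; accuracy, mechanics, usability and design still need a human reviewer.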

Step Four: Find a content management system that works for you

There are many different content management systems (CMS) that have varying levels of efficiency, complexity and advanced features for editing and managing your content. Each one is different and has a different learning curve.

Your job should be to find the one that works best for the purposes intended. Possible CMS include Drupal, WordPress, Joomla and several others for content like blogs, web portals and basic (or complex) websites. Sharepoint helps to manage document files. There are a number of different options depending on a particular need. You just want to make sure that your chosen system will allow you to categorize content effectively and make search easier.

Step Five: Employ good user design or user experience principles in design and navigation

It can’t be stressed enough. Make finding content easier for members of your organization. Make sure your content strategy involves looking at both form and function of content. A good information designer or graphic designer should not be underestimated. The work they do helps people navigate complicated websites or applications more easily.

Designs should be clean and clear of clutter and complicated imagery. Icons and images should be displayed in the proper format so they don’t appear distorted. They should be easy to read, easy to find and easy to digest. Web users typically have little patience when it comes to looking around the page. You literally have seconds to grab their attention. Make it count.

Navigation structure and page elements should also be displayed logically and in a clean and clear manner to avoid confusion and congestion on pages. Also ensure that all navigation leads to relevant content that is useful for the intended audience.

Step Six: Employ analytics to make the most of your content

Lastly, once the other five steps of your content strategy have been completed (though this is an ongoing process), you will be able to analyze your data. Using analytics tools to access insights about information can be critical to making your content strategy work for the organization. Look at how users clicked, where they clicked, what content was most accessed, how it was accessed and why. These insights will allow you to be nimble and make gradual changes over time to continually tweak the content management process.
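As a rough illustration of the "what content was most accessed" question, here is a minimal Python sketch. The access log, its (user, page) layout and the page paths are invented for the example; a real analytics tool would supply this data in its own format.

```python
from collections import Counter

# Hypothetical access log: (user, page) pairs
access_log = [
    ("alice", "/policies/vacation"),
    ("bob", "/policies/vacation"),
    ("alice", "/marketing/brochure"),
    ("carol", "/policies/vacation"),
    ("bob", "/marketing/brochure"),
    ("carol", "/finance/q1-report"),
]


def most_accessed(log, top_n=3):
    """Count page views and return the top_n pages with their counts."""
    counts = Counter(page for _user, page in log)
    return counts.most_common(top_n)


for page, views in most_accessed(access_log, top_n=2):
    print(f"{page}: {views} views")
```

Counting views answers the "what" and "how often"; the "why" usually needs follow-up such as usability testing or user interviews.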

Wednesday, April 22, 2015

Social Media Management and Information Governance

The social media landscape today has ballooned to include several different types of platforms, from video or photo sharing to microblogs to short posts and activity feeds. With all of this newly introduced communication software comes an increasing amount of data and data risk.

There are three layers of information governance involved with social media use within official organizations. Read on to learn what these layers are and what can be implemented within your organization to keep data compliant with legal, organizational and regulatory policies and procedures, as well as keeping data safe and free of risk.

Social Media Security

Organizations, including small and midsize businesses, non-profits, corporate enterprises, even governments, are no doubt being inundated with automated cyber-attacks, hacks, spam, phishing scams, DDoS (distributed denial of service) attacks and various forms of malware. Much of this malware also no doubt comes from social media use. Interestingly though, many organizations are not prepared for, or putting effort into, scanning content for malware stemming from social media use.

Short links distributed through tweets, wall posts and other forms of communication can be generated by bots that are designed to appear human online. The information gathered through deploying these bots can be devastating for an organization. Imagine that an employee clicks on one of these links and critical business information becomes vulnerable to automated information harvesting.

This information can be used in a variety of ways including business or government espionage, theft of important customer or internal financial information, theft or distribution of important trade secrets like research or prototypes and illegal or compromising use of other critical data.

There are tools that can scan this content and monitor user behavior to ensure secure communications. One of the tools that can manage social media is HootSuite.

Social Information Archival

The archival of information is obviously important for any kind of enterprise or organization. Data can become stockpiled or deleted immediately on social media sites, depending on their own policies for data retention.

If an employee or member creates a piece of content that is later deleted, there must be a way to determine when and why the content was removed. It may come up in a legal matter at some point (continue reading to see Social Media Information Policy).

Screenshots of content or documentation of social media activity are a couple of ways that this information may be monitored or recorded. Some kind of record needs to exist. A simple log may not suffice, depending on policy or regulations. Businesses with a supply chain, product or other third party scenario may need to refer to this information for business practices or other reasons affecting third parties or partners.

Social media insights can also be gained through tracking content and activity over long periods of time. Research into social use over time can enable organizations to become adaptable to market conditions, laws, disruptions, customer expectations, business practices and a broad range of other areas important to organizations using social tools and sites.

Social Media Information Policy

Organizations are more heavily burdened by legislation, regulation and threat of legal action or litigation than ever before. To complicate matters, the amount of information is growing ever more rapidly. As old data becomes archived, exponentially larger volumes of data are being produced. This trend is not going to slow down anytime soon. Just take a look at the massively growing market of cloud storage and computing services on the market. So how can we ensure that social media use follows guidelines?

It starts with auditing content, campaigns and procedures to ensure legal, regulatory and organizational compliance. Look at content to see if there are vulnerabilities. You don’t want users posting content that can lead to insider trading, for example. Trade secrets and confidential customer or supplier information must also not be distributed to the public, for another example.

These are just a couple of ways that this kind of media use can harm the credibility, profitability and even viability of an entire enterprise. Information handling policies must be set in stone for things that will not change (corporate responsibility, for example) and flexible for things that will change or evolve over time (product marketing, for example). Some things will in fact change quite rapidly, while others will be a little slower moving.

After the audit, the next step is to ensure enforcement. Not only management, but every single member of the organization must first understand that these policies are important and then see to it that they are being followed. Monitor all onsite or virtual network use and the use of social on those systems. Let users know that their activity is being monitored to dissuade them from engaging in the risky behavior to start with. Remember that the average employee spends nearly an hour engaging in social media use at work.

There are various risks associated with this activity. Employees must both know the risks and understand that there will be no tolerance for non-compliance with these policies. Disciplinary action is at the discretion of each organization.

Implement the Layers Proactively

Remember that the sooner your organization starts implementing these layered tasks, the better. You don’t want to be comfortable today and sorry tomorrow for not realizing the mistake of complacency. Make sure that everyone is on-board at all levels to ensure the smoothest possible transition into security protocols, policies, procedures and use of tools and software.

People are often afraid of change or resistant to do things that require patience or more work on their end. You may be able to alleviate some of those pains from them, but ultimately everyone must be responsible for the information they produce, gather and distribute.

All this being said, social media is a great tool for boosting productivity as well as marketing efforts for most organizations. Don’t be afraid to use social media; just put these precautionary measures in place first.

Thursday, April 16, 2015

Converting Knowledge Into Content

Many of us have grown accustomed to referring to our work email accounts to find that bit of information that we received from one colleague or another. Now you discover you need that information quickly to finish a project.

Where is it? If you have faced a similar situation, it means that the amount of data and its applications have grown more complex. It also likely means that you and your organization are exchanging important information too loosely and have lax knowledge management practices in place. This is not meant to be insulting, of course. It is simply a way to understand how to improve the process.

As mentioned in previous blog posts, big data has come to encapsulate the work we do now. Wouldn’t it be great if we had a place to cleanly and robustly organize all of the information that we come across? Well, content and knowledge management policies and programs will help you to achieve this. Read on to learn how.

Where is knowledge?

Knowledge is everywhere. In your business or organization, it is likely fostered through learning and growing and experiencing the flow of the market and the culture. Each member, employee, manager, stakeholder and so forth has a different level of that experience, giving each of them unique knowledge. That knowledge is usually transferred through the relationships among staff via verbal exchange and electronic (email, memos, research notes, etc.) and hard (paper documents) means.

This is where our knowledge lives. But as it lives in the minds of the personnel and in fragmented pieces in various formats, how easily are organizations able to attain that knowledge and deploy it efficiently to achieve goals? It must be converted, therefore, into unified content and implemented with policies, procedures and strategies. So how do we do this?

Identify knowledge and outline a plan for documentation

Now that we know where our knowledge lives, we must formulate some type of plan to extract this information. At each layer, the process may be different. It might depend on your industry, your culture and things of this nature. In any case, the main point is to identify the information, record it and put it somewhere, preferably into a centralized system that makes use of taxonomy (see blog posts on taxonomy).

At some stage, this may require you to hire a technical writer or some other documentation specialist who can interview subject matter experts in your organization to get the detailed information that will serve as your organization’s knowledge base. This person can identify processes, components, procedures, policies, records, archived data, intellectual property, financial data, secure data and a broad range of other information that must live in an environment where the appropriate members or users can access it later.

In terms of legal matters or legal information, many regulations have information handling requirements that are rigid and may require that you have certain pieces of critical information readily available to audit. Knowledge management and content management are more important in this scenario than ever. Information audits can be done to sift through organizations’ data including email, memos and other documents to make sense and make use of them.

This process of making an audit of information and knowledge is an important first step, but the next step is just as important. You must now organize the information that you have so that it can be easily found by the right people.

Centralizing your converted knowledge and content

Documents stored as files in a simple network drive will no longer suffice as the volume and complexity increases. It is also a security problem. In the cloud environment, there are backups and options to monitor and distribute storage and speed. This makes converting knowledge into content easier when a content management system is deployed to quickly and efficiently handle all of that incoming information.

The type of content management system your organization will or should deploy depends very much on how the information will be used. It might turn out that you don’t use just one CMS. You might end up using multiple CMS options or configurations for different types of content or information. Of course sensitive information and information meant for the public should be handled differently and therefore should be managed differently.

Popular newsfeed or blog platforms include Drupal and Joomla. Oracle handles various IT and other types of content systems. There are systems like SharePoint that help users collaborate on word processing, spreadsheets, charts, presentations and other kinds of documents. There are hubs where users can go to find or share information with other colleagues within the company or organization.

Of course it is always important to take some time to think about security and access privileges and develop information handling policies and procedures within the company.

Make content searchable and organized

After the information has been properly disseminated and you have found the right vendor to store and manage that content, you can begin the critical process of organizing it. Usually, within a content management system, we talk about taxonomy. Taxonomy is the process of categorizing or “tagging” content so that it is searchable and displayed properly in results or views for the user. Tagging content within a content management system is an integral part of enabling the user, whoever that might be, to locate individual pieces or entire batches of information quickly and efficiently.
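To show why tagging makes content searchable, here is a minimal Python sketch of a tag index. The document titles and tags are made up, and a real CMS stores and queries its taxonomy in its own way; this only illustrates the idea of looking content up by the tags attached to it.

```python
from collections import defaultdict


def build_tag_index(documents):
    """Map each lower-cased tag to the set of titles that carry it."""
    index = defaultdict(set)
    for title, tags in documents.items():
        for tag in tags:
            index[tag.lower()].add(title)
    return index


def search(index, *tags):
    """Return the titles that carry every one of the given tags."""
    sets = [index.get(tag.lower(), set()) for tag in tags]
    return sorted(set.intersection(*sets)) if sets else []


# Made-up sample content for illustration only
documents = {
    "Vacation policy": ["HR", "policy"],
    "Code of conduct": ["HR", "policy", "legal"],
    "Spring brochure": ["marketing"],
}

index = build_tag_index(documents)
print(search(index, "hr", "legal"))  # ['Code of conduct']
print(search(index, "policy"))       # ['Code of conduct', 'Vacation policy']
```

Normalizing tags to lower case is one small design choice that keeps "HR" and "hr" from becoming two separate categories.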

You must ultimately decide how to organize content in terms of how it will be used. Web content for the public, for example, may need to be audited for its SEO quality (how easily can it be found by search engines like Google, Bing, Yahoo, AOL, etc.). Your internal search systems, glossaries, thesauruses, style guides, policies, manuals and so forth will only be as good as their databases and program functionality as defined by the organization. Try to audit these systems for usability and continually try to improve the way information, both simple and complex, should be handled by internal systems.

Thursday, April 2, 2015

Collaboration Tools

Project management, idea generation and content organization are becoming important elements of general business practice in today’s information economy. There are several tools to get the job done. Here we look at several different options, each with its own feature set and configuration options for small and midsize businesses (SMBs) and large enterprises.

Project Management Meets Information Handling

Businesses need tools that will help them in their never ending quest for relevant information. Whether it is a small business with a small network, a midsize company with growing sets of data or an enterprise with both growing data and archived data, collaborative tools, apps or other software can help organizations meet the demands of handling complex or large volumes of information sufficiently. Project managers need tools that will help them enable their team or teams to work efficiently. An efficient workflow for that team might mean that they will need tools to communicate, share files, work from different locations in real time or analyze data in some way.

The cloud and big data have evolved together, and they will continue to do so. A carefully constructed set of security policies and procedures, combined with cloud apps and other non-cloud software within your network (perhaps a hybrid configuration), will allow your organization to be flexible and powerful enough to grow revenues, because an efficient workflow saves time, money and resources.

You may eliminate the need for paper, for one. But the real benefit for your organization is an improved ability to handle information. Finding content more easily within a content management system (CMS) certainly helps, but what matters most is finding the most relevant material and being able to put that content to use immediately.

When members of your organization are enabled to perform these tasks quicker, easier and more collaboratively, that is when the real work takes place. In fact, you will also notice more innovation, success and overall team involvement once that essential workflow is improved within your CMS and within your information governance policy (another topic discussed within this blog and website). Project managers will be happy. Those working on the project will be happy. All other stakeholders will be happy with the results you can now bring to the table.

The Right Collaboration Tools and Solutions for the Job

As previously mentioned, there are many tools to get the job done, but the focus here is on the best in collaboration and workflow management solutions that will help to move your business in the right direction – toward profits!

So, let’s take a brief look at what is currently available on the market to collaborate on projects. We will start with the paid big enterprise systems and work our way down the shelf to the options for small and midsize businesses (SMBs). Some of the least expensive and free options on the market may be sufficient but may lack the functionality, reliability, and support that paid project collaboration solutions provide.

Huddle

It is well worth watching the two minute video on the Huddle website. The Huddle collaborative system for project management, workflow and communication in real time is an industry leader for enterprise apps of this kind.

Huddle is a cloud-based app that allows users to work on various types of documents, track progress and delegate in a modular, modern interface that can update in real time so multiple users can work on a project at once. Huddle offers a secure space to distribute workloads.

SharePoint

SharePoint is by far the most widely used tool for collaborating on complex enterprise projects. SharePoint is a Microsoft product, so it has good support, can be used on a wide variety of machines, devices and networks, and also works in the cloud. It can be extended with Yammer, a social network that gives your organization a hub where users can connect and share ideas and plans with one another. It also integrates with OneDrive, Microsoft’s flagship cloud storage. There are also other apps to extend SharePoint for more configuration, functionality, and customization.

With SharePoint, users can administer projects and easily integrate content from their Microsoft Office apps. Like Huddle, SharePoint users have a secure environment to work and collaborate. The maintenance is automated for minimal downtime.

SAP

SAP seems to be a bit different. Still an enterprise solution provider of collaborative apps for business, SAP tailors its solutions for specific industries such as aerospace, defense, banking, consumer products, engineering, construction, technology, energy and others. Its software comes pre-configured based on the company’s professional evaluation of various industry needs. SAP also provides solutions for SMBs. Some of its solutions include analytics, cloud, big data, customer relationship management (CRM), security and others. Its business suite helps organizations deal with their content, share with others and create custom workflows for their operation. Pricing seems to be tied to the specific solutions provided by SAP.

Smartsheet

Their motto is “coordinate anything” and that is what you should be able to do with great project management and collaboration tools for your business. They offer their product to companies of any size but the enterprise will find many solutions tailored to their needs. They use an easy to understand dashboard with plenty of toolbars and customizable viewing and working options. It also integrates with other common platforms, making it a perfect fit for most corporations.

Google Drive and other cloud platforms

There are a number of different options for cloud storage including Google Drive, but that is not all these options offer. You can also connect a number of free (usually limited but upgradeable) apps designed to collaborate on projects. If you are using databases, spreadsheets, documents or other commonly available software, you could also design your own project management systems for use in the cloud. Make sure you use plenty of security precautions and set permissions. You should have a secure and private data policy for this. Google also offers Apps for Business and integrates its tools for use with companies.

By using any of these tools, you will help your company become more efficient. Take your time to look at each one of these software options with your IT department to ensure compatibility, proper cost to benefit ratio and other factors. Using a consultant to help you to navigate this new road of information security, inter-operability, and shared access is a logical step to ensure policies, procedures and programs are implemented properly.

Friday, March 20, 2015

Records Management Necessary for Organizations Small and Large

Records Management (RM) or Electronic Records Management (ERM) has become a more important system and solution for companies of all sizes, non-profits and other organizations. In the world of big data and the cloud, many small and midsize businesses are now facing the same kind of struggle that, at one time, only large enterprises faced: trying to wrangle an entangled web of records.

Record-keeping can be a nightmare, especially if it involves a lot of paper documentation. Document scanners have been introduced within the market to make the process of digitizing those documents easier, but a robust system of keeping digital records is still necessary to provide security, efficiency and consistency throughout the organization.

Business Objectives and Records Management

Records management objectives are related to achieving something the organization has set out to do or to save money and time. It could be both. Effective and efficient service is an integral part of this formula for RM objectives most of the time. Organizations should be working to make processes better for their customers or clients, but also for their internal members or staff. 

Avoiding heavy costs is always a goal for any organization, and cost avoidance is necessary to making a profit or staying in business. Social responsibility is one of the most important of objectives related to RM and includes moral, ethical and legal responsibility to maintain secure, confidential, and accessible records. Hospitals, government agencies, and other organizations must have good operating records management in order to serve the public interest.

Programs used for management of records should manage the information in a highly organized and easy to understand fashion so it can be timely, accurate, affordable, complete, usable and accessible. Business will run much smoother this way and less time and energy will be expended on searching for an important record or related documents.

The Growth of Data and Records

The growth in data these days is remarkable and often described as exponential. Because the amount of information and records will keep growing at all organizations, particularly successful ones, it will be necessary to manage the records effectively as well as efficiently.

Controlling the amount of paperwork is one thing, but managing the ability to generate more is another. This also includes digital forms. Effective ERM attempts to control creation of documents that may not add value to information or are duplicate. Retaining the records is important, however, so whatever program or method used must not be faulty or ineffective, otherwise there could be trouble with lost records and so forth.
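One simple way to spot exact duplicates among digital documents is to compare content hashes. The Python sketch below assumes the documents are available as text in memory and uses made-up names and contents; it only catches byte-for-byte identical copies, not near-duplicates.

```python
import hashlib
from collections import defaultdict


def find_duplicates(documents):
    """Group document names that share the same SHA-256 content hash."""
    by_hash = defaultdict(list)
    for name, content in documents.items():
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        by_hash[digest].append(name)
    # Keep only the groups that contain more than one document
    return [sorted(names) for names in by_hash.values() if len(names) > 1]


# Made-up sample documents for illustration only
documents = {
    "invoice_001.txt": "Invoice 001: 3 widgets, total $30",
    "invoice_001_copy.txt": "Invoice 001: 3 widgets, total $30",
    "memo.txt": "Staff meeting moved to Friday",
}

print(find_duplicates(documents))
```

Whether a duplicate should be deleted is still a records policy decision; the script only identifies candidates.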

The Preservation Effort is Continuous

Keeping records, like doing your taxes, is an ongoing process and one that can be continuously improved over time, either through internal processes or new program implementation (software).

Safeguarding vital data is important for nearly any organization, public or private. Comprehensive programs for protecting these important records from danger or disaster are essential because every organization is vulnerable to losses like this. Functioning as part of the records management program, vital records programs preserve the integrity and confidentiality of the most sensitive data. They also keep these information assets safe according to a record protection plan.

Preserving the corporate memory or organization’s history is also important. An organization's data contains its institutional record, an asset important to the integrity of the institution but often overlooked by its various members. Each business day, records are created that could become templates or catalysts for future decisions or important plans. The records document the activities of your organization and may provide insight for future teams. They may also lead to other innovations.

Compliance and Legal Safeguards

Much of the records retention efforts made by companies or organizations have to do with compliance or avoiding lawsuits. Legal teams are often heavily involved in making recommendations on records management.

The United States is among the most heavily regulated countries in terms of record keeping requirements across various industries in the private sector and agencies in the public sector. To ensure compliance, organizations need to follow a well-defined legal and organizational framework. Compliance laws can create major issues since they can be quite difficult to implement without a proper program or procedure. All members of the organization need to be kept in the loop, too. Failing to comply with these requirements and regulations could result in significant costs for the organization such as fines and penalties.

Organizations also need to minimize risks of litigation. Implementing an ERM program to deal specifically with records can reduce much of the liability associated with record keeping and document disposal or destruction. Routine and regular disposal at intervals in the cycle of doing business is vital to ensuring legal protection. Policies should be drafted to ensure these demands are met as well.
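Routine disposal can be driven by a retention schedule. The Python sketch below is only an illustration: the record types, the retention periods and the sample records are invented, and real retention periods must come from your legal team and the regulations that apply to you.

```python
from datetime import date

# Hypothetical retention periods in years (NOT legal guidance)
RETENTION_YEARS = {"invoice": 7, "email": 2}


def add_years(d, years):
    """Return the same calendar day `years` later, handling Feb 29."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:  # Feb 29 in a non-leap target year
        return d.replace(year=d.year + years, day=28)


def due_for_disposal(records, today):
    """Return names of records whose retention period has expired."""
    due = []
    for name, record_type, created in records:
        years = RETENTION_YEARS.get(record_type)
        if years is None:
            continue  # unknown types are left for manual review
        if add_years(created, years) <= today:
            due.append(name)
    return due


# Made-up sample records: (name, type, creation date)
records = [
    ("INV-001", "invoice", date(2007, 3, 1)),
    ("INV-002", "invoice", date(2010, 6, 1)),
    ("MSG-017", "email", date(2013, 1, 10)),
    ("CONTRACT-9", "contract", date(2001, 5, 5)),
]

print(due_for_disposal(records, today=date(2015, 3, 20)))
```

Records of an unknown type are deliberately skipped rather than disposed of, so that nothing is destroyed without a rule that covers it.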

Better Business Performance Through Record Keeping

Streamlined business efforts are always better, and they lead organizations of all sizes and types to success when the proper programs, policies and procedures are implemented. The same is true for documentation and record keeping.

Good ERM will help reduce operating costs, even though that may seem counterintuitive given the upfront investment. In the long run, records management is likely to save time and money. Searching for lost records, with the staff labor and other costs involved, can quickly damage the ability to run the operation efficiently. Organizations can save a lot of time and money by both investing in ERM and setting forth information policies that make sense. Staff time will also be spent more productively and staff stress will be reduced by such efforts. An effective and usable index and filing system can make all the difference in the world to productivity.

Assimilating new RM technologies with existing processes can also streamline the organization’s information handling needs. The current technology should be audited to analyze its usefulness before implementing any new automated systems. Also, you may need to take a look at the manual way of doing it before automating to ensure reliability and proper functionality of the system. The manual procedure might even need improvement, so look there first. 

Wednesday, March 4, 2015

What is Usability and How Does it Relate to Business?

Your business is relying more heavily on technology than it ever has, and it is likely to continue in that direction. But in order for your technology to work for your business and make it successful, there must be at least some degree of usefulness to your technology.

That may sound like a no-brainer at first, considering that it is the entire reason your business has adopted technology. However, a surprising number of enterprises fail to adhere to usability and user design principles. Usability, as the name suggests, is the study of how useful a document or system is to the user. How easy is the website to navigate? Is there enough white space? Is information structured logically? Are elements easy to find? These are just some of the questions a usability test might attempt to answer.

Schedule a Usability Test

If you want your organization to succeed, and you want to improve the quality of your information, consider scheduling a usability test with participants who will offer insight into your publications or information systems. The results of the test should help you determine the areas of your information and design that need to be improved. It can help establish what works and what doesn’t work. Web usability testing, that is, usability testing of a website, is the most common type of usability test. Any type of document or information system can be put to the test, however, internal or external. A consultant who is familiar with usability testing can test for these issues.

The Era of Responsive Design

User design has become a major focus over the years in websites, applications and other information systems. The idea is to make information as easy and intuitive to understand and access as possible. Two terms have developed in this field: user interface (UI) and user experience (UX). UI refers to the visual and interactive elements through which a user operates an application or site, and UX refers to the overall experience of using it, including how easily the user can locate information and appreciate the visual quality of the content. Usability testing can help determine if your UI/UX should be improved and how.

Eye Tracking

Eye tracking, believe it or not, has been employed as a way of studying usability for decades. Google uses eye tracking to study how users behave on a webpage to determine the best way to serve ads to users online. Eye tracking works by recording where a user looks on a page. The results are often displayed as a heatmap that shows user behavior visually, such as which elements on a page their eyes spent the most time on.

This data is important for organizations to understand where their users are going on their websites or other documents and how they are interacting with them. Content in the right-hand column of many pages, for example, is often unnoticed or less noticed than content on the left or at the very top of the page. Users spend very little time attempting to locate data online. On mobile devices, the time frame is even shorter. You must catch the user’s attention immediately.
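The heatmap idea described above can be sketched in a few lines of Python. This is only an illustration, not how any particular eye-tracking product works: the function names, the grid size and the sample fixation coordinates are all hypothetical. It divides a page into a grid and counts how many recorded gaze points fall in each cell.

```python
def gaze_heatmap(fixations, page_width, page_height, cols=2, rows=2):
    """Count gaze fixations per grid cell (hypothetical helper).

    fixations is a list of (x, y) pixel positions; points outside
    the page are ignored.
    """
    grid = [[0] * cols for _ in range(rows)]
    for x, y in fixations:
        if not (0 <= x < page_width and 0 <= y < page_height):
            continue  # skip gaze points that fell off the page
        col = int(x * cols / page_width)
        row = int(y * rows / page_height)
        grid[row][col] += 1
    return grid


def hottest_cell(grid):
    """Return (row, col) of the cell with the most fixations."""
    best_row, best_col = 0, 0
    for r, row in enumerate(grid):
        for c, count in enumerate(row):
            if count > grid[best_row][best_col]:
                best_row, best_col = r, c
    return best_row, best_col


# Made-up sample data: most fixations land in the top-left of the page.
sample_fixations = [(100, 50), (120, 80), (300, 100), (700, 90), (200, 600), (1200, 50)]
heatmap = gaze_heatmap(sample_fixations, page_width=1000, page_height=800)
print(heatmap)
print(hottest_cell(heatmap))
```

A real study would use a much finer grid and weight each point by how long the eye rested there, but the principle of counting attention by page region is the same.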

Information Context and Logic

Information needs context and proper logic if anyone is expected to understand it. Usability testing can help make sure that information is not only relevant and useful, but that its context and logic follow guidelines on how content should be created and implemented.

The information itself has to be of good quality to start with. Bad information tends to be ignored, especially in today’s culture of fast-paced computer-based interactions. The metrics for usability should not stop at the screen. In other words, it’s not all just about design. Design must be handled appropriately, but information quality, context and logic are just as important because they are the treasure inside the packaging that the user expects to acquire.

Information must follow logical order in terms of navigation, themes, key ideas, narration, order of operations, etc. These elements are extremely important to users who have expectations of how information should be presented to them. Their assumptions should guide your inspiration for method and mode of delivery.

Maintain Reputation and Trust

A part of usability that is rarely discussed is trust and reputation. These are important factors as digital technology becomes involved in nearly all facets of life today. Google, the largest search engine and search enterprise, now rewards websites with better search rank if they are served over HTTPS, which requires a Secure Sockets Layer (SSL) certificate. That shows you how important this has become. The Internet is threatened by criminals and others whose scams make legitimate business difficult.

Getting an SSL certificate is one way to maintain your security, reliability and reputation. It will also build some trust. Running your company site on a sub-domain of someone else’s domain, rather than on a primary domain of your own, is another reason users will not trust your business. Your business should have its own domain. Company information like addresses, phone numbers, staff names, company biographies and author biographies also helps reinforce trust.

Other Elements for Usability

Information architecture is important for usability. One important element is page outline and mechanics. Users like to see information laid out on the page in a clear, organized way, so they can scan for what they need and ignore what they don’t (immediately) need. Breaking content up into clear titles and subtitles, callout boxes, bulleted or numbered lists and other page design elements will improve the quality of your information. If it is hosted on a public website, it may also rank higher in Google and other search services.

Social media has also become part of the usability process. Is content easy to share online?

Also, examine the content’s links and other elements for consistency and clarity. Approach each element with a cautious editor’s eye to make sure that it works to the user’s advantage as much as possible.