Showing posts with label Taxonomy. Show all posts

Wednesday, July 31, 2019

Taxonomy Development, Management, and Governance

Taxonomies do not exist in isolation. They exist within the context of multiple business processes. Taxonomies can take many different forms and they serve a wide variety of purposes in different organizations. 

A customer-facing search and browse taxonomy that describes a product catalog is a typical application for an e-commerce company while a taxonomy could provide a detailed profile of a scientific domain for indexing research content for a company focused on research and development. Website navigation, customer and employee profiling, inventory management, records management, writing, publishing and content management and site search are other possible taxonomy applications.

Efficient taxonomy management is best facilitated by formally designating team members’ level of participation and responsibilities. Taxonomy management covers a broad range of activities and the most efficient use of team resources is achieved when responsibilities are clearly defined.

Taxonomy operations are typically performed by personnel with specialized training in library science or information management. The tasks of taxonomy governance are performed by taxonomy administrators. It is important to develop taxonomy change management procedures while the taxonomy is being developed.

A well-governed taxonomy requires a time commitment from stakeholders. Participation in governance team activities is one manifestation of this but of greater significance is the impact that policies and procedures developed by the governance team have on stakeholders and business processes.

The size and precise makeup of taxonomy governance teams vary greatly depending on the size and complexity of both the organization and the taxonomy implementation. At one end of the spectrum a governance team might consist of a few individuals. In contrast, in an enterprise environment taxonomy governance might be one part of a larger data or IT governance organization made up of multiple teams.

It is also worth emphasizing that size is only one factor to consider when devising governance policies and allocating governance resources. For example, regulatory requirements vary widely across industries. It is completely appropriate for a business operating in a highly-regulated industry to dedicate a relatively higher proportion of resources to governance activities.

Governance efforts are more likely to fail because of human factors than technological ones. This means that a realistic assessment of organizational context is an important first step when creating a taxonomy governance team and setting expectations for taxonomy efforts. 

For example, significant disruptions to existing workflows typically result in poor compliance with governance policies. Identifying these potential pitfalls in advance is best accomplished by soliciting input from users at all levels of an organization. This is just one reason why the governance team must include representatives from all stakeholder groups, not just from leadership and project management.

In broad terms representatives from management and business groups, information technology, taxonomy management and taxonomy users come together on the governance team to serve as advocates for their respective groups.

Because of the wide range of potential applications, taxonomy management can be the responsibility of an equally wide range of groups. Information technology groups, user experience and web design groups, libraries, and a range of marketing and business groups are all potential homes for taxonomy management. A taxonomy governance team needs executive sponsors and management representatives who can provide high-level guidance and steer taxonomy efforts in a productive direction for the business as a whole.

All members of the taxonomy governance team should contribute to the creation of a high-level strategy, but this is primarily a task for executive sponsors and business decision makers.

Following are some of the important questions to answer during taxonomy development. Taxonomy implementation will be very different depending on the answer to these questions:
  • Given that most large organizations have multiple applications that use taxonomies, will a single, multipurpose enterprise taxonomy be created and maintained or will multiple specialized taxonomies be used?
  • How will different taxonomy applications be prioritized? Given multiple taxonomy users, how will resources be allocated and how will taxonomy projects be funded?
  • Will there be a central taxonomy management group?
  • How will taxonomy goals be defined and what metrics will be used to measure success?
  • How will new and emerging technologies and trends be evaluated and potentially incorporated?  
A taxonomy deployment impacts many different groups within an organization, which means that conflicts over priorities and resource allocation are not unusual. Awareness of potential conflicts and a transparent decision-making process help to minimize the strife between stakeholders. Managing the relationships between stakeholders is the single most important task of leadership representatives on the governance team. Leadership representatives on the governance team should include both executive sponsors and business group personnel who can provide insight into business processes and business needs.

Technical support is crucial for successful taxonomy implementation and use. Strategic and business goals must be realistic given an organization’s technical capabilities and constraints. The primary role of taxonomy governance team representatives from technology implementation and support groups is to provide the expertise needed to ensure that business goals align with technical reality.

Taxonomy implementations range from a small number of terms applied through a web publishing platform and managed in a spreadsheet to highly specialized taxonomies consisting of thousands of terms and relationships that are managed with dedicated software and support dozens of consuming systems.

Obviously, the specific details have a significant effect on technical requirements. Many taxonomy management systems provide tools for workflow and governance modeling and enforcement. Alternatively, if the taxonomy is maintained and applied from within a content management system, then the governance team should determine an appropriate level of control and develop mechanisms to implement it.

It is important not to underestimate the work needed to integrate taxonomy management with consuming systems. The reality is that most organizations have a mix of consuming systems. Development resources are required in all of these scenarios and input from technical stakeholders is needed when planning and prioritizing implementation and ongoing maintenance. At the beginning of a taxonomy implementation, technology questions should focus on defining technical solutions based on business objectives.

Some of the decisions technical stakeholders help to make include:
  • Adapting existing processes and technology versus building or buying new ones.
  • In-house development of taxonomy management tools versus purchase of third-party tools.
  • Integration requirements for taxonomy management with consuming systems.
As a taxonomy implementation matures, the technical emphasis shifts from implementation to ongoing maintenance and support, as is typical in the software life cycle.

Technology stakeholders are typically in-house staff, although it is not unusual for contractors to be part of the team, especially during tool development and implementation stages when the workload may be significantly higher.

Taxonomy management consists of the initial creation of taxonomies and related vocabularies and their maintenance over time. The responsibility of taxonomy management personnel is to execute policies created by the governance team, report to the governance team on taxonomy status and performance, and provide expert advice on taxonomy capabilities to inform decisions on future taxonomy development.

The tasks that are part of initial taxonomy development are quite different from those that are required during ongoing maintenance and administration. Those differences may require changes in emphasis on the part of the governance team, including team make-up and activities, depending on the stage of the taxonomy life cycle.

Taxonomy development should be driven by business requirements, working within organizational and technical constraints. Both requirements and constraints should be defined by the governance team, thus the taxonomy management representatives on the team must be sufficiently conversant in both business and technical issues to productively collaborate with team members from other disciplines. Next, execution of taxonomy development will require collaboration between taxonomists and subject matter experts to create vocabularies that represent relevant concepts using terminology that is accurate and meaningful to users.

Some of the questions that taxonomy management staff will answer for the governance team include:
  • What specific taxonomies are required to meet business needs?
  • Will these taxonomies need to be developed from scratch or can existing taxonomies be reused?
  • Are there vocabularies, organizing principles or other classification methods currently in use within the organization that can be harvested and reused?
  • Are there standard domain-specific taxonomies, thesauri, or ontologies that will satisfy the requirements, either as is or with modification?
  • Are implemented taxonomies meeting user and business needs?
  • What changes are needed to improve taxonomy performance?
Staff for both taxonomy development and administration can be either in-house or provided by a consultant. Staffing needs vary greatly between organizations and details of the taxonomy implementation should be considered carefully when staffing decisions are made. The initial development and implementation of specialized taxonomies can be a substantial amount of work and it is common to make use of consultants for this phase of the project.

However, the costs for long-term administration should not be underestimated. Costs rise when organizations do not anticipate staff and resources needed for taxonomy maintenance. More importantly, without maintenance, taxonomies will atrophy and the value they provide to the organization is greatly diminished. Taxonomy management representatives provide the governance team with accurate assessments of taxonomy status as well as short and long-term resource needs.

The list below describes the functional roles performed by a taxonomy governance team and lists the team members who are typically associated with a given role. The individuals fulfilling the roles will vary depending on the structure, management philosophy, and staffing model of the organization, so these descriptions should be considered as general guidelines rather than specific job titles. It is also not uncommon for an individual on the team to play more than one role.

Executive Sponsors - provide strategic guidance, advocacy and support for taxonomy projects within the organization.

Business Decision Makers - identify business objectives, resolve cost/benefit issues and oversee resource allocation for taxonomy projects.

Technology Implementation and Support - develop and support taxonomy management tools or manage integration of third-party tools with relevant systems and organizational IT infrastructure.

Taxonomy Management - responsible for high- and low-level execution of taxonomy strategy and day-to-day taxonomy administration. May be an in-house team, an outside consultant or a mix.

Taxonomy Consumers - systems, groups, and individuals that use taxonomy in their day-to-day business operations. Typical consumers include content management, content strategy, user experience and web design, writing and publishing, site search, SEM and SEO, and business intelligence.

Subject Matter Experts - provide expert advice on intellectual domains, business processes, and other subject areas described by organizational taxonomies. Subject matter experts may or may not also be taxonomy consumers.

There is no universal taxonomy governance solution. Rather, effective governance achieves an important set of general goals while recognizing the unique features of an organization. Establishing a taxonomy governance team is very important.

Galaxy Consulting has 18 years experience in taxonomy development, management, and governance. Please call us today for a free consultation.

Friday, February 22, 2019

Taxonomy Governance

When organizations need a taxonomy, they often focus on taxonomy development and overlook the need for taxonomy governance. Taxonomy governance is part of information governance and should be taken seriously.

Taxonomies exist to support business processes and the associated organizational goals. A well-managed taxonomy provides the structure needed to manage content across multiple internal systems and gives users options and flexibility for how content is accessed and displayed. Taxonomy governance plans ensure that the taxonomies are maintained in a way that satisfies current and future needs and provides the maximum return on investment.

Taxonomy governance consists of the policies, procedures and documentation required for management and use of taxonomies within an organization. Successful taxonomy governance establishes long-term ownership and responsibility for taxonomies, responds to feedback from taxonomy users, and assures the sustainable evolution of taxonomies in response to changes in user and business needs.

Taxonomies are never “finished.” Rather, they are living systems that grow and evolve with the business. Taxonomy governance ensures that growth happens in a managed, predictable way.

Taxonomy governance answers the following questions:
  • Who are the taxonomy stakeholders?
  • What are their respective responsibilities?
  • Who is responsible for making changes?
  • What is the process for making changes?
  • How are prospective changes evaluated and prioritized?
  • When are changes made?
  • When are processes reviewed and updated?
The goals of taxonomy governance are similar across organizations but it is important to remember that there is no universal taxonomy governance solution. Successful taxonomy governance works within the context of the organization.

Many of the principles and goals of taxonomy governance are shared with information governance.

A good first step when developing taxonomy governance policies is to examine related information governance policies that already exist within an organization. Re-purposing familiar policies and systems makes both adoption and compliance easier for taxonomy users.

The best governance policies take advantage of existing structure, workflows and management processes while accounting for human and technical resources and constraints. Governance policies provide a strategic framework to guide day-to-day taxonomy management.

The main components of this framework are the taxonomy management organization and the operations they perform. Governance has a role at both strategic and operational levels by defining roles and responsibilities of taxonomy organization members, articulating communication, decision-making and escalation policies and providing protocols for taxonomy maintenance operations. Above all, governance provides accountability for decision-making and operations on both a large and small scale.

Taxonomy Management

Ongoing maintenance and development of a taxonomy is best achieved by a formal organization with well-defined and clearly documented roles, responsibilities, and processes. The Taxonomy Management team should be responsible for both strategic direction and routine administration of taxonomy operations. This team should include high-level decision-makers as well as trained taxonomists and IT if needed. End users of the taxonomy should also be represented in the Taxonomy Management team.

The role of a taxonomy governance team is to ensure that taxonomy management occurs in a systematic, measurable, and reproducible way. It provides a mechanism for managing the needs and concerns of all taxonomy stakeholders and helps maximize the value of taxonomy resources by establishing organization-wide policies for taxonomy development, maintenance and use.

The Taxonomy Management Team manages taxonomy administration and development. As with governance policies in general, the specific makeup and divisions between teams as well as the terminology used to describe them will vary depending on the particulars of organizational structure, history and goals.

Taxonomy governance focuses on strategic goals and company-wide policies for taxonomy management and use as well as levels of responsibility for different taxonomy stakeholders. These goals and policies are developed by the Taxonomy Governance Team.

Identifying and documenting organization-wide taxonomy use cases is a very important task of taxonomy governance. Taxonomies can potentially be used in multiple business areas. Content strategy, web design and user experience, marketing, customer support, site search and business intelligence are a few examples. Developing tangible, specific use cases helps communicate the taxonomy’s value throughout the organization and is necessary when prioritizing taxonomy-related investments.

Governance policies should also be developed that define taxonomy success, performance and quality. Metrics should validate the quality of a taxonomy implementation through quantifiable, direct measurement of taxonomy performance. Regular assessment ensures that the taxonomy meets business and user needs over the long term.

The ability to share data across systems, improved quality of search results, improved user experience of websites and regulatory compliance resulting from effective record keeping and document management are all examples of benefits that can result from effective taxonomy implementation and management. A goal of governance should be to identify and document benefits of this type that are relevant to the specific organization.

Taxonomy Operations and Maintenance

Ongoing maintenance is a very important aspect of a taxonomy project. Taxonomies must be continually updated to reflect changes in content, competition, and business goals. In the absence of maintenance, taxonomies atrophy and the value they provide will be greatly diminished.

Organizations must anticipate the resources needed to maintain the taxonomy and develop effective management processes to realize the maximum value from their taxonomy investment. At this level governance is primarily focused on operational details. It provides the framework for taxonomy operations in the form of guidelines, processes, documentation and a defined organizational structure.

The specific tasks performed as part of taxonomy maintenance consist of a wide range of large and small-scale changes to the taxonomy. Taxonomy staff are also typically responsible for providing training, preparing documentation materials, interacting with IT groups to ensure smooth operation of taxonomy systems and providing expert advice and feedback to business leaders to inform strategic decision-making.

The Taxonomy Change Process

One of the most important purposes of taxonomy governance is to define the organizational taxonomy change process. Governance policies define and document specific taxonomy changes and provide guidance to taxonomy administrators on making those changes.

It is especially important to provide guidance on decision-making authority and escalation processes. Defining and documenting different change types allows rational decisions to be made as to which changes can be routinely handled at the discretion of taxonomy administrators and which changes require higher-level consensus and approval. The first step in defining a taxonomy change process is to categorize taxonomy changes by impact and scale.

An important consideration in categorizing the impact of changes to the taxonomy is that taxonomy data is often used by multiple internal tools and systems. Content management, marketing, web analytics and SEO, product inventory and web publishing systems are just a few potential consumers of an enterprise taxonomy.

Experience shows that the level of engagement with the taxonomy team varies widely between users. To avoid unpleasant surprises, taxonomy administrators should be proactive in tracking users and systems where taxonomies are used. Understanding and documenting both the technical details of how taxonomy data flows to these systems and the specific business use case of various users is an important part of the taxonomy change process and should be addressed in both change processes and communication plans.

Small-scale changes will affect only a single term or small number of terms and will have a minimal impact on users and systems where they are used. Typical small-scale changes are spelling corrections or the addition of individual terms to existing vocabularies.

Taxonomy management staff are usually empowered to make this type of change as part of routine taxonomy administration. In contrast, large-scale changes will impact large numbers of taxonomy consumers or multiple consuming systems, or will require a significant commitment of taxonomy management resources for an extended period of time. They require high-level approval with input from the entire information governance team.
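
The distinction between small-scale and large-scale changes can be illustrated with a short Python sketch. This is only one possible way to encode such a policy: the thresholds, the ChangeRequest fields and the function names are all hypothetical, and a governance team would define its own criteria.

```python
from dataclasses import dataclass

# Hypothetical thresholds; a governance team would set its own values.
MAX_TERMS_FOR_SMALL_CHANGE = 5
MAX_SYSTEMS_FOR_SMALL_CHANGE = 1


@dataclass
class ChangeRequest:
    description: str
    terms_affected: int
    consuming_systems_affected: int


def classify_change(request: ChangeRequest) -> str:
    """Return 'small' or 'large' based on the illustrative thresholds above."""
    if (request.terms_affected <= MAX_TERMS_FOR_SMALL_CHANGE
            and request.consuming_systems_affected <= MAX_SYSTEMS_FOR_SMALL_CHANGE):
        return "small"
    return "large"


def approver_for(request: ChangeRequest) -> str:
    """Route small changes to administrators and large ones to the governance team."""
    if classify_change(request) == "small":
        return "taxonomy administrator"
    return "governance team"


spelling_fix = ChangeRequest("Correct the spelling of one term", 1, 1)
new_hierarchy = ChangeRequest("Add a hierarchy for a new product line", 200, 4)

print(approver_for(spelling_fix))
print(approver_for(new_hierarchy))
```

In practice the criteria would probably also consider how long the change ties up taxonomy management resources, which this sketch leaves out.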

Change Request Process

Typical sources of taxonomy change requests are user feedback, routine maintenance by taxonomy administrators, and new business needs.

User feedback is usually the largest and most important source of small-scale taxonomy change requests. A channel is needed for users to provide feedback and for taxonomy administrators to communicate with users. Interacting with taxonomy users and serving as a general point of contact for taxonomy issues is one of the most important aspects of routine taxonomy maintenance for taxonomy administrators.

Email aliases, bug/issue tracking software, dedicated portals, message boards, and other tools used in a help desk or customer support setting are all potentially useful mechanisms for taxonomy administrators to interact with users. Governance policies should address these needs with a well-defined communications plan.

It is also common for predictable events to have an impact on the taxonomy. Marketing campaigns, product updates, new products, company reorganizations and mergers are a few examples of events that could lead to taxonomy changes. Changes of this type can be significant in terms of scale but they can usually be handled as a routine part of taxonomy maintenance. These events should be identified and relevant change and communication policies developed.

In contrast to small-scale changes, large-scale changes tend to be infrequent and are typically driven by strategic business needs. Major expansions in scope requiring the creation of large numbers of new terms and implementation of significant new systems or technologies are examples of large-scale taxonomy changes that may be needed.

The difficulty and scale of taxonomy changes depend on the specific details of the taxonomy’s implementation. Management of the taxonomy with a dedicated taxonomy tool versus within a content management system, the capabilities of the tool being used, the number and complexity of taxonomy use cases and the number and characteristics of consuming systems are a few variables that will influence the change process.

Collecting statistics on change requests and taxonomy use should be part of a taxonomy administrator’s routine responsibilities. This data should be reported to the governance team and used to inform strategic decision-making. In the same way, decisions made at the strategic level will impact the prioritization and performance of day-to-day tasks.
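
As a rough illustration of the kind of statistics an administrator might report, the following Python sketch tallies change requests by source and by scale. The log format and its entries are invented for the example; a real log would come from whatever tracking tool the organization uses.

```python
from collections import Counter

# Hypothetical change-request log; each entry is (source, scale).
change_log = [
    ("user feedback", "small"),
    ("user feedback", "small"),
    ("routine maintenance", "small"),
    ("new business need", "large"),
]


def summarize(log):
    """Count change requests by source and by scale."""
    by_source = Counter(source for source, _ in log)
    by_scale = Counter(scale for _, scale in log)
    return by_source, by_scale


by_source, by_scale = summarize(change_log)

# Print sources from most to least frequent for a simple report.
for source, count in by_source.most_common():
    print(f"{source}: {count}")
```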

Maximizing ROI on Taxonomy Investments

Quality control mechanisms are an important function of governance, especially for businesses that operate in highly regulated environments, but they are not the only, or the most important, purpose of governance.

The high-level goal of taxonomy governance is to maximize the return on taxonomy investments. The taxonomy governance team establishes strategic goals for the taxonomy and develops organization-wide policies for taxonomy management and use designed to meet those goals.

Goals, policies and procedures should not only be designed to mitigate risks but also to improve organizational performance and capabilities. An enterprise taxonomy is used by many different individuals, groups, and systems and can impact multiple business processes. All of these stakeholders should have insight into taxonomy management processes and a mechanism to provide feedback. Because of the breadth of business processes using the taxonomy it is also important that the governance team include high-level representation to provide strategic guidance and advocacy for taxonomy operations. In return, the governance team must communicate the positive benefits to stakeholders so that policies are more than just vague background noise.

One of the most important tasks of a governance team is to communicate these policies and procedures in a positive way. Governance is often perceived as an enforcement mechanism and it’s natural for stakeholders to react defensively if they believe that policies are in place because they’re not trusted to produce high-quality work. Processes, standard operating procedures, responsibility matrices and so on are then viewed as active obstructions to productive work.

Galaxy Consulting has 20 years experience in taxonomy development and taxonomy governance. Please contact us for a free consultation.

Thursday, May 21, 2015

Importance of Taxonomy to Drupal

Drupal is a quite powerful content management system (CMS) that is similar to competitors like WordPress and Joomla. It is typically installed on a web server, unlike WYSIWYG (What You See Is What You Get) local programs like Adobe Dreamweaver (now part of Creative Cloud) and Microsoft FrontPage.

Drupal is an open source platform, meaning that publicly contributed extensions have been offered to extend functionality of the CMS. Part of the Drupal Core, taxonomy is integral to what web developers and programmers can do with the software. Taxonomy is a system of categorization, and Drupal can use taxonomy for a number of different purposes within its framework by using various techniques and tools available for the platform. Here, we will examine the basics of taxonomy in Drupal (what it means, how it’s used, etc.) and the various types of tasks that can be accomplished by taking advantage of taxonomy within the software.

What does taxonomy refer to in Drupal, specifically?

In Drupal, taxonomy is the core module that is used to determine how to categorize or classify content on the website being built with the CMS. It is also a critical element to the website’s information architecture, on both the back and front ends.

Taxonomies in Drupal are organized into vocabularies, which help the CMS determine which items belong with which types of content. Vocabularies, in turn, consist of terms. The list of terms defines the contents of the vocabulary. Terms can be part of a hierarchy or simply a compilation of tags. Tags group nodes (elements in Drupal sites that contain content, e.g. articles and basic pages) together. These can then be referenced with search on the website.
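
The relationship between vocabularies, terms and nodes can be pictured with a small data model. The Python sketch below is a conceptual illustration only; it is not Drupal's API, and the class names, fields and sample data are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    name: str
    parent: Optional["Term"] = None  # None for a top-level term or a flat tag

    def path(self) -> List[str]:
        """Return the term names from the top of the hierarchy down to this term."""
        names = []
        term: Optional["Term"] = self
        while term is not None:
            names.append(term.name)
            term = term.parent
        return list(reversed(names))


@dataclass
class Vocabulary:
    name: str
    terms: List[Term] = field(default_factory=list)


@dataclass
class Node:
    title: str
    terms: List[Term] = field(default_factory=list)


# A hypothetical "Sections" vocabulary with a two-level hierarchy.
arts = Term("Arts")
music = Term("Music", parent=arts)
sections = Vocabulary("Sections", terms=[arts, music])

# A node (for example, an article) tagged with one term.
article = Node("Local band releases album", terms=[music])
print(music.path())
```

A flat list of tags is simply the case where no term has a parent.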

Sites built in Drupal can have an unlimited number of vocabularies, so complex sites can be built using the framework. The number of terms is unlimited as well. The vocabularies and terms associated with your website can serve a number of purposes, particularly for displaying content and managing content assets. They can also be important for reference.

Displaying content and manipulating taxonomies

Drupal users are able to quickly and easily modify how content is displayed based on how taxonomical data is manipulated with modules, such as the Views module. The Views module manipulates how nodes are displayed within a block, panel or page. At the most basic level, Views can enable developers to display a list of articles that appear only on certain pages and that are tagged with certain keyword phrases that make up the taxonomy of the site.

For example, on Slanted Magazine Southern Minnesota Arts & Culture’s website, the navigation bar at the top of the site includes several categories of basic pages that are the site’s publishing sections (News, Tech, Arts, Entertainment, Music, etc.). When a section tab is clicked the link brings you to that basic page where a list of articles with teaser text appears. Those article collection displays were built using the Views module that applied filters to display content only tagged with certain phrases such as “tech” or “Music”.
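
The underlying idea of filtering content by tag can be sketched in a few lines of Python. This is not how the Views module is configured or implemented; the article data and the articles_tagged function are hypothetical and only illustrate the filtering concept.

```python
# Hypothetical articles, each with a set of taxonomy tags.
articles = [
    {"title": "New phone review", "tags": {"tech"}},
    {"title": "Gallery opening", "tags": {"arts"}},
    {"title": "Streaming gadgets for musicians", "tags": {"tech", "music"}},
]


def articles_tagged(items, tag):
    """Return titles of items carrying the tag, ignoring letter case."""
    wanted = tag.casefold()
    return [
        item["title"]
        for item in items
        if wanted in {t.casefold() for t in item["tags"]}
    ]


print(articles_tagged(articles, "Tech"))
print(articles_tagged(articles, "Music"))
```

The case-insensitive comparison is a design choice for this sketch, so that “tech” and “Tech” select the same articles.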

Taxonomy and permissions or visibility

Taxonomy and metadata can also drive the site content visibility and permissions settings, as needed for diverse business needs. The goals of the organization will determine how best to use these settings and taxonomy can play a vital role in how information within the organization is shared (public, confidential, semi-confidential, etc.) with various parties.

There may be nodes or specific content that only certain members within the organization should be allowed to edit. By using the permissions in the administration page within Drupal, developers are able to precisely assign permissions and roles for registered users of the site. This allows powerful flexibility because developers can assign roles and permissions based on the taxonomy data that has been put together in the Drupal site.

Also, there may be a need for the developer to modify content that the public is able to view. Using the core taxonomy module in conjunction with permissions is a great way to achieve this goal as well. Again, it will be determined by the specific goals of the organization, so important decisions about the usability and navigation of the site will need to be worked out (or at least should be) far in advance of building out these elements of the site. A great outline and wireframes can go a long way when developing a top-notch website using the Drupal CMS framework.

Improving search through taxonomy

Search will no doubt be improved through the use of taxonomy within the CMS. Content that is tagged or classified using vocabularies and terms within the framework can be indexed by the Drupal Search module. Additionally, the taxonomy will make your site more marketable because commercial search engines like Google and Bing will be able to more effectively crawl the website and make determinations about the site’s content, architecture, design and organization of the website files.

Using taxonomy as part of the Drupal system is a key element to designing a great website on the platform and making the information work smarter for organizations. That is ultimately the purpose of any type of taxonomy. The system and its modules are quite easy to learn to use as well and multiple ways of handling the data are possible. Also, since the software is open source, there is a great opportunity to learn from a community of developers and users. There is also a wide variety of extensions available to enhance features of the CMS and its output.

Thursday, January 29, 2015

Taxonomy in the Age of Agile and Shadow IT

Marketing Meets IT and Merges

Information presented to customers must be capable of meeting near-instant information demands in a multitude of perspectives. This includes all end users, internal and external. Organizations will need to continue to be versatile and clever in their approach to data management.

Digital marketers are becoming quite clever in dealing with data, using it to persuade their (or their clients’) customers or potential customers. Taking principles from usability, marketers now use terms such as “user design”, “user experience” (UX) and “user interface” (UI) and develop specialty roles to turn data into the most pleasant user experience possible. The food, beverage and hospitality industries aren’t the only ones to experience this issue. Every industry is.

Build to Further Understand

Taxonomy and information architecture are not just about designing a great way to organize content for the end user. Taxonomy can also be used to understand data. Information, its structure and agility are keys to modern design techniques where such vast volumes of data exist.

Your organization can benefit from developing a taxonomy for its information. Designing your information flow isn’t always as easy as it sounds. Consulting with a trusted professional who can analyze various aspects of your business is often needed to alleviate the stresses of a complex business taxonomy. A specialist can take your data and help you make sense of it. Whether you are a hospital that needs to define protocols for accessing patient data or a retail website seeking comprehensive analysis of information about web traffic, an internal audit of information systems can help to get you on the right track toward efficiency.

Start by assessing your digital (and even non-digital) tools to determine problem areas within the organization such as incomplete records or inconsistent rules or terms. The way in which systems communicate with customers, employees or other stakeholders is important to consider as well. Check that these systems can perform essential functions properly and that proper access and other rules are clearly defined. A master data management solution will help with this. Many Fortune 1000 companies are going this route to deal with their organization’s information.

Agile Systems Will Assist in Achieving Maximum Comprehension

The right information asset management tools make all the difference. Having software for terms and concepts will give users within the organization the right context for use. Thesaurus management, ontology software, metadata or cataloging software, auto-categorization, search, and other tools used in concert will keep the system stable.

For example, an agile system for taxonomy that includes auto-categorization and search tools (including text mining) would contain pre-installed editable and non-editable taxonomies, be able to auto-generate editable taxonomies, and support the import of both editable and non-editable taxonomies. To serve any number of end-user types, these capabilities must work in several different combinations.

The same principle applies to other software categories like content management or thesaurus software. Search functions, for example, are more useful to the end user when they contain spell-checking functions or multiple display options. Remember, these principles apply to both internal and external users, and information architecture mastered on the inside will translate better to the outside.

Taxonomy and information architecture should be the foundation of an enterprise view of the customer. So how must an organization view its own vocabulary? Ensure that your master data management is interpreting terms consistently and includes context for those terms. Without context, people are often left searching for clues instead of getting the information they came for.

Maintaining Culture While Establishing Order

The above phrase sounds more like a political statement than one about taxonomy. However, when you think about it from a business perspective, it makes some sense. When you are implementing your agile system for taxonomy and information architecture, you don't want to disrupt the critical flow of business or the information that is required to actually do business.

What you do want is to be able to open the pipeline of information further to increase productivity and enable efficient processes within the organization. That statement sums up the general need to implement an agile system for handling information.

Part of the solution to this is governance and compliance controls. By introducing hard controls for governance and compliance, you are forming a backbone with controls for how systems are using and integrating data. Your taxonomy, metadata and other information may connect business processes or even use content to complete or help complete a variety of tasks.

The exact structure of an organization varies from enterprise to enterprise, and so does its culture. These differences can be reconciled with reason. The ability to harness information collectively, selectively and to varying degrees is what will make the major difference in opening that pipeline up while keeping controls in place to establish order.

Taxonomy Soup: Collaboration, Integration and Access

Here is an analogy to social science: just as the language of a region or culture may vary in dialect, so too does the language of business. The language of business can be quite diverse. Between industries, organizations, departments, fields of study or practice and a wide variety of other factors, confusion exists. The real benefit of a system with agility is the ability to communicate more efficiently. Collaboration, integration, and access are the key ingredients in making the perfect taxonomy soup.

The ability to sort terms is very important and will become even more important in the big data era. A great system will be one that can differentiate taxonomy with due diligence. Collaborating will become more efficient this way.

By integrating data that does not fit into the dialect of terms, the organization will be able to make better use of its information assets, whatever they may be. This includes getting all of the information into the right places and ultimately into the right hands in the correct way. Policies and procedures are important examples of such data.

Analyze Your Needs Carefully

Take a look around your organization. Take notes on every detail you can to make an informed decision about what to implement and where. Consulting with a professional is the next step. The aforementioned details of creating an agile environment for taxonomy and information architecture within an enterprise of nearly any size are helpful in beginning to form a strategy for handling your enterprise information. Consulting a professional will help alleviate the overwhelming and burdensome task of data complexity.

Monday, January 12, 2015

Making Information Easier to Find Becomes Ever More Important

Taxonomy is becoming so much more important in the digital age that entire enterprises may one day develop out of the need just to classify information. The many ways we have traditionally classified content have grown exponentially in the digital age, to a size unimagined and still growing.

The Library of Congress and other libraries, large and small, have gone to using digital tools to classify and re-classify information about books, documents, texts and even multimedia content. The Internet was, of course, developed to help more easily share and organize documents and other content across a computer network. Now, here we are with the cloud and big data.

Where to begin?

The proportion of data-to-enterprise, or even data-to-individual, can become difficult or even unmanageable without the right tools or experience to guide you. As humans and consumers, we tend to expect that our options will be categorized into specific types based on the larger type. For instance, if we buy a computer, the choices are usually as follows: brand, device type, operating system, etc. 

Then we get into Apple vs Dell, Desktop vs Tablet or BlackBerry vs Android. The more immediate platforms that come to mind are Google or Bing search engines, hashtags or networks on social media like Twitter and Facebook. These are the most recent consumer examples of classifying information in a multitude of ways using software. Enterprises of all industries, however, are becoming more dependent on systems to help them manage information to scale.

The time it takes members of an organization to find important or relevant information is productivity lost. It also adds to personnel frustration, even at management level.

Time to Give Industries Options for Information Management

Companies are responding to the needs of industry. Taxonomy, metadata, ontology, data virtualization and data governance are some of the key areas of need for many organizations dealing with vast amounts of data coming from customers, partners, legal or other channels.

TopQuadrant, which recently released a web-based taxonomy solution, is an example of how these enterprise needs are so far being addressed, according to KMWorld.

TopBraid, the software referenced, is able to help end users reference data with more easily accessible visual models of the data, laid out in a clean way. 

Greater emphasis on visual representation of data is becoming an IT industry-wide way to tackle some of the problems associated with extrapolating and explaining complex data sets. Asian countries have had a great deal of success, in fact, in using visual models to teach mathematics and help students transition into new topics more easily, something Americans have had some difficulty with in many educational settings.

TopQuadrant is just one recent example. There are tools and software being developed in the market to deal with this exponentially growing challenge.

Taxonomy Time for Taxonomists

So what does a taxonomist do that can help arrange and set a standard for all of this enterprise information that we are dealing with in ever increasing amounts? Well, for one, taxonomists are tasked not only with categorization of terms, but also with governance and definition of those terms.

They often use commercial software that is dedicated to this work, such as a dedicated thesaurus or taxonomy management application. Some of these can be developed internally as well, for the right organization, as long as they fit its particular needs.

Sometimes taxonomy management tools are part of another suite or software, in which taxonomy is a feature. In the case of Drupal, a content management system used to build and maintain websites, taxonomy is used to define or classify content, which can then be configured to display nodes, pages, etc. to the end user. Sometimes, other software can be used, such as spreadsheets or other types of software tools.

Lastly, open source software for taxonomy and ontology is becoming available for use as well.

The benefits of having a system or person that maintains taxonomy within the enterprise are several. One benefit is that information is organized, as I have already alluded to. Another is that this information can also be made easier to find for customers, personnel, vendors, supply partners, etc., which I have also discussed. One benefit that we have not discussed is standardization.

This refers to terminology and jargon within your particular company. Every company can create a manual of terms, glossary, thesaurus, etc. But a taxonomist or someone working with taxonomy software can refine this process and create a standard that efficiently works across the company, so everyone is in compliance. It is kind of like having a style guide, but only for key terms of the business. 

Compliance is another key benefit of all of this. Regulations need to be followed and adhered to. Information has other legal and regulatory impacts as well, and taxonomy, ontology and information management are a few ways to stay ahead of the mess. Information audits can be a great way to find holes in your system and develop ways to patch those holes for greater governance and compliance. All of this can save us time and money on our business operations in one way or another.

Techniques for Creating a Great Information Structure

Taxonomy and information management start with a few basic techniques to help guide end users to the information they are trying to navigate to. The less time it takes to navigate, the better.

Not only are terms important, but so are their relationships. Sometimes information can be found using one term or another, depending on the scenario. For example, if you are looking for blue cars that are fast, you could search cars by color or by speed.

Standards should also weed out content that is irrelevant or invalid. Other types of information related to terms can be used in conjunction with taxonomy. There should be clear hierarchies of information within the enterprise as well. 

All of this data should be able to be used with other tools like content management, indexing, search and others. It should always support ANSI/NISO Z39.19 or ISO 2788 thesaurus standards. Different classes of information may apply to sets or subsets even. 

Make sure that any software you use will generate reports for you on analytics (of terms) and so forth. This is very important.

There are a variety of ontology and thesaurus options available. They are available in a multitude of platform formats. Here are a few: MultiTes Pro (Microsoft Windows), Cognatrix (Apple Mac OS X), One-2-One (Windows) and TheW32 (multi-platform). There don't appear to be many options for the mobile platforms yet (iOS, Android, BlackBerry and Windows Phone). 

The information management problem in the world of big data seems pervasive, but there is a growing trend toward developing new ways of dealing with it. Now is the time to start looking at creating a plan to develop a system for dealing with taxonomy, ontology and information management to help your organization's users access data more quickly, efficiently and sensibly.

The more content builds up, the more the organization needs to change, adapt and, most importantly, handle the big data involved.

Thursday, May 15, 2014

Content Categorization Role in Content Management

The ability to find content in a content management system is crucial. One of the main goals of having a content management system is to make content easy to find, so you can take an action, make a business decision, do research and development work, etc.

The main challenge to findability is anticipating how users might look for information. That's where categorization comes into play. The quality of the categorization of each piece of content makes or breaks its findability. Theoretically, good tagging will last the lifetime of the content. You would think that if you do it well initially, then you can forget about it until it is time to retire that content. But reality can be very different.

Durable Categorization

Many issues complicate content categorization. They include:
  • the sheer volume, velocity, and variety of internal and external-facing content which needs management;
  • evolving/emerging regulations and compliance issues, some of which need to be retroactively applied; 
  • the need to limit the company's exposure and to support the strength of its position in any legal activity.

Some organizations face the added challenge of integrating content from acquisitions or mergers, which most likely use content management structure, categorization, and methodologies that are incompatible and of inconsistent quality.

Considering these issues, the success factors for good content categorization are automatic categorization techniques and processes.

Traditionally, keywords, dictionaries, and thesauri are used to categorize content. This type of categorization model poses several problems:
  • taxonomy quality - it depends on the initial vision and attention to detail, and whether it has been kept current;
  • term creep - initial categorization will not always accommodate where and how the content will be used over time, or predict relevancy beyond its original focus;
  • policy evolution - it can't easily apply new or evolving policies, regulations, compliance requirements, etc.;
  • cost and complexity - it is difficult and costly, if not practically impossible, to retroactively expand the original categorization of the existing content if a large amount of content is added.

Automatic Categorization

Using technology to automatically categorize content is a solution. It applies the rules more consistently than people do. It does it faster. It frees people from having to do the task, and therefore costs less. And it can actively or retroactively categorize batches or whole collections of documents.

You can experience these benefits by using concept-based categorization driven by an analytics engine integrated into the content management system. These systems mathematically analyze example documents you provide to calculate concepts that can be used to categorize other documents. Identifying hundreds of keywords per term, they are able to distinguish relevance that escapes keyword and other traditional taxonomy approaches. They are even highly likely to make connections that a person would miss.

Consider 3D printers as an example. These are also known as "materials printers", "fabbers", "3D fabbers", and as "additive manufacturing". If all of those terms are not in the taxonomy, then relevant documents that use one or more of them, but not 3D printer, would not be optimally categorized.

People looking for information about 3D printers who are not aware of the alternative terms would miss related documents of potential significance. This particularly impacts external-facing websites that sell products. Their business depends on fast and easy delivery of accurate and complete information to their customers, even when the customer doesn't know all of the various terms used to describe the product they are looking for.

In contrast, through example-based mathematical analysis and comparison along multiple keywords, conceptual analytics systems understand that these documents are all related. They would be automatically categorized and tagged as relevant to 3D printing.

Another difference is that taxonomy systems require someone to enter the newly developed or discovered terms. In conceptual analytics, it is simply a matter of providing additional example documents that automatically add to the system's conceptual understanding.
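The keyword side of this comparison can be made concrete with a small sketch. It is not how any particular product works; the documents, the `tag_documents` helper and the two taxonomies are hypothetical, and matching is plain substring search. It shows only that a keyword taxonomy tags the documents whose wording someone has already entered.

```python
def tag_documents(docs, taxonomy):
    """Tag each document with every preferred term whose listed variants appear in it."""
    tags = {}
    for doc_id, text in docs.items():
        lowered = text.lower()
        tags[doc_id] = sorted(
            preferred
            for preferred, variants in taxonomy.items()
            if any(variant in lowered for variant in variants)
        )
    return tags

# Hypothetical documents that describe the same product with different words.
docs = {
    "doc1": "Our new 3D printer ships next month.",
    "doc2": "Fabbers are changing how prototypes get made.",
    "doc3": "Additive manufacturing reduces tooling costs.",
}

# A taxonomy that knows only one wording misses two of the three documents.
narrow = {"3D printing": ["3d printer"]}
# Entering the alternative terms by hand closes the gap.
full = {"3D printing": ["3d printer", "materials printer", "fabber", "additive manufacturing"]}

narrow_tags = tag_documents(docs, narrow)
full_tags = tag_documents(docs, full)
```

With the narrow taxonomy only `doc1` is tagged; with the full one all three are. The maintenance burden is that someone has to keep adding variants as new wording appears.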

The days of keeping everything "just in case" are long gone. From cost and risk exposure concerns, organizations need to keep only what is necessary, particularly as the volume and variety of content continue to grow. Good categorization and tagging systems are essential to good content management and to controlling expense and exposure.

Outdated and draft documents unnecessarily expand every company's content repositories. Multiple copies of the same or very similar content are scattered throughout the organization. By some estimates, these compose 20% or more of a company's content.

Efficiently weeding out that content means 20% less active and backup storage, bandwidth, cloud storage for offsite disaster recovery, and archive volume. Effective and thorough tagging can identify such elements to reduce these costs, and simultaneously reduce the company's cost and exposure related to legal or regulatory requirements.
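As a rough illustration of how redundant copies might be flagged, the sketch below groups documents whose text is identical after lowercasing and whitespace cleanup. The file names and contents are made up, and this catches only exact copies; the "very similar" documents mentioned above would need a near-duplicate technique, which is not shown here.

```python
import hashlib
from collections import defaultdict

def fingerprint(text):
    """Hash the text after lowercasing and collapsing runs of whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def find_duplicates(docs):
    """Group document ids that share a fingerprint; return groups with more than one member."""
    groups = defaultdict(list)
    for doc_id, text in docs.items():
        groups[fingerprint(text)].append(doc_id)
    return [ids for ids in groups.values() if len(ids) > 1]

# Hypothetical repository contents.
docs = {
    "policy_v1.txt": "Travel policy: book flights 14 days ahead.",
    "policy_copy.txt": "Travel  policy: book flights 14 days ahead.\n",
    "handbook.txt": "Employee handbook, 2014 edition.",
}
duplicates = find_duplicates(docs)
```

Here the two policy files land in one group, so one of them is a candidate for removal.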

The Value Beyond Cost Savings

Effectively managed content lowers the cost of content management and reduces exposure to risk. While this alone is reason to implement improvements in categorization, there are other reasons.

Superior categorization through conceptual analysis also affects operational efficiency by enabling fast, accurate, and complete content gathering. A significant benefit for any enterprise is that it allows more time for actual work by reducing the time it takes to find necessary information. It is of critical importance for companies whose revenue depends on their customers quickly and easily finding quality information.

Conceptual analytics systems deliver two other advantages over traditional taxonomy methods and manual categorization. They create a mathematical index, which is useless to anyone trying to discover private information or clues about the company. They are also deterministic and repeatable: they give the same result every time, which is very valuable in legal or regulatory activities.

Concept-based analysis makes content findable and actionable, regardless of language, by automatically categorizing it based on understanding developed from example documents you provide. Both internally and externally, the company becomes more competitive with one of its most important assets: unstructured information.

Thursday, September 20, 2012

Faceted Search

Faceted search, also called faceted navigation or faceted browsing, is a technique for accessing information organized according to a faceted classification system, allowing users to explore a collection of information by applying multiple filters. A faceted classification system classifies each information element along multiple explicit dimensions, enabling the classifications to be accessed and ordered in multiple ways rather than in a single, pre-determined, taxonomic order.

Facets correspond to properties of the information elements. They are often derived by analysis of the text of an item using entity extraction techniques or from pre-existing fields in a database such as author, descriptor, language, and format. Thus, existing web-pages, product descriptions or online collections of articles can be augmented with navigational facets.

Faceted search has become the de facto standard for e-commerce and product-related web sites. Other content-heavy sites also use faceted search. It has become very popular and users are getting used to it and even expect it.

Faceted search lets users refine or navigate a collection of information by using a number of discrete attributes – the so-called facets. A facet represents a specific perspective on content that is typically clearly bounded and mutually exclusive. The values within a facet can be a flat list that allows only one choice (e.g. a list of possible shoe sizes) or a hierarchical list that allows you to drill down through multiple levels (e.g. product types, Computers > Laptops). The combination of all facets and values is often called a faceted taxonomy. These faceted values can be added directly to content as metadata or extracted automatically using text mining software.

For example, a recipe site using faceted search can allow users to decide how they’d like to navigate to a specific recipe, offering multiple entry points and successive refinements.

As users combine facet values, the search engine is really launching a new search based on the selected values, which allows the users to see how many documents are left in the set corresponding to each remaining facet choice. So while users think they are navigating a site, they are really doing the dreaded advanced search.
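The idea that each facet selection is really a new search, with counts recomputed for the remaining choices, can be sketched in a few lines. The recipe records, the facet names and the `faceted_search` function are illustrative assumptions, not taken from any particular search engine.

```python
from collections import Counter

# Hypothetical recipe records; field names are illustrative only.
recipes = [
    {"title": "Lentil soup", "course": "main", "cuisine": "Indian", "diet": "vegan"},
    {"title": "Margherita pizza", "course": "main", "cuisine": "Italian", "diet": "vegetarian"},
    {"title": "Tiramisu", "course": "dessert", "cuisine": "Italian", "diet": "vegetarian"},
    {"title": "Chana masala", "course": "main", "cuisine": "Indian", "diet": "vegan"},
]
FACETS = ("course", "cuisine", "diet")

def faceted_search(items, selected):
    """Re-run the query for the selected facet values and count what remains per facet."""
    results = [i for i in items if all(i[f] == v for f, v in selected.items())]
    counts = {
        facet: Counter(i[facet] for i in results)
        for facet in FACETS
        if facet not in selected
    }
    return results, counts

results, counts = faceted_search(recipes, {"cuisine": "Italian"})
```

After selecting the Italian cuisine, two recipes remain, and the interface can show the user how many of them fall under each remaining course and diet value before the next click.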

There are best practices in establishing facets. They are:

do not create too many facets - presenting users with 20 different facets will overwhelm them; users will generally not scroll too far down beyond the initial screen to locate your more obscure facets;

base facets on key use cases and known user access patterns - identify key ways users search and navigate your site. Analyzing search logs, evaluating competitor sites, and user research and testing are great ways to figure out what key access points users are looking for. Interviewing as few as 10 users will often give you great insight into what the facet structure should be;

order facets and values based on importance - not all facets are equally important. Some access points are more important than others depending on what users are doing and where they are in the site. Present the most popular facets at the top. When determining order for navigation, again think about your users and why they are coming to your site;

leverage the tool to show and hide facets and values - while the free or low-cost faceted search tools don’t all offer these configuration options, more sophisticated faceted search solutions allow you to create rules to progressively disclose facets.

Think of a site offering online greeting cards. While the visual theme of the card – teddy bears, a sunset, golf – might eventually be important to a user, it probably isn’t the first place they will start their search. They will likely start with occasion (birthday, Christmas), or recipient (father, friend), and then become interested in themes further down the line. Accordingly, we might hide the “themes” facet until a user has selected an occasion or recipient. You can selectively present facets based on your understanding of your users and their typical search patterns (as mentioned in the previous “do”).

Also take advantage of the search engine’s clutter-reducing features, such as the “more...” link. This allows you to present only the most popular items and hide the rest until the user specifically requests to see them. You can also do this at the facet level, collapsing lesser-used facets to present just the category name and let users who are interested expand that facet.
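Both display techniques described here, progressive disclosure of facets and a "more..." link for long value lists, come down to small rules. The sketch below uses the greeting-card example; the rule table, the value counts and the function names are hypothetical and do not reflect the configuration format of any specific tool.

```python
# Each rule lists facets of which at least one must already have a selection
# before the facet is shown. An empty list means the facet is always visible.
DISPLAY_RULES = {
    "occasion": [],
    "recipient": [],
    "theme": ["occasion", "recipient"],  # shown once either one is selected
}

def visible_facets(selected):
    """Return the facets to display given the facets the user has already selected."""
    return [
        facet
        for facet, prerequisites in DISPLAY_RULES.items()
        if not prerequisites or any(p in selected for p in prerequisites)
    ]

def top_values(value_counts, limit=3):
    """Split facet values into those shown up front and those behind a "more..." link."""
    ranked = sorted(value_counts, key=lambda v: (-value_counts[v], v))
    return ranked[:limit], ranked[limit:]

first_screen = visible_facets(set())
after_choice = visible_facets({"occasion"})

# Hypothetical usage counts for the values of the "occasion" facet.
occasion_counts = {"birthday": 120, "Christmas": 95, "wedding": 40, "graduation": 12, "retirement": 5}
shown, hidden = top_values(occasion_counts)
```

On the first screen only occasion and recipient appear; theme is added after a selection. The three most-used occasions are listed, and the rest wait behind the "more..." link.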

facet display should be dependent on the area of the site. If you are in the first few layers of your site, you should show fewer facets with more values exposed, whereas if you are deeper into product information you should show more facets, some with values exposed and others hidden.

create your taxonomy with faceted search in mind - a good taxonomy goes a long way in making a successful faceted search interface.

There are some important guidelines to follow in taxonomy design. Facets need to be well defined, mutually exclusive and have clear labels. For example, having one facet called “Training” and another “Events” is confusing: where do you put a seminar? Is it training or an event? If you have to wonder, your users will too. The taxonomy depth (how many levels deep does it go) and breadth (how many facets wide is it) are other important considerations. Faceted search works better with a broad taxonomy that is relatively shallow, as this lets users combine more perspectives rather than get stuck in an eternal drill down, which causes fatigue. The facet configuration and display rules will help you create the optimal progressive presentation of these facets so as to not overwhelm users with the breadth.

If you are torn between two places in the taxonomy for a term, consider putting it in both places. This is called polyhierarchy, and it is a good way to ensure findability from multiple perspectives. Polyhierarchy is best served within a facet rather than across multiple facets. Since facets should be mutually exclusive, you shouldn’t have much need to repeat terms across facets, which can be more confusing than helpful.
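A polyhierarchy inside a single facet can be represented by letting a term keep more than one broader term. The following is a minimal sketch; the `Facet` class and the product terms are invented for illustration.

```python
from collections import defaultdict

class Facet:
    """A single facet whose terms may sit under more than one parent (polyhierarchy)."""

    def __init__(self, name):
        self.name = name
        self.parents = defaultdict(set)  # term -> set of broader terms

    def add(self, term, parent):
        """Place `term` under `parent`; calling this twice with different parents is allowed."""
        self.parents[term].add(parent)

    def paths(self, term):
        """Return every path from a top-level term down to `term`."""
        if not self.parents.get(term):
            return [[term]]
        found = []
        for parent in sorted(self.parents[term]):
            for path in self.paths(parent):
                found.append(path + [term])
        return found

product_type = Facet("Product type")
product_type.add("Laptops", "Computers")
product_type.add("Tablets", "Computers")
# The same term is placed in both branches so users can reach it either way.
product_type.add("2-in-1 convertibles", "Laptops")
product_type.add("2-in-1 convertibles", "Tablets")

routes = product_type.paths("2-in-1 convertibles")
```

The term is stored once but is reachable through both Computers > Laptops and Computers > Tablets, which is the findability benefit described above.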

The most important thing however, is to be prepared to break any of these rules in the name of usability. Building a faceted taxonomy involves understanding your users’ search behavior.

As the trend towards increased social computing continues, Web 2.0 concepts are entering the realm of faceted search. We are starting to see social tags being used in faceted search and browse interfaces. Buzzillions.com, a product-review site, is using social tag-based facets in its navigation, allowing users to refine results based on tags grouped as "Pros" or "Cons". This site uses a nice blend of free social tagging and control to ensure good user experience; when you type in a tag to add to a product review, type-ahead verifies existing tags and prompts you to select one from the existing list of matches to maximize consistency.

Ultimately, navigation and search are among the main interactions users have with your site, so getting them right is not just a matter of good design; it impacts the bottom line. Faceted search is a very popular and powerful solution when done well; it allows users to deconstruct a large set of results into bite-size pieces and navigate based on what’s important to them. But faceted search by itself is not necessarily going to make your users’ lives easier. You need to understand your users’ mental models (how they seek information), test your assumptions about how they will interpret your terms and categories, and spend time refining your approach.

Faceted search can just add more complexity and frustrate your users if not considered from the user perspective and carefully thought through with sound usability principles in mind. Faceted search is raising the bar in terms of findability and how well you execute will determine whether your site meets the new standard.

Wednesday, July 25, 2012

Automatic Classification

In my previous posts, I mentioned that the taxonomy is necessary to create navigation to content. If users know what they are looking for, they are going to search. If they don't know what they are looking for, they will look for ways to navigate to content, in other words, browse through content. Taxonomies can also be used as a method of filtering search results so that results are restricted to a selected node on the hierarchy.

Once documents have been classified, users can browse the document collection, using an expanding tree-view to represent the taxonomy structure.
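Restricting search results to a selected node of the hierarchy can be sketched as follows. The taxonomy, the search results and the helper names are hypothetical; the only assumption is that each document has been classified to one node.

```python
# Hypothetical taxonomy stored as child -> parent links; None marks a top-level node.
PARENT = {
    "Products": None,
    "Computers": "Products",
    "Laptops": "Computers",
    "Desktops": "Computers",
    "Phones": "Products",
}

def is_under(node, ancestor):
    """True if `node` is `ancestor` itself or sits anywhere below it in the hierarchy."""
    while node is not None:
        if node == ancestor:
            return True
        node = PARENT[node]
    return False

def filter_results(results, selected_node):
    """Keep only the search results classified at or below the selected node."""
    return [title for title, node in results if is_under(node, selected_node)]

# Each result is a (title, taxonomy node) pair.
search_results = [
    ("Laptop buying guide", "Laptops"),
    ("Phone battery tips", "Phones"),
    ("Desktop upgrade checklist", "Desktops"),
]
computers_only = filter_results(search_results, "Computers")
```

Selecting the Computers node keeps the laptop and desktop documents and drops the phone document, because Laptops and Desktops are both below Computers.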

When there are many documents involved, creating a taxonomy and classifying documents by hand can be time consuming. There are a few tools on the market that provide automatic classification. Another use of automatic classification is to automatically tag content with controlled metadata (also known as Automatic Metadata Tagging) to increase the quality of the search results.

The tools that provide automatic classification are: Autonomy, ClearForest, Documentum, Interwoven, Inxight, Moxomine, Open Text, Oracle, SmartLogic.

These tools can classify any type of text documents. Classification is either performed on a document repository or on a stream of incoming documents.

Here is how this software works. Example: "International Business Machines today announced that it would acquire Widget, Inc. A spokesperson for IBM said: 'Big Blue will move quickly to ensure a speedy transition.'"

The software classifies concepts rather than words. Words are first stemmed, that is, reduced to their root form. Next, stop words are eliminated. These include words such as a, an, in, the - words that add little semantic information. Then, words with similar meanings are equated using a thesaurus. For example, the words IBM, International Business Machines, and Big Blue are treated as equivalent.

Next, the software will use statistical or language processing techniques to identify noun phrases or concepts such as "red bicycle". Further, using the thesaurus, these phrases are reduced to distinct concepts that will be associated with the document. In this example, there are 3 instances of IBM, 2 instances of acquisition (acquire, speedy transition), and 1 instance of Widget, Inc.
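The steps above can be approximated in a short sketch. It is not how any of the listed products works: the stemmer is deliberately naive (it only strips a plural "s"), the stop-word list and thesaurus are hypothetical, and phrase detection is a simple longest-match lookup rather than statistical or linguistic analysis.

```python
import re
from collections import Counter

STOP_WORDS = {"a", "an", "in", "the", "for", "it", "that", "to", "will", "would"}

# Hypothetical thesaurus: each lowercased phrase maps to one concept.
THESAURUS = {
    ("international", "business", "machines"): "IBM",
    ("ibm",): "IBM",
    ("big", "blue"): "IBM",
    ("acquire",): "acquisition",
    ("speedy", "transition"): "acquisition",
    ("widget", "inc"): "Widget, Inc.",
}

def stem(word):
    """Naive stand-in for a real stemmer: strip a plural 's' from longer words."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

# Stem the thesaurus phrases the same way so lookups stay consistent.
STEMMED = {tuple(stem(w) for w in phrase): concept for phrase, concept in THESAURUS.items()}
MAX_LEN = max(len(phrase) for phrase in STEMMED)

def extract_concepts(text):
    """Stem, drop stop words, then map the longest matching phrases to concepts."""
    tokens = [stem(t) for t in re.findall(r"[a-z]+", text.lower())]
    tokens = [t for t in tokens if t not in STOP_WORDS]
    concepts = Counter()
    i = 0
    while i < len(tokens):
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            concept = STEMMED.get(tuple(tokens[i:i + n]))
            if concept:
                concepts[concept] += 1
                i += n
                break
        else:
            i += 1  # no phrase starts here; move to the next token
    return concepts

TEXT = ("International Business Machines today announced that it would acquire "
        "Widget, Inc. A spokesperson for IBM said: 'Big Blue will move quickly "
        "to ensure a speedy transition.'")
concepts = extract_concepts(TEXT)
```

Run on the example sentence, the sketch reproduces the counts given above: three instances of IBM, two of acquisition and one of Widget, Inc.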

Approaches to Classification

Manual - requires individuals to assign each document to one or more categories. It can achieve a high degree of accuracy. However, it is labor intensive and therefore are more costly than automatic classification in the long run.

Rule-based - keywords or Boolean expressions are used to categorize a document. This is typically used when a few words can adequately describe a category. For example, if a collection of medical papers is to be classified according to a disease together with its scientific, common, and alternative names can be used to define the keywords for each category.

Supervised Learning - most approaches to automatic classification require a human expert to initiate a learning process by manually classifying or assigning a number of "training documents" to each category. The classification system first analyzes the statistical occurrences of each concept in the example documents and then constructs a model or "classifier" for each category that is used to classify subsequent documents automatically. The system refines its model, in a sense "learning" the categories as documents are processed.
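To make the idea of training documents concrete, here is a minimal sketch that uses a multinomial Naive Bayes classifier with add-one smoothing. Naive Bayes is chosen here only because it is short and well known; the tools listed above may use quite different models. The training documents and category names are invented.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """labeled_docs: iterable of (list_of_terms, category) pairs."""
    term_counts = defaultdict(Counter)  # category -> term frequencies
    doc_counts = Counter()              # category -> number of training documents
    vocab = set()
    for terms, category in labeled_docs:
        term_counts[category].update(terms)
        doc_counts[category] += 1
        vocab.update(terms)
    return term_counts, doc_counts, vocab


def classify(terms, model):
    """Return the category with the highest log-probability for the terms."""
    term_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best, best_score = None, -math.inf
    for category in doc_counts:
        score = math.log(doc_counts[category] / total_docs)
        total_terms = sum(term_counts[category].values())
        for t in terms:
            if t in vocab:  # terms never seen in training are ignored
                score += math.log(
                    (term_counts[category][t] + 1) / (total_terms + len(vocab))
                )
        if score > best_score:
            best, best_score = category, score
    return best


# Invented training set: a human expert has assigned each document a category.
TRAINING = [
    (["merger", "acquire", "shares"], "finance"),
    (["acquire", "stock", "deal"], "finance"),
    (["patient", "dose", "trial"], "medicine"),
    (["trial", "drug", "patient"], "medicine"),
]

model = train(TRAINING)
print(classify(["acquire", "deal"], model))
```

In practice the input terms would be the concepts produced by the stemming and thesaurus steps rather than raw words, and the model would be retrained as more documents are reviewed.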

Unsupervised Learning - these systems identify both groups or clusters of related documents as well as the relationship between these clusters. Commonly referred to as clustering, this approach eliminates the need for training sets because it does not require a preexisting taxonomy or category structure. However, clustering algorithms are not always good at selecting categories that are intuitive to users. On the other hand, clustering will often expose useful relationships and themes implicit in the collection that might be missed by a manual process. For these reasons, clustering generally works hand-in-hand with supervised learning techniques.
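A very small sketch of clustering is shown below. It uses a single-pass ("leader") approach with Jaccard similarity on sets of terms, which is an assumption made to keep the example short; production tools generally use more sophisticated algorithms. The documents, terms, and threshold are invented.

```python
def jaccard(a, b):
    """Share of terms two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(docs, threshold=0.3):
    """docs: dict of doc_id -> set of terms. Returns a list of lists of doc ids."""
    clusters = []  # each cluster: {"terms": set, "members": [doc ids]}
    for doc_id, terms in docs.items():
        # Find the existing cluster most similar to this document, if any.
        best = max(clusters, key=lambda c: jaccard(terms, c["terms"]), default=None)
        if best is not None and jaccard(terms, best["terms"]) >= threshold:
            best["members"].append(doc_id)
            best["terms"] |= terms
        else:
            # Nothing similar enough: the document starts a new cluster.
            clusters.append({"terms": set(terms), "members": [doc_id]})
    return [c["members"] for c in clusters]


DOCS = {
    "d1": {"engine", "turbine", "thrust"},
    "d2": {"turbine", "thrust", "fuel"},
    "d3": {"protein", "gene", "enzyme"},
    "d4": {"gene", "enzyme", "cell"},
}

print(cluster(DOCS))
```

Note that the clusters come back without names. Someone still has to look at the grouped documents and decide what each group should be called.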

Each of these approaches is optimal for a different situation. As a result, classification vendors are moving to support multiple methods.

Most real-world implementations combine search, classification, and other techniques, such as identifying similar documents, to provide a complete information retrieval solution. Organizations that have document repositories will generally benefit from a customized taxonomy.

Once documents are clustered, an administrator can first rearrange, expand or collapse the auto-suggested clusters or categories, and then give them intuitive names. The documents in each cluster serve as initial training sets for supervised-learning algorithms that will be used subsequently to refine the categories. The end result is a taxonomy and a set of topic models that are fully customized for an organization's needs.

Building an extensive custom taxonomy can be a large expense. However, automated classification tools can reduce the taxonomy development and maintenance cost.

Organizations with document collections that span complex areas such as medicine, biotechnology, and aerospace will have a large taxonomy. However, there are ways to refine a taxonomy so that maintaining it does not become an overwhelming task.

Together, enterprise search and classification provide an initial response to information overload.

Tuesday, June 12, 2012

Taxonomy and CMS

Any information system should have two access points - search and browse. When users know exactly what they are looking for, they are going to use search. If you have enabled metadata search in your system, this search is going to be precise and will retrieve documents that users are looking for.

If users do not know what they are looking for, they are going to use browse to navigate to documents. At some point during their browsing they may switch to search and then back to browsing.

In order to enable browsing or navigation in your system, you must create a taxonomy and organize your documents according to this taxonomy.

But how do you apply the taxonomy that you created to your content management system (CMS)? Each CMS has a hierarchical structure. For example, SharePoint has the following structure: site collection --> site --> sub-site (optional) --> library --> folder; Vasont has collection group --> collection --> content type. So each CMS has a hierarchical structure which can be adapted to your taxonomy.

Let's look at a specific example. Your taxonomy may look something like this: department --> unit --> content type --> subject --> date.

If we take SharePoint as the CMS you use, then: department = site collection; unit = site; content type = library; date = folder (for some content types) or subject = folder (for other content types).

In other words, each taxonomy unit is the same as a unit in the hierarchical structure of your CMS.

So, our example within CMS would look like this:

Engineering Department Site Collection --> Electrical Engineering Site --> Drawings Library --> Building Electrical Wiring Folder

or

Engineering Department Site Collection --> Electrical Engineering Site --> White Papers Library --> 2012 Folder
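The mapping in this example can be written down as a small lookup table. The sketch below is only an illustration of the idea. It does not call any SharePoint API, and the function and variable names are made up.

```python
# Assumed mapping from taxonomy level to SharePoint container type,
# taken from the example in this post.
LEVEL_TO_CONTAINER = {
    "department": "Site Collection",
    "unit": "Site",
    "content type": "Library",
    "subject": "Folder",
    "date": "Folder",
}

# Order of the taxonomy levels, from broadest to narrowest.
LEVEL_ORDER = ["department", "unit", "content type", "subject", "date"]


def cms_path(classification):
    """Turn a document's taxonomy values into a CMS location string.

    Levels that are missing are skipped, so a content type can be
    filed by subject, by date, or by both.
    """
    parts = []
    for level in LEVEL_ORDER:
        value = classification.get(level)
        if value:
            parts.append(f"{value} {LEVEL_TO_CONTAINER[level]}")
    return " --> ".join(parts)


print(cms_path({
    "department": "Engineering Department",
    "unit": "Electrical Engineering",
    "content type": "White Papers",
    "date": "2012",
}))
```

For a different CMS, only the lookup table would change. The taxonomy itself stays the same.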

So, if somebody tells you that a CMS does not have the functionality to create a taxonomy, ask them what the hierarchical structure of this CMS is and adapt this structure to your taxonomy.

Monday, June 11, 2012

Taxonomy and Controlled Vocabulary


A taxonomy is an organizing principle. It is a foundation on which to base any kind of system. It does not matter what kind of project you are involved in; it will benefit from clearly defined, concise language and terminology. A taxonomy and controlled vocabulary help to fine-tune search tools, create a common language for sharing concepts, and allow efficient organization of documents and content across information sources.

Whether a structured tool such as a CRM system, or a less structured one, like a content management system that organizes information for web sites or intranets, all technologies that deal with information require a basis in taxonomy. This is even more important when various systems must interact.

Taxonomies use controlled vocabularies. Consider the issue of language: I call the person I do business with a Customer. Someone else calls them a Client. When we need to exchange, combine, or analyze data, which entity are we talking about? What do we call the document that outlines what we are providing: a statement of work, a proposal, an SOW, or something else? A controlled vocabulary helps to make terms consistent.

When employees search for information, do they use language that is unambiguous? Can this information be easily found and re-purposed? Are employees sure they are not recreating information that already exists?

These are important questions, but there are larger issues that can have an even greater impact on the organization. Are all of these challenges of business going to be magically solved with a taxonomy? Of course not, but if the underlying structure is not in place, then essential tools, technologies and processes will not function together. Connecting system A to system B makes little sense when a common language has not been established to have information make sense in the new context.

Business Problem

Consider what happens if each department does its job, but accounting people speak British English, IT speaks a Cajun dialect, legal speaks an inner city slang, and business people speak the language of scientific researchers. For all practical purposes, the languages they use in communicating with their professional peers are as different as these corners of the English language. In order for documents and pieces of content to be reusable and understandable in all of these different contexts and for these different audiences, we need to develop a Rosetta stone of the enterprise. That is an enterprise taxonomy and controlled vocabulary.

Some people think that this is an insurmountable task – getting people to agree on common terms and meanings. Language is too ambiguous and variable, and needs are too diverse, to develop a common denominator of communication for all circumstances. Instead, we create a structure for defining and applying terms and for managing change. The alternative is uncontrolled and chaotic. But too much control is impractical. Determining where to control and centralize and where to allow variability is part of the process of developing and implementing an enterprise taxonomy and controlled vocabulary.

Enterprise Search

There is a prevalent opinion that a Google-like search interface is the answer to the search problem. There are many reasons why this is not true. One is that in a company, many of the clues that Google uses to deliver results are missing. Google will use links between sites to determine how to rank results. If lots of other sites point to a document then that document is deemed to be more valuable. In the corporate intranet, there is no equivalent way of ranking results.

Another fundamental flaw with pure search solutions is that meaning, value, and applicability are context dependent. The usefulness of a piece of content is in the eye of the beholder. A document is useful to a person if this person can use it to solve a problem. This depends upon the person's role, task, and background.

A search engine cannot determine these factors and present results based on this person's needs. However, if you perform some process analysis in order to understand users' tasks and how they go about solving their problems, you can present information in anticipation of their needs. The role of a taxonomy and controlled vocabulary is to define the labels that correspond to user tasks, experience, needs, and context, and that help to refine their search or guide their navigation.

Leveraging Taxonomies

Part of the analysis phase in taxonomy development is to understand what users are trying to accomplish, and then present a set of documents that users should look at when they are performing these tasks. For example, a sales person may be preparing a proposal for a customer. If he/she searches in a large repository for documents, he/she will likely pull up a lot of documents that may contain the term "proposal", but they may not be example proposals that he/she can use.

On the other hand, if the business development function is defined as including proposal creation, this sales person can find sample proposals that will be useful. You can define a tag called "sample proposal" or some other label that everyone agrees will designate documents that can be useful for this purpose.

You may want to go further and define the specific industry, the product or service offering, the size of the deal and so on. By carefully defining labels for the documents you can search based on these labels or navigate to a place where these documents reside. The results will be precisely suited to your task at hand and will save you from creating a proposal from scratch or from endless searching for relevant documents.

So in the first case, a search using "proposal" retrieved perhaps hundreds of documents containing the term proposal. In the second case, the search returns a smaller subset of documents that more closely meet your criteria.
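The difference between the two cases can be shown with a toy repository. This is a sketch only: the document titles, the tag names ("doc_type", "industry"), and the two search functions are invented for illustration and do not correspond to any particular search product.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    text: str
    tags: dict = field(default_factory=dict)


def full_text_search(docs, term):
    """First case: every document whose text mentions the term."""
    return [d.title for d in docs if term.lower() in d.text.lower()]


def metadata_search(docs, **criteria):
    """Second case: only documents whose tags match all of the criteria."""
    return [
        d.title for d in docs
        if all(d.tags.get(key) == value for key, value in criteria.items())
    ]


REPOSITORY = [
    Document("Meeting notes", "We discussed the proposal deadline."),
    Document("Bank outsourcing bid", "Proposal for customer service outsourcing.",
             {"doc_type": "sample proposal", "industry": "banking"}),
    Document("Retail outsourcing bid", "Proposal for a retail help desk.",
             {"doc_type": "sample proposal", "industry": "retail"}),
]

print(full_text_search(REPOSITORY, "proposal"))
print(metadata_search(REPOSITORY, doc_type="sample proposal", industry="banking"))
```

The full text search returns all three documents, including the meeting notes. The metadata search returns only the one sample proposal for the banking industry.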

Imagine that in one repository you refer to proposals for customer service outsourcing as "service outsourcing" and in another repository, you refer to it as "business process outsourcing". If you search on one term, you really also want the documents with the other term. These terms are synonymous. You could make a note of terms that may be used interchangeably and apply a synonym ring to the search mechanism, enabling search on one term to return documents containing the other terms.
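A synonym ring can be as simple as a set of interchangeable terms that is consulted before the search runs. The sketch below assumes a plain substring match over document text. The ring contents come from the example above, while the function names and sample documents are made up.

```python
# Each ring is a set of terms that should be treated as interchangeable.
SYNONYM_RINGS = [
    {"service outsourcing", "business process outsourcing"},
]


def expand(term):
    """Return the term together with every synonym from its ring."""
    term = term.lower()
    for ring in SYNONYM_RINGS:
        if term in ring:
            return set(ring)
    return {term}


def search(term, documents):
    """documents: dict of doc_id -> text. Matches any variant of the term."""
    variants = expand(term)
    return [
        doc_id for doc_id, text in documents.items()
        if any(v in text.lower() for v in variants)
    ]


DOCUMENTS = {
    "repo1/proposal-a": "Proposal for service outsourcing of the help desk",
    "repo2/proposal-b": "Business process outsourcing proposal for billing",
    "repo2/memo": "Holiday schedule",
}

print(search("service outsourcing", DOCUMENTS))
```

Searching on either term returns both proposals, even though each document contains only one of the two phrases.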

Navigation

As we just observed, search is one area where taxonomies can be leveraged. What about navigation, also called browsing? Some people equate taxonomy with navigation. Taxonomy makes navigation possible. By understanding the underlying structure of information and how people access that information, you can propose a structure by which users can click through the content. Navigational structures directly reflect the taxonomy. For example, if you organize content according to departments or functional areas, with geographies comprising navigational nodes, this would be your taxonomy. In other cases, users may navigate according to a task or business process that could start out with a geography and then move to a task, such as customer service.
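One way to picture how navigation reflects the taxonomy is to build the browse tree directly from the labels on each document. In this sketch the level names, the sample documents, and the "_documents" key used to hold titles are all assumptions for illustration.

```python
def build_navigation(documents, levels):
    """Nest documents under the given taxonomy levels, in order.

    documents: list of dicts with a "title" and one value per level.
    levels: taxonomy levels from the top of the navigation downwards.
    """
    tree = {}
    for doc in documents:
        node = tree
        for level in levels:
            # Create the navigation node for this value if it does not exist yet.
            node = node.setdefault(doc[level], {})
        node.setdefault("_documents", []).append(doc["title"])
    return tree


DOCS = [
    {"title": "Returns policy", "department": "Customer Service", "geography": "Europe"},
    {"title": "Call scripts", "department": "Customer Service", "geography": "Europe"},
    {"title": "Price list", "department": "Sales", "geography": "Asia"},
]

# Department first, then geography.
print(build_navigation(DOCS, ["department", "geography"]))

# The same documents browsed by geography first.
print(build_navigation(DOCS, ["geography", "department"]))
```

Changing the order of the levels gives a different click path through the same content, which is the case of starting with a geography and then moving to a department or task.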

Taxonomy Development and Maintenance

Taxonomy and controlled vocabulary development and maintenance is an ongoing process, and a very important one. It is essential that we agree on terminology in order to integrate, collaborate, and communicate most effectively. Not addressing this issue will lead to more problems of information overload, difficulties in integrating systems and inefficiencies in the organization.

The short-term goal should be to educate your organization on these issues. The medium-term goal is to begin formalizing the sharing and application of consistent language across systems and processes. The long-term goal is to develop a mature process for ongoing maintenance and governance of enterprise taxonomies. It is important to start the process now, rather than wait for search, navigation, and access to information to become a big problem.