During the last ten years, the volume and diversity of digital content have grown at unprecedented rates. Organizations make increasing use of departmental network drives, collaboration tools, content management systems, messaging systems with file attachments, corporate blogs and wikis, and databases. Duplicate and untraceable documents crowd out the valuable information needed to get work done.
Unfortunately, not all content makes it into a managed content repository, like a portal or a content management system, and some companies have more than one content management system. A search solution that can search across all content repositories therefore becomes very important.
Expectations for quality search continue to rise. Many users put it this way: "we would like a search like Google". So, how do we formulate a search strategy?
Here are a few key points:
- Security within enterprise search strategies should be carefully designed. Information like employee pay rates, financial information, or confidential communications should not end up in general search results.
- Search results should deliver high quality, authoritative, up-to-date information. Obsolete information should not end up in the search results.
- Search results should be highly relevant to keywords entered in a search box.
- Users should be able to limit the scope of a search.
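The first two points can be illustrated with a minimal sketch. It assumes each search hit carries a set of groups allowed to see it and a last-reviewed date; the `Hit` fields, the `trim_results` function, and the one-year cutoff are all hypothetical names and values chosen for illustration, not features of any particular product.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Hit:
    title: str
    allowed_groups: set[str]   # groups permitted to see this document
    last_reviewed: date        # when the content was last confirmed current
    score: float               # relevance score from the search engine

def trim_results(hits, user_groups, today, max_age_days=365):
    """Drop hits the user may not see or that look obsolete, then rank by score."""
    visible = [
        h for h in hits
        if h.allowed_groups & user_groups
        and (today - h.last_reviewed).days <= max_age_days
    ]
    return sorted(visible, key=lambda h: h.score, reverse=True)

hits = [
    Hit("Pay rates 2024", {"hr"}, date(2024, 1, 10), 0.9),
    Hit("Travel policy", {"all-staff"}, date(2024, 3, 1), 0.7),
    Hit("Old travel policy", {"all-staff"}, date(2019, 5, 1), 0.8),
]
results = trim_results(hits, {"all-staff"}, today=date(2024, 6, 1))
print([h.title for h in results])
```

In practice, most search products apply this kind of security trimming inside the engine rather than in application code; the sketch only shows the rule being enforced.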
Steps to Develop an Enterprise Search Strategy
Step 1: Define Specific Objectives for Your Search Strategy
People don’t search for the sake of searching. They search because they are looking to find and use information to get their jobs done. Answer these questions:
1. Who is searching? Which roles within the organization are using the search function, and what requirements do they have?
For example, a corporate librarian is likely familiar with Boolean search and advanced search forms, while a casual searcher likely prefers a simple search box. A sales professional may need instant access to past proposals for an upcoming meeting, while compliance professionals conducting investigations often use deep search across massive message archiving and records management systems.
2. What categories of information are they looking for?
Define the big buckets of information that are the most relevant to different roles. Realize that not all roles need all information. Part of why desktop search tools are popular is that they inherently define a bucket called "stuff on my machine". Defining categories such as project information, employee information, sales tools, and news helps searchers formulate the right query for the right type of search.
3. What are they likely to do with the information when they find it? After defining broad information categories, work to understand context and answer the question: why are people searching?
For example, if a marketer is collecting information on a particular competitor by searching on the company’s name, it is often useful to expand that query to include related information, like other competitors in the industry, specific business units or product lines, pricing information, and past financial performance. Related information can be included in search experiences through a variety of methods, including the search results themselves or methods like faceted navigation.
It is impossible to account for every type of information that users may be looking for, but defining broad user roles, like sales professionals or market researchers, and identifying their most common search scenarios is a great way to create the scope of a search project. Use methods such as personas and use cases, interview users to validate assumptions about the processes they are involved in, and identify the information that is most useful to support those processes.
Step 2: Define the Desired Scope and Inventory Repositories
When using the search function built into a particular content management system, the product itself limits the scope of the search to whatever is stored in this system. Search engines such as Autonomy, Endeca Technologies, Google, Vivisimo, and others will search across multiple content management systems and databases. Increasingly, portal products and collaboration platforms from companies like IBM, Microsoft, Oracle, and Open Text will also let you search content that is stored inside and outside of their systems.
Use search to reach outside the confines of a single repository. Cross-repository search becomes essential when companies use different content repositories for different purposes.
Match roles and search categories to relevant content sources. Search requirements often include multiple repositories, such as document libraries, file systems, databases, etc. These repositories usually consist of multiple technology products, such as Lotus Notes, EMC Documentum, Microsoft SharePoint, and others. Using the roles and types of searches you are looking to support, identify all of the relevant repositories necessary to achieve your desired search scope.
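As a rough sketch of this matching exercise, the two dictionaries below map roles to search categories and categories to repositories. The specific mappings are hypothetical placeholders; the real ones would come out of the interviews and use cases from Step 1.

```python
# Hypothetical mappings; replace with the roles and categories from your own interviews.
ROLE_CATEGORIES = {
    "sales professional": ["sales tools", "project information"],
    "compliance professional": ["message archives", "records"],
}
CATEGORY_REPOSITORIES = {
    "sales tools": ["Microsoft SharePoint"],
    "project information": ["Microsoft SharePoint", "Lotus Notes"],
    "message archives": ["Lotus Notes"],
    "records": ["EMC Documentum"],
}

def repositories_for(roles):
    """Return the sorted set of repositories needed to serve the given roles."""
    needed = set()
    for role in roles:
        for category in ROLE_CATEGORIES.get(role, []):
            needed.update(CATEGORY_REPOSITORIES.get(category, []))
    return sorted(needed)

scope = repositories_for(["sales professional", "compliance professional"])
print(scope)
```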
Create an inventory of required repositories. When creating your inventory, document the name of each repository, a repository owner, a description of its content, an assessment of the quality of this content, and the quantity and rate of growth of content in each repository. Also document the technology product used as well as any specific security access policies in place.
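One possible way to keep such an inventory consistent is a simple record per repository. The `Repository` class and the sample entries below are illustrative assumptions; the fields follow the list above.

```python
from dataclasses import dataclass, field

@dataclass
class Repository:
    name: str
    owner: str
    description: str
    product: str                 # technology product behind the repository
    content_quality: str         # e.g. "high", "medium", "low"
    document_count: int
    annual_growth_rate: float    # 0.20 means 20% more documents per year
    security_policies: list[str] = field(default_factory=list)

# Sample entries are made up for illustration.
inventory = [
    Repository("Sales proposals", "Sales Operations", "Past proposals and pitch decks",
               "Microsoft SharePoint", "high", 40_000, 0.15, ["sales staff only"]),
    Repository("Engineering share", "IT", "Departmental network drive",
               "File system", "low", 900_000, 0.30),
]

# Largest repositories first, as a quick view of where index volume comes from.
by_size = sorted(inventory, key=lambda r: r.document_count, reverse=True)
for repo in by_size:
    print(f"{repo.name}: {repo.document_count:,} documents, owner {repo.owner}")
```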
Consider a phased rollout and select simple but telling data source repositories for kick-off. When rolling out a project such as a search strategy that involves disparate sources and complex UIs, a phased rollout may be preferable depending upon factors such as resource constraints and time-to-launch pressure. By approaching the project in phases, you can vet the process and workflow while familiarizing users with the objectives.
Inventory and prioritize the repositories at the start of your project so that you can identify and start with the repositories that will have a big impact. For example, basic queries into a CRM system can add a lot of value while remaining relatively straightforward. Throughout this process, it is important to set expectations with your users, since this approach may lengthen their involvement with the project.
Documenting your repositories lets software vendors effectively size and bid on your project. Most search software is priced based on the number of documents (or data items) in the index, plus additional fees for premium connectors that ingest content from repositories like enterprise content management systems.
For example, strategies that require a limited set of commodity connectors are priced altogether differently than those with premium connectors for content management systems and enterprise applications. Thus, knowing which repositories are relevant and understanding the rate of content growth within them can help avoid unnecessary overspending.
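To see how growth affects sizing, here is a small worked sketch. It assumes compound annual growth and a pricing model of a per-thousand-documents rate plus a flat fee per premium connector; the function names, rates, and fees are invented for illustration and do not reflect any vendor's actual price list.

```python
def projected_documents(current_count, annual_growth_rate, years):
    """Compound growth: count * (1 + rate) ** years, rounded to whole documents."""
    return round(current_count * (1 + annual_growth_rate) ** years)

def estimated_license_cost(document_count, price_per_1000_docs,
                           premium_connectors, connector_fee):
    """Per-document pricing plus a flat fee per premium connector (hypothetical model)."""
    return document_count / 1000 * price_per_1000_docs + premium_connectors * connector_fee

# One million documents growing 20% per year, sized three years out.
docs_in_3_years = projected_documents(1_000_000, 0.20, 3)
cost = estimated_license_cost(docs_in_3_years, price_per_1000_docs=50.0,
                              premium_connectors=2, connector_fee=10_000.0)
print(docs_in_3_years, cost)
```

Sizing for the projected count rather than today's count is what keeps the license from being outgrown shortly after purchase.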
Step 3: Evaluate and Select the Best Method for Enriching Content
When addressing content with very little descriptive text and metadata, evaluate several methods for enriching the content to improve the search experience. Methods range from manual application of metadata to automatic categorization. Some companies use a mix of both methods.
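A mixed approach might look like the following sketch, which combines manually applied tags with categories inferred from keyword rules. The rule table and the `suggest_categories` function are hypothetical; commercial categorization tools typically use far richer techniques than keyword matching.

```python
import re

# Hypothetical keyword rules: category -> terms that suggest it.
CATEGORY_RULES = {
    "sales tools": {"proposal", "pricing", "quote"},
    "employee information": {"benefits", "payroll", "vacation"},
    "project information": {"milestone", "deliverable", "schedule"},
}

def suggest_categories(text, manual_tags=()):
    """Combine manually applied tags with categories inferred from keywords."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    inferred = {cat for cat, terms in CATEGORY_RULES.items() if words & terms}
    return sorted(set(manual_tags) | inferred)

tags = suggest_categories("Q3 proposal with updated pricing and delivery schedule",
                          manual_tags=["customer: acme"])
print(tags)
```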
Step 4: Define Requirements and List Products and Vendors to Consider
After specifying a search scope, define requirements for users. The most important thing is not to get distracted by irrelevant features, but instead to focus on products that adequately meet the organization’s requirements over a specified time period. Consider factors like ease of implementation, product strategy, and market presence in any product evaluation.
Score and select vendors on criteria that are relevant for your needs. There are many vendors to choose from. Search vendors include Autonomy, Coveo Solutions, Endeca Technologies, Exalead, Google Enterprise, ISYS Search Software, Recommind, Thunderstone Software, Vivisimo, and others. Large software providers such as IBM, Microsoft, Oracle, and SAP also have one or more search products on the market.
Product capabilities range from highly sophisticated, large-scale, secure searches that mix advanced navigation and filtering, to basic keyword searches across file systems. Products also differ depending on whether the content being searched consists primarily of structured data or of documents. For example, high-end search companies like Endeca offer robust tools for searching structured data from databases, while small-scale basic file system search needs can be met with products like the Google Mini or the IBM OmniFind Yahoo! Edition.
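A weighted scoring matrix is one common way to do this. The sketch below uses the evaluation factors named above; the weights, the 1 to 5 scores, and the vendor labels are placeholders, not assessments of real products.

```python
# Hypothetical criteria weights (they sum to 1.0) and 1-5 scores per vendor.
WEIGHTS = {"meets requirements": 0.5, "ease of implementation": 0.2,
           "product strategy": 0.15, "market presence": 0.15}

SCORES = {
    "Vendor A": {"meets requirements": 4, "ease of implementation": 3,
                 "product strategy": 5, "market presence": 4},
    "Vendor B": {"meets requirements": 5, "ease of implementation": 2,
                 "product strategy": 3, "market presence": 3},
}

def weighted_score(scores):
    """Sum of weight * score over all criteria."""
    return sum(WEIGHTS[criterion] * value for criterion, value in scores.items())

ranking = sorted(SCORES, key=lambda v: weighted_score(SCORES[v]), reverse=True)
for vendor in ranking:
    print(f"{vendor}: {weighted_score(SCORES[vendor]):.2f}")
```

Agreeing on the weights before looking at any product helps keep the evaluation focused on requirements rather than on features.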
Step 5: Define a Taxonomy of Logical Types of Searches
While it is impossible to predict and account for everything people search for, it is possible to organize the search experience so it is intuitive to use. Start with defining logical types of searches. For example:
People Search. Searching for employees has gained acceptance as a valuable type of search within enterprises for finding expertise on a subject. A search for people, whether it is a simple name look-up or a more advanced expertise search, requires attention to everything from how the query gets processed to how results appear in the interface. For example, searchers typically want to see an alphabetical list of names in people search results as opposed to results ranked by relevance.
Product Search. A search for products frequently needs to include product brand names (e.g., Trek), concepts and terms related to the product (e.g., bike, bicycle, road race, touring), product description, and specific product attributes, like frame size, material, and color. Knowing where all of this information is stored and how it should be optimally presented to end users is essential.
Customer Search. It is now possible to search and return results for virtually any logical item in an enterprise, like orders, customers, products, and places. You should look into sources like enterprise data warehouses, ERP systems, order histories, and others to create a full picture of the items that are being searched.
Documents Search. Documents usually reside in a few repositories, so be sure to include them in your search sources. Users expect search results to be highly relevant, with the most relevant documents at the top of the results list.
By bucketing types of searches into logical categories, you can also improve the quality of those searches. Methods include applying type-specific thesauri, taxonomies, and controlled vocabularies.
Administrators can also tune the relevance algorithm so that the right information is returned in the right order, for example by weighting hits in a product description more heavily than hits in a product attribute field.
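The sketch below shows how two search types might be treated differently: a product search that weights description hits above attribute hits, and a people search that returns names alphabetically. The weights, function names, and sample data are assumptions made for illustration; real engines expose field weighting through their own configuration.

```python
# Hypothetical field weights: description hits count more than attribute hits.
FIELD_WEIGHTS = {"description": 2.0, "attributes": 1.0}

def product_score(query, product):
    """Sum weighted counts of query terms per field (simple substring counting)."""
    terms = query.lower().split()
    return sum(
        weight * sum(product[field].lower().count(term) for term in terms)
        for field, weight in FIELD_WEIGHTS.items()
    )

def product_search(query, products):
    """Return names of matching products, highest weighted score first."""
    scored = [(product_score(query, p), p) for p in products]
    hits = [(s, p) for s, p in scored if s > 0]
    return [p["name"] for s, p in sorted(hits, key=lambda sp: sp[0], reverse=True)]

def people_search(query, people):
    """People results are listed alphabetically rather than by relevance."""
    q = query.lower()
    return sorted(name for name in people if q in name.lower())

products = [
    {"name": "Luggage rack", "description": "Rear rack for commuting",
     "attributes": "fits road bike and touring bike"},
    {"name": "Trek road bike", "description": "Lightweight road race bike",
     "attributes": "carbon frame, red"},
]
print(product_search("road bike", products))
print(people_search("smith", ["Zoe Smith", "Adam Smithson", "Lee Wong"]))
```

Here the luggage rack mentions the query terms more often, but only in its attributes, so the bike whose description matches ranks first.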
Step 6: Plan for a Relevant User Experience
Recognize that not all search experiences should be the same. The popularity of Google, Yahoo!, and MSN on the Web has generated strong interest in offering simple-to-use wide search boxes and tabbed interfaces within the enterprise. But in the enterprise, it is often helpful to use more advanced interface techniques to clarify what users are looking for, including:
Faceted navigation adds precision to search. It exposes attributes of the returned items directly in the interface. For example, a search through a product information database for "electrical cables" might return cables organized by gauge, casing materials, insulation, color, and length, giving an engineer clues to find exactly what he or she is looking for.
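As a minimal sketch of the idea, the code below counts attribute values across a result set (the numbers a facet panel would show) and narrows the results when a user picks values. The cable records and the function names are made up for illustration.

```python
from collections import Counter

def facet_counts(items, facet_fields):
    """For each facet field, count how many results have each value."""
    return {f: Counter(item[f] for item in items) for f in facet_fields}

def refine(items, **selected):
    """Keep only the results matching every selected facet value."""
    return [item for item in items
            if all(item[k] == v for k, v in selected.items())]

cables = [
    {"name": "Cable A", "gauge": "12 AWG", "insulation": "PVC", "color": "black"},
    {"name": "Cable B", "gauge": "12 AWG", "insulation": "rubber", "color": "red"},
    {"name": "Cable C", "gauge": "14 AWG", "insulation": "PVC", "color": "black"},
]
facets = facet_counts(cables, ["gauge", "insulation", "color"])
narrowed = refine(cables, gauge="12 AWG", color="black")
print(facets["gauge"], [c["name"] for c in narrowed])
```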
Statistical clustering methods remove ambiguity. Methods like statistical clustering automatically organize search results by frequently occurring concepts. Clusters provide higher level groupings of information than the individual results can provide, and can make lists of millions of documents easier to scan and navigate.
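The following is a deliberately simplified stand-in for statistical clustering, not the algorithm any product uses: it groups result titles under the most frequent terms they share, ignoring the query terms and a tiny stopword list. All names and sample titles are hypothetical.

```python
import re
from collections import Counter, defaultdict

STOPWORDS = {"a", "and", "for", "in", "of", "the", "to"}

def title_terms(title, query_terms):
    """Distinct lowercase words in a title, minus stopwords and query terms."""
    return set(re.findall(r"[a-z]+", title.lower())) - STOPWORDS - query_terms

def cluster_by_frequent_term(query, titles, max_clusters=3):
    query_terms = set(query.lower().split())
    doc_freq = Counter(t for title in titles for t in title_terms(title, query_terms))
    # Most frequent first; ties broken alphabetically so the output is stable.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    labels = [term for term, count in ranked if count > 1][:max_clusters]
    clusters = defaultdict(list)
    for title in titles:
        terms = title_terms(title, query_terms)
        label = next((l for l in labels if l in terms), "other")
        clusters[label].append(title)
    return dict(clusters)

titles = [
    "Benefits enrollment form",
    "Retirement benefits overview",
    "Benefits enrollment deadline",
    "Retirement benefits calculator",
    "Benefits enrollment FAQ",
]
clusters = cluster_by_frequent_term("benefits", titles)
print(clusters)
```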
Best bets guide users to the specific information they need. Creating a best bet is the process of writing a specific rule that says something like: "when a person enters the term '401K plan' into the search box on the corporate intranet, they should see a link to the '401K plan' page on the intranet".
Additionally, products like Google OneBox and SAP’s Enterprise Search Appliance enable retrieval of frequently searched facts, such as sales forecast data, dashboards, and partner information from back-end ERP systems. Best bets help users avoid a lot of irrelevant results and are very effective for frequently executed queries.
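A best-bet rule can be sketched as a lookup keyed on the normalized query, with the matching link pinned above the regular results. The rule table, the example.com URLs, and the stand-in `fake_organic_search` function are all hypothetical.

```python
# Hypothetical best-bet rules: normalized query -> (title, URL).
BEST_BETS = {
    "401k plan": ("401K plan", "https://intranet.example.com/benefits/401k-plan"),
}

def normalize(query):
    """Lowercase and collapse whitespace so minor variations hit the same rule."""
    return " ".join(query.lower().split())

def search_with_best_bets(query, organic_search):
    """Run the regular search, then pin the best bet (if any) to the top."""
    results = list(organic_search(query))
    best_bet = BEST_BETS.get(normalize(query))
    if best_bet is not None:
        results = [best_bet] + [r for r in results if r != best_bet]
    return results

def fake_organic_search(query):
    # Stand-in for the real search engine call.
    return [("Benefits newsletter, March", "https://intranet.example.com/news/march")]

results = search_with_best_bets("  401K Plan ", fake_organic_search)
print(results[0])
```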
Use basic interface mock-ups and pilot efforts to test, refine, and make these concepts useful for employees in your organization. Many companies use a "Google Labs" style page on their intranets to test out search user interface concepts and tools prior to exposing them more broadly to the enterprise.
Step 7: Implement, Monitor, and Improve
For large projects, allow a lot of time for change management. Teams should maintain the interface between the search engine and all of its back-end content sources.
It is essential to keep IT staff informed of product evaluation and selection plans so that the final implementation supports the security and regulatory policies that are in place for these systems.
Create a plan for ongoing maintenance of search indexing processes and exceptions. Create a monthly reporting plan that lists the most frequent searches performed, searches that did not retrieve results, and overall usage of the search function. This can help you troubleshoot existing implementations and drive future decisions on how to enhance the search experience over time.
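Such a report can be derived from a query log. The sketch below assumes the log is available as (query, result count) pairs, which is an assumption about the log format rather than something any particular product guarantees; the sample entries are invented.

```python
from collections import Counter

def monthly_report(log, top_n=3):
    """log: iterable of (query, result_count) tuples for the month."""
    normalized = [(q.strip().lower(), n) for q, n in log]
    return {
        "total_searches": len(normalized),
        "top_queries": Counter(q for q, _ in normalized).most_common(top_n),
        "zero_result_queries": sorted({q for q, n in normalized if n == 0}),
    }

log = [
    ("401k plan", 12), ("401K Plan", 12), ("vacation policy", 5),
    ("expense reprot", 0), ("org chart", 0), ("vacation policy", 5),
    ("401k plan", 12),
]
report = monthly_report(log, top_n=2)
print(report)
```

Zero-result queries are often the most actionable part of the report: misspellings suggest synonym or spell-check rules, and genuine gaps point to content that still needs to be indexed.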
Enhancements typically include adding types of searches to the experience, further enriching content assets for better retrieval, and incorporating new, valuable content into the overall experience.
In my future posts, I will describe search products such as Autonomy, Coveo Solutions, Endeca Technologies, Exalead, ISYS Search Software, Recommind, Thunderstone Software, Vivisimo, and others.