Galaxy Consulting Blog

Monday, October 28, 2013

Meeting the Social Media Challenge

When social media volume is low, it is typically handled manually by one or more people in a company. These people are assigned to check Facebook and/or Twitter a couple of times a day and respond when appropriate.

As the volume of inquiries grows, it becomes expensive to respond manually to the posts and comments, and nearly impossible to do it on a timely basis. After a while, it becomes clear that automation is necessary to respond to the large number of social media comments in appropriate time frames.

During the next few years, organizations of all sizes will need to build a social media technology servicing framework to handle an increasing volume of inquiries, complaints, and comments. As social media is conceptually just another channel, it should be incorporated into the enterprise's overall servicing framework. However, the unique characteristics and demands of social media interactions require specialized solutions and processes, even though the responses should be consistent in all channels.

There are many applications to help organizations handle their social media servicing challenges, and new ones are constantly being introduced. However, currently, there is no single solution that addresses all necessary requirements. Enterprises that want a complete solution need to purchase several applications and integrate them. They should also merge these applications with their existing servicing infrastructure to ensure an excellent customer experience.

The underlying technical components required to build a social media servicing infrastructure are:

Tools for monitoring social media sites for brand and company mentions.
Data acquisition/capture tools to identify and gather relevant social media interactions for the company.
Data extraction tools that separate "noise" from interactions that require immediate or timely responses.
An engine for defining business rules that generates alerts, messages, pop-ups, alarms, and events.
Integration tools to facilitate application-to-application communication, typically using open protocols such as Web services. Prebuilt integration tools, along with published application programming interfaces, should be provided for contact center applications.
Storage to house and access large volumes of historical data, and an automated process to retain and purge both online and archived data. Additional capabilities may include the ability to access archived data via other media, such as a CD-ROM, and the ability to store and retrieve data in a corporate storage facility, such as a network-attached storage or storage area network.
Database software for managing large volumes of information.
Work flow tools to automate business processes by systematically passing information, documents, tasks, notifications, or alerts to another business process (or person) for additional or supplementary action, follow-up, or expertise.

The core administrative tools needed are:

User administration capability with prebuilt tools to facilitate system access, user set-up, user identification and rights (privileges), password administration, and security.
Alert management capability that allows thresholds to be set so that alarms, alerts, or notifications are triggered when predefined levels or time frames are reached, whether these mark violations or achievements (examples include alerts to signal changes in topics, emerging issues, and sentiment).
Metrics management, including the ability to enter, create, and define key performance indicators (KPIs) and associated metrics.
System configuration with an integrated environment for managing application set-up, and parameters for contact routing, skill groups, business rules, etc.
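
As an illustration of the alert management capability described above, here is a minimal sketch of threshold-based alerting. The rule structure, metric names, and limits are hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass


@dataclass
class ThresholdRule:
    """A hypothetical alert rule: fire when a metric reaches its limit."""
    metric: str
    limit: float
    message: str


def evaluate_rules(metrics, rules):
    """Return the messages of all rules whose metric meets or exceeds its limit."""
    alerts = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is not None and value >= rule.limit:
            alerts.append(rule.message)
    return alerts


# Illustrative rules and readings (assumed values).
rules = [
    ThresholdRule("negative_mentions_per_hour", 50, "Spike in negative mentions"),
    ThresholdRule("unanswered_posts", 20, "Response backlog is growing"),
]
current = {"negative_mentions_per_hour": 72, "unanswered_posts": 5}
alerts = evaluate_rules(current, rules)
print(alerts)
```

In a real deployment the readings would come from the monitoring and capture tools, and the alerts would feed the work flow tools rather than being printed.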

The core servicing functionality includes:

Skills-based routing tools to deliver identified interactions to agents or other employees with the proficiency to address them.
The ability to queue and route transactions (calls, emails, chat/IM, and social media posts) to the appropriate agent, employee, or team.
Text analytics software that uses a combination of statistical or linguistic modeling methods to extract information from unstructured textual data.
Filtering tools that separate "noise" from social media customer interactions that require immediate or timely responses.
Topic categorization software that identifies themes and trends within social media interactions.
Root cause analysis, a problem-solving tool that enables users to strip away layers of symptoms to identify the underlying reasons for problems or issues.
Search and retrieval abilities that allow large volumes of data to be searched, based on user-defined queries, to retrieve specific instances.
Sentiment analysis capability that can identify positive or negative sentiment about a company, person, or product, and assign a numerical score based on linguistic and statistical analysis.
A social CRM servicing solution that logs and tracks received social media interactions so that agents or employees can view the post/comment, create a customized response, and issue or post it.
Response templates that comprise a library of customizable responses to common social media posts.
A social media publishing tool that enables users to publish posts to social media sites.
Reporting functionality in which reports can be set up based on collected data, metrics, or KPIs in a preferred presentation format (chart or graph); this should also include the ability to create custom reports based on ad hoc queries.
Scorecards/dashboards for all constituents in an organization - agents, supervisors, managers, other departments, and executives.
An analytics tool that conducts multidimensional analyses of social media data, used to look for trends and data relationships over time, identify emerging issues and root causes, etc.
Recording software to capture social media inputs and responses.
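
To make skills-based routing more concrete, the following sketch assigns an interaction to the qualified agent with the shortest queue. The agent records, skill names, and the shortest-queue tie-breaker are assumptions made for illustration; commercial routing engines use richer rules.

```python
def route(required_skills, agents):
    """Pick the agent who has every required skill and the shortest queue.

    Returns the agent's name, or None when nobody is qualified.
    """
    qualified = [a for a in agents if required_skills <= a["skills"]]
    if not qualified:
        return None
    return min(qualified, key=lambda a: a["queue_length"])["name"]


# Hypothetical agents and their skills.
agents = [
    {"name": "Ana", "skills": {"billing", "twitter"}, "queue_length": 3},
    {"name": "Ben", "skills": {"billing", "facebook", "twitter"}, "queue_length": 1},
    {"name": "Cy", "skills": {"returns"}, "queue_length": 0},
]

assigned = route({"billing", "twitter"}, agents)
print(assigned)
```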

Organizations also need a number of management applications to ensure that their social media teams or departments are properly trained and staffed. These tools are:

Quality assurance functionality to measure the quality of social media comments and posts by agents, to ensure that they are adhering to the organization's guidelines.
Coaching and e-learning software to deliver appropriate training courses and best practice clips to agents and other employees involved in responding to social media interactions.
A workforce management solution to forecast the expected volume of social media interactions that will require agent/employee assistance, and to identify and create optimal schedules (this also tracks adherence to service levels for each inquiry type).
Surveying software to determine if customers/commenters were satisfied with the company's responses.
Desktop analytics to provide an automated and systematic approach to monitor, capture, structure, analyze, report, and react to all agent/employee desktop activity and process workflows.
An analytics-oriented performance management module that creates scorecards and dashboards to help contact center and other managers measure performance against preset goals.

Social media is going to change the servicing landscape for many organizations within the next five to eight years. This is because the volume of social media comments and posts is expected to grow rapidly, comprising 50 percent of all service interactions. Companies that build a servicing strategy incorporating social media will have a major advantage over their competitors.

Companies do not need all of the solutions identified above; they need to select the ones that allow them to incorporate social media into their servicing strategy and infrastructure so that customers can interact with them in their preferred channel.

Saturday, October 5, 2013

Knowledge Management Adoption Through Gamification

One of the most important components of a successful knowledge management program is its ability to promote and support a culture of collaboration and knowledge sharing.

Tools, processes and organizational policies are important elements but they will only get you so far. Culture is the cornerstone that will determine the willingness of your employees to participate in knowledge management.

How do you influence employees in your organization to adopt productive behaviors around collaboration and knowledge sharing? The answer may be found in a new concept called gamification.

What is gamification? It is a new and rapidly evolving area, but the following description is a good starting point: gamification is the use of game elements and game design techniques in non-game contexts.

That definition of gamification contains three distinct elements:

Game elements - this is about leveraging the components, design patterns, and feedback mechanisms that you would typically find in video games, such as points, badges and leader-boards. It is sometimes referred to as the engineering side of gamification.

Game design techniques - this is the artistic, experimental side of gamification. It includes aesthetics, narrative, player journey, progression, surprise, and, of course, fun. Games are not just a collection of elements; they are a way of thinking about and approaching challenges like a game designer.

Non-game contexts - some common areas in which gamification has taken hold include health and wellness, education, sustainability, and collaboration and knowledge sharing in the enterprise.

There are three key types of knowledge management behavior:

connect: how people connect to the content and communities they need to do their job;
contribute: the level at which people are contributing their knowledge and the impact of those contributions on other people;
cultivate: the willingness to interact with and build upon the ideas and perspectives of other employees, to help nurture a spirit of collaboration.

The unique selling point of gamification is the potential to learn from games and to draw on what makes games so engaging and attractive and to apply those components in other contexts. What is behind this philosophy? While people can be drawn in to collaborate and share via extrinsic motivation, the more you can tap into their intrinsic motivations and help people realize the inherent benefits of collaboration, the more successful and sustained that engagement will be.

We can identify three ways to affect intrinsic motivation: mastery, autonomy, and purpose.

Mastery

Getting really good at something, be it a skill, sport or mental discipline, has its own benefits. The goal of gamifying collaboration is to help people get good at it and, therefore, realize its inherent benefits. As participants progress through the "game", they gradually learn the skills to find expertise, build their network, and share their knowledge in a way that makes them more effective, and advances their careers.

Autonomy

Autonomy is about giving people the freedom to make meaningful choices. Instead of dictating a prescribed path, an autonomous approach allows them to set their own goals, choosing how they wish to collaborate, and ultimately providing a sense of ownership. The more individuals feel that they are in control, the better engaged they are going to be. Participants can share a document, write a blog, post a microblog or create a video. It is about giving participants choices, equipping them with the tools, and rewarding them for their knowledge sharing behaviors regardless of the specific mechanism they used.

Purpose

While there are plenty of personal benefits to collaboration, people are more engaged when they feel socially connected to others as part of a larger purpose. As part of that wider organization, they can take pride in the fact that they are making a broader impact on their organization and collaboration is a key part of that experience.

The use of gamification assumes that you already have a knowledge management program in place. Assuming that gamification can magically transform the absence of a knowledge management program into something engaging is a common error. A well thought-out and sustainable approach to gamification offers significant potential to make collaboration fun and engaging.

Gamification Tips

Don't lose sight of your objectives

Start with your business objectives, expressed in terms of their outcomes. Keep your eyes on those objectives and validate them as you design, develop, and implement your knowledge management program.

Focus on behaviors, not activities

It is very easy to get caught up in focusing exclusively on activities and end up having people busy doing "stuff". Similar to objectives, keep a focus on the behaviors you want your people to adopt and identify activities that are indicators of those behaviors.

Data is king

You need to be able to capture, store and retrieve data. Without a way to quantify and measure it, you will be stuck in the first step.

Spread the recognition

Don't limit the number of people who can be recognized through your program. In addition, recognize people's efforts in a variety of meaningful ways. Some examples of recognition are:

e-cards with 100 recognition points (monetary value of $100);
thank-you notes from leadership;
shout-outs in internal corporate communications;
badges on employees' profile pages;
feedback during the employee's performance review process.

People will game the system

You will need to pay attention to people who want to "game" the system. Where possible, build in approaches to limit the ability of people to do so.

Start small and evolve

Gamifying collaboration is not something you build all at once. To arrive at a good and sustainable knowledge management program, you need to be iterative, creating rough versions and play-testing continuously.

No silver bullet exists

Gamification is not a silver bullet. However, the available evidence suggests that it can be leveraged further to embed the collaborative behaviors that make up a meaningful culture of collaboration and knowledge sharing across any organization.

Wednesday, September 18, 2013

The Mystery of How Enterprise Search Works (For You!)

Enterprise search starts with a user looking for information and submitting a search query. A search query would be a list of keywords (terms) or a phrase. The search engine would look for all records that match the request and return a list to the user. The list would contain results that are ranked in order of most relevant to least relevant for the request.

Let's look at search in more detail.

Performance Measures

There are two performance measures for evaluating the quality of query results: precision and recall.

Precision refers to the fraction of relevant documents from all documents retrieved. Recall is the fraction of relevant documents retrieved by a search from the total number of all relevant documents in the collection. It is said that precision is a measure of usefulness of a result while recall is a measure of the completeness of the result.

Modern search engines aim to provide high recall with good precision. It is easy to achieve high recall by simply returning all documents in the collection for every query. However, the precision in this case would be poor. A key challenge is how to increase precision without sacrificing recall. For example, most web search engines today provide reasonably good recall but poor precision. In other words, a user gets some relevant results, usually in the first 10 to 20 results, along with many non-relevant results.
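
The two measures are easy to compute once you know which documents are relevant. The sketch below uses a made-up collection of ten document ids to show why returning everything gives perfect recall but poor precision; the function name and data are illustrative only.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection of ten documents, four of them relevant.
collection = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
relevant = ["d1", "d2", "d3", "d4"]

focused = precision_recall(["d1", "d2", "d9"], relevant)
everything = precision_recall(collection, relevant)
print(focused, everything)
```

Here the focused result list has higher precision (two of three results are relevant) but finds only half of the relevant documents, while returning the whole collection finds all of them at a precision of 4 in 10.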

Relevancy

Relevancy is a numerical score assigned to a search result representing how well the result meets the information the user who submitted the query is looking for. Relevancy is therefore a subjective measure of the quality of the results as defined by the user. The higher the score, the higher the relevance.

For every document in a result, a search engine calculates and assigns a relevancy score. TF-IDF is a standard relevancy heuristic used by many search engines. It combines the TF and IDF variables to provide a ranking score for each document.

TF stands for Term Frequency. This is the number of times a word (or term) appears in a single document as a percentage of the total number of terms in the document. Term frequency assumes that when evaluating two documents, document A and document B, the one that contains more occurrences of the search term is probably also more "relevant" to the user.

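IDF stands for Inverse Document Frequency. This is a measure of the general importance of the term, based on the ratio of all documents in the set to the documents that contain the term (usually its logarithm). IDF lowers the weight of terms that appear in many documents; it is the normalization of term frequency by document length that prevents a bias towards longer documents.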
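
A small worked example may help. The sketch below computes TF as a fraction of the document's terms and IDF as the natural logarithm of the document ratio, then multiplies them. Taking the logarithm and multiplying are the common textbook formulation and are assumptions here; real search engines use many variants. The three tiny documents are made up.

```python
import math


def tf(term, doc_tokens):
    """Term frequency: occurrences of the term as a fraction of all terms in the document."""
    return doc_tokens.count(term) / len(doc_tokens)


def idf(term, docs):
    """Inverse document frequency: log of (all documents / documents containing the term)."""
    containing = sum(1 for d in docs if term in d)
    return math.log(len(docs) / containing) if containing else 0.0


def tf_idf(term, doc_tokens, docs):
    """Combine the two measures into one score for a term in a document."""
    return tf(term, doc_tokens) * idf(term, docs)


# Hypothetical three-document collection, already split into terms.
docs = [
    "search engine index".split(),
    "search query ranking".split(),
    "document control policy".split(),
]

scores = [tf_idf("index", d, docs) for d in docs]
print(scores)
print(idf("search", docs), idf("index", docs))
```

Only the first document contains "index", so it is the only one with a non-zero score. The term "search" appears in two of the three documents, so its IDF is lower than that of "index".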

Additional techniques may put more emphasis on other attributes to determine relevancy. One example is freshness: when the document was created or last updated. Another is which part of the document matched the term: a match in the document title or author field may score higher than a match in the text body.

Modern search engines provide good relevancy scoring across a wide range of document formats, but more importantly, allow users to create and use their own relevancy scoring profiles optimized for their queries. These user-defined weights, also called boosting, can be set up and run for a user, group of users, or per query. This is extremely helpful for personalizing the search experience by roles or departments within the organization.
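
As a rough sketch of how boosting can work, the function below multiplies each field's match score by a per-profile weight and sums the results. The field names, scores, and the "legal" profile are hypothetical; actual engines expose boosting through their own configuration or query syntax.

```python
def boosted_score(field_scores, boosts):
    """Weight each field's match score by the profile's boost (default 1.0) and sum."""
    return sum(score * boosts.get(field, 1.0) for field, score in field_scores.items())


# Hypothetical per-field match scores for one document.
doc_scores = {"title": 0.5, "body": 0.25}

# Two illustrative profiles: no boosts, and one that favors title matches.
default_profile = {}
legal_profile = {"title": 3.0, "body": 1.0}

print(boosted_score(doc_scores, default_profile))
print(boosted_score(doc_scores, legal_profile))
```

The same document scores higher under the profile that favors title matches, which is how a role or department can see results ordered for its own needs.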

Linguistics

Linguistics is a vital component of any search solution. It refers to the processing and understanding of text in unstructured documents or text fields. There are two parts to linguistics: syntax and semantics.

Syntax is about breaking text into words and numbers which is also called tokenization. Semantics is the process of finding the meaning behind text, from the levels of words and phrases to the level of paragraphs, a document or a set of documents. Semantic analysis often involves grammatical description and deconstruction, morphology, phonology, and pragmatics. One major challenge is ambiguity of language.

Linguistics therefore improves relevancy and affects precision and recall. Common linguistic features in a search solution include stemming and lemmatization of words (reducing words to their root or stem form), phrasing (the recognition and grouping of idioms), removal of stop words (words that appear often in documents but contain little meaning, for example articles), spelling corrections, etc.
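
The following sketch strings three of these features together: tokenization, stop word removal, and a deliberately crude suffix-stripping step standing in for stemming. The stop word list and suffix rules are tiny assumptions for illustration; production systems use established stemmers or lemmatizers.

```python
import re

# Assumed, very small stop word list and suffix list for illustration.
STOP_WORDS = {"a", "an", "the", "of", "and", "in", "to"}
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Split text into lowercase word and number tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Tokenize, drop stop words, and stem what remains."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


terms = normalize("The indexing of documents and the ranked results")
print(terms)
```

Applying the same normalization to both documents and queries is what lets a search for "index" match a document that only mentions "indexing".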

Navigation

One way search engines overcome the challenges of semantics and language ambiguity is navigation. In this case, the search engine uses linguistic features, such as extraction of entities (nouns and noun phrases, places, people, concepts, etc.), and a predefined taxonomy to narrow the results. It does so by clustering related documents together or by providing useful dimensions, called facets, to slice the data, for example using price, name, etc. to narrow down the search results.
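
Faceted navigation can be sketched as two small operations: counting the values of a field across the result list, and filtering the list by a chosen value. The result records and field names below are invented for the example.

```python
from collections import Counter


def facet_counts(results, field):
    """Count how many results carry each value of the given field."""
    return Counter(r[field] for r in results if field in r)


def narrow(results, field, value):
    """Keep only the results whose field equals the chosen facet value."""
    return [r for r in results if r.get(field) == value]


# Hypothetical search results with metadata fields.
results = [
    {"title": "Q3 rate sheet", "department": "Finance", "type": "spreadsheet"},
    {"title": "Holiday schedule", "department": "HR", "type": "pdf"},
    {"title": "Q4 rate sheet", "department": "Finance", "type": "pdf"},
]

counts = facet_counts(results, "department")
finance_only = narrow(results, "department", "Finance")
print(counts)
print([r["title"] for r in finance_only])
```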

The Search Index

At the heart of every search engine is the search index. An index is a searchable catalog of documents created by the search engine. The search engine receives content from all source systems to place in the index. This process is called ingestion. The search engine then accepts search queries to match against the index. The index is used to quickly find relevant documents for a search query out of a collection of documents.

A common index structure is the inverted index which maps every term in the collection to all of its locations in this collection. For example, a search for the term "A" would check the entry for "A" in the index that contains links to all the documents that include "A".
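
A minimal inverted index can be sketched in a few lines. For brevity this version maps each term only to the ids of the documents that contain it, not to positions within those documents, and it splits text on whitespace; both are simplifying assumptions. The sample documents are made up.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map every term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical three-document collection.
docs = {
    1: "enterprise search index",
    2: "search query",
    3: "document control",
}
index = build_inverted_index(docs)
print(search(index, "search"))
print(search(index, "search index"))
```

Looking up a term is a single dictionary access, which is why the index answers queries quickly regardless of how many documents do not contain the term.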

Wednesday, July 31, 2013

ISO 9001 and Documentation

ISO 9001 compliance is becoming increasingly important in regulated industries. How does it affect documentation? Here is how...

What is Document Control?

Document control means that the right persons have the current version of the documents they need, while unauthorized persons are prevented from use.

We all handle many documents every day. These documents include forms that we fill out, instructions that we follow, invoices that we enter into the computer system, holiday schedules that we check for the next day off, rate sheets that we use to bill our customers, and many more.

An error on any of these documents could lead to problems. Using an outdated version could lead to problems. Not knowing if we have the latest version or not could lead to problems. Just imagine us setting up a production line to outdated specifications or making strategic decisions based on a wrong financial statement.

ISO 9001 gives us tools (also referred to as "requirements") that show us how to control our documents.

ISO 9001 Documents

There are no "ISO 9001 documents" that need to be controlled, and "non ISO 9001 documents" that don't need control. The ISO 9001 system affects an entire company, and all business-related documents must be controlled. Only documents that don't have an impact on products, services or company don't need to be controlled - all others need control. This means, basically, that any business-related document must be controlled.

However, how much control you apply really depends on the document.

The extent of your approval record, for example, may vary with the importance of the document (remember, documents are approved before they are published for use).

The Quality Policy, an important corporate policy document, shows the signatures of all executives.

Work instructions often just show a note in the footer indicating approval by the department manager.

Some documents don't even need any approval record: if the person who prepared a document is also responsible for its content (e.g., the Quality Manager prepares instructions for his auditors), a separate approval is superfluous.

On the other hand, identifying a document with a revision date, source and title is basic. It really should be done as a good habit for any document we create.

Please note that documents could be in any format: hard copy or electronic. This means that, for example, the pages on the corporate internet need to be controlled.

Responsibility for Document Control

Document control is the responsibility of all employees. It is important that all employees understand the purpose of document control and how to control documents in accordance with ISO 9001.

Please be aware that if you copy a document or print one out from the Intranet and then distribute it, you are responsible for controlling its distribution! The original author will not know that you distributed copies of this document, so the original author can't control your distribution.

Dating Documents

ISO 9001 requires every document to show when it was created or last updated. Many of us may have thought to use our word processor's automatic date function for this, but... should we use the automatic date field on documents?

Generally not. If you enter the automatic date field into a document, the field will automatically be updated to always show the current date, no matter when you actually created or updated the document.

Example: if you use the automatic date field in a fax and you save the fax on your computer for future reference, you won't be able to tell when you wrote the fax: when you open the fax on your computer, it will always show today's date.

The automatic date field is not suitable for document control. Therefore, as a general rule, don't use the automatic date field to identify revision status.

ISO 9001 Documentation

ISO 9001 documentation includes:

the Quality Procedures Manual, which also includes corporate policies and procedures affecting the entire company;
work instructions, which explain in detail how to perform a work process;
records, which serve as evidence of how you meet ISO 9001 requirements.

Policies and Procedures

Our ISO 9001 Quality Manual includes the corporate Quality Policy and all required ISO 9001 Procedures. While most procedures affect only managers, every employee must be familiar with the Quality Policy and with the Document Control procedures. The Quality Policy contains the corporate strategy related to quality and customer satisfaction; all other ISO 9001 documents must follow this policy. The Document Control procedures show how to issue documents, as well as how to use and control documents.

Continuous Improvement

Implementing ISO 9001 is not a one-time benefit to a company. While you are utilizing the quality manual, quality procedures and work instructions in daily business activities, you are not only benefiting from better quality and increased efficiency but you are also continually improving. In fact, the ISO 9001 requirements are designed to make you continually improve. This is a very important aspect because companies that don't continue to improve are soon overtaken by the competition.

Thursday, June 20, 2013

Intelligent Search and Automated Metadata

The inability to identify the value in unstructured content is the primary challenge in any application that requires the use of metadata. Search cannot find and deliver relevant information in the right context, at the right time without good quality metadata.

An information governance approach that creates the infrastructure framework to encompass automated intelligent metadata generation, auto-classification, and the use of goal and mission-aligned taxonomies is required. From this framework, intelligent metadata enabled solutions can be rapidly developed and implemented. Only then can organizations leverage their knowledge assets to support search, litigation, e-discovery, text mining, sentiment analysis and open source intelligence.

Manual tagging is still the primary approach used to identify the description of content, and often lacks any alignment with enterprise business goals. This subjectivity and ambiguity is applied to search, resulting in inaccuracy and the inability to find relevant information across the enterprise.

Metadata used by search engines may be comprised of end user tags, pre-defined tags, or generated using system defined metadata, keyword and proximity matching, extensive rule building, end-user ratings, or artificial intelligence. Typically, search engines provide no way to rapidly adapt to meet organizational needs or account for an organization’s unique nomenclature.

More effective is implementing an enterprise metadata infrastructure that consistently generates intelligent metadata using concept identification. This is a profoundly different approach: relevant documents, regardless of where they reside, will be retrieved even if they don’t contain the exact search terms, because the concepts and relationships between similar content have been identified. The elimination of end-user tagging and the resulting organizational ambiguity enables the enriched metadata to be used by any search engine index, for example, ConceptSearch, SharePoint, Solr, Autonomy or Google Search Appliance.

Only when metadata is consistently accurate and trusted by the organization can improvements be achieved in text analytics, e-discovery and litigation support.

In the exploding age of big data, and more specifically text analytics, sentiment analysis and even open source intelligence, the ability to harness the meaning of unstructured content in real time improves decision-making and enables organizations to proactively act with greater certainty on rapidly changing business complexities.

To achieve an effective information governance strategy for unstructured content, results are predicated on the ability to find information and eliminate inappropriate information. The core enterprise search component must be able to incorporate and digest content from any repository, including faxes, scanned content, social sites (blogs, wikis, communities of interest, Twitter), emails, and websites. This provides a 360-degree corporate view of unstructured content, regardless of where it resides or how it was acquired.

Ensuring that the right information is available to end users and decision makers is fundamental to trusting the accuracy of the information and is another key requirement in intelligent search. Organizations can then find the descriptive needles in the haystack to gain competitive advantage and increase business agility.

An intelligent metadata enabled solution for text analytics analyzes and extracts highly correlated concepts from very large document collections. This enables organizations to attain an ecosystem of semantics that delivers understandable and trusted results that are continually updated in real time.

Consider the application of intelligent search to e-discovery and litigation. Traditional information retrieval systems use "keyword searches" of text and metadata as a means of identifying and filtering documents. The challenges and costs of e-discovery and litigation support continue to escalate. The use of intelligent search reduces costs and alleviates many of the challenges.

Content can be presented to knowledge professionals in a manner that enables them to more rapidly identify relevant information and increase accuracy. Significant benefits can be achieved by removing the ambiguity in content and identifying concepts within a large corpus of information. This methodology delivers efficiencies and reduces costs, offering an effective solution that overcomes many of the challenges typically not solved in e-discovery and litigation support.

Organizations must incorporate an approach that addresses the lack of an intelligent metadata infrastructure. Intelligent search, a by-product of the infrastructure, must encourage, not hamper, the use and reuse of information and be rapidly extendable to address text mining, sentiment analysis, e-discovery, and litigation support.

The additional components of auto-classification and taxonomies complete the core infrastructure to deploy intelligent metadata enabled solutions, including records management, data privacy, and migration. Search can no longer be evaluated on features, but on proven results that deliver insight into all unstructured content.

Wednesday, May 29, 2013

Digital Assets Management System - Autonomy Virage MediaBin

Autonomy Virage MediaBin is an advanced and comprehensive solution to index, analyze, categorize, manage, retrieve, process, and distribute all types of digital assets within an organization.

Autonomy Virage MediaBin helps organizations with globally distributed teams to effectively manage, distribute, and publish digital assets used to promote their messaging, products, and brands.

Companies would benefit from higher-impact marketing and communications, greater agility, stronger brand equity, increased team productivity, and the security of knowing valuable corporate assets will be fully leveraged and preserved for the future. By providing self-service access to digital assets, marketing personnel no longer have to spend time fulfilling content requests.

Autonomy Virage MediaBin delivers rapid return on investment and can support implementations scaling up to the largest global enterprises.

Major Features:

Unified Management: a single environment which supports standardized and automated tagging to accelerate search and streamline the creation, management, delivery, and archival of all digital assets.

Intelligent Analytics: leverages Autonomy IDOL to automate manual processes such as metadata tagging, summarization, and categorization.

Next-Gen Rich Media Technology: leverages next generation video and speech analytics technology that extracts concepts to enable cross-referencing with other forms of information.

Effective and Agile Content Reuse: provides secure access to all content for all users. Internal and external teams can collaborate more effectively to improve coordination and productivity in all marketing programs.

Transform and Transcode on the Fly: Multi-threaded transformation task engine can handle large quantities of simultaneous complex transformations involving format conversions, color-space conversions, color adjustments, resolution, cropping, sizing, padding, watermarking, and a wide variety of advanced graphics adjustments that would normally require a user to open an editing application on their desktop.

Other Features:

browser based system;
permissions can be defined based on user roles or by folders; search incorporates permissions;
content can be pulled from CMS such as TeamSite and rendered on the fly;
each asset has unique ID which is passed over to TeamSite; TeamSite "knows" when there is a different or a new revision. If an asset gets updated in MediaBin, TeamSite gets notified;
has a set of workflows such as approval and review; rules can be defined so that once assets are approved, they move to the publishing area; also includes Process Studio, which is the workflow tool, and Template, which is the form builder;
assets can be uploaded by "drag and drop" and can be dragged and dropped to TeamSite from MediaBin;
there is no limitation to size of the files;
upload can be automated for assets to go to specific folders;
after the download, assets will be preserved for individual users;
how assets are used is reported in TeamSite;
can pull content from SharePoint;
metadata is preserved, it is searchable and indexable;
content is automatically categorized by asset type and resolution; asset type is recognized on ingest, so no entering metadata is required;
TeamSite pulls images from MediaBin;
supports 29 languages;
ability to link assets together (for example: associated assets) using existing metadata;
ability to create a taxonomy of assets;
search includes saved searches, recent searches, both preset and executed searches, custom search;
ability to search for words in a video and then go to that place in the video;
once a user finds content, an action can be taken such as download, send it by e-mail, send a shortcut to the content, or add it to a light-box, which is defined by permissions;
there is an Activity Manager, which includes all actions taken and the ability to get to users' tasks.

Benefits:

eliminates human error and ensures quicker access to content through automatic metadata extraction and accurate search results;
reduces costs by automating the production, review, and distribution of digital assets;
increases efficiency by providing users with self-service access at any time;
speeds time-to-market while maintaining accuracy and consistency;
facilitates quick reuse and re-purposing of images, as well as rapid content creation;
produces higher-impact marketing and communications, greater agility, and stronger brand consistency;
increases compliance through security-controlled access, a complete audit trail, and control of licensed content.

Thursday, May 9, 2013

Search Engine Technology

Modern web search engines are highly intricate software systems which employ technology that has evolved over the years. There are a few categories of search engines that are applicable to specific browsing needs.

These include web search engines (e.g. Google), database or structured data search engines (e.g. Dieselpoint), and mixed search engines or enterprise search.

The more prevalent search engines such as Google and Yahoo! utilize hundreds of thousands of computers to process trillions of web pages in order to return fairly well-aimed results. Due to this high volume of queries and text processing, the software is required to run in a highly distributed environment with a high degree of redundancy.

Search Engine Categories

Web search engines

These are search engines that are specifically designed for searching web pages. They were developed to facilitate searching through a large number of web pages. They are engineered to follow a multi-stage process: crawling the enormous stock of pages to skim the figurative foam from their contents, indexing the foam (the keywords) in a semi-structured form (for example, a database), and returning mostly relevant links to those skimmed documents or pages from the inventory.
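The index-and-return stages described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any production engine works: the `PAGES` dictionary stands in for pages a crawler has already fetched, and the URLs, page texts, and function names are all made up for the example.

```python
from collections import defaultdict

# Hypothetical stand-in for crawled pages: URL -> page text.
PAGES = {
    "http://example.com/a": "search engines crawl web pages",
    "http://example.com/b": "an index maps words to pages",
    "http://example.com/c": "engines return links to pages",
}

def build_index(pages):
    """Indexing stage: map each lowercase word to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return stage: URLs that contain every word of the query, sorted by URL."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)

INDEX = build_index(PAGES)
print(search(INDEX, "engines pages"))
```

The query is answered from the index alone; the page texts are never scanned again at search time, which is the point of the indexing stage.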

Crawl

In the case of a wholly textual search, the first step in classifying web pages is to find an "index item" that might relate expressly to the "search term". Most search engines use sophisticated algorithms to "decide" when to revisit a particular page, to check its relevance. These algorithms range from a constant visit interval with higher priority for more frequently changing pages to an adaptive visit interval based on several criteria such as frequency of change, popularity, and overall quality of the site. The speed of the web server running the page, as well as resource constraints like the amount of hardware or bandwidth, also figure in.
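As a rough illustration of an adaptive visit interval, the sketch below shortens the interval when a page was found changed and lengthens it otherwise. The halving and doubling rule, the one-hour and one-week bounds, and the function name are assumptions chosen for the example; real crawlers weigh many more criteria, such as popularity and site quality.

```python
def next_interval(current_hours, changed, min_hours=1.0, max_hours=168.0):
    """Return the next revisit interval in hours.

    Halve the interval if the page changed since the last visit,
    double it if it did not, and keep the result within the bounds.
    """
    proposed = current_hours / 2 if changed else current_hours * 2
    return max(min_hours, min(max_hours, proposed))

# A page checked daily that keeps changing is visited more and more often.
interval = 24.0
for _ in range(3):
    interval = next_interval(interval, changed=True)
print(interval)
```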

Link map

The pages that are discovered by web crawls are often distributed and fed into another computer that creates a map of the uncovered resources. This map looks a little like a graph, in which pages are represented as small nodes connected by the links between them. The resulting mass of data is stored in multiple data structures that allow quick access by algorithms that compute the popularity score of pages on the web based on how many links point to a given page. This is how, for example, people searching for information on diagnosing psychosis are led to any number of relevant resources.
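The simplest version of such a popularity score is a count of incoming links. The sketch below assumes a tiny, made-up link map (`LINKS`) and ranks pages by how many other pages point to them; production engines use more elaborate link-analysis algorithms, so treat this only as an illustration of the idea.

```python
from collections import Counter

# Hypothetical link map: page -> pages it links to.
LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

def inlink_counts(links):
    """Count, for every page, how many distinct other pages link to it."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):   # count each source once per target
            if target != source:      # ignore self-links
                counts[target] += 1
    return counts

def rank(links):
    """Pages ordered by incoming-link count, ties broken by name."""
    counts = inlink_counts(links)
    return sorted(counts, key=lambda page: (-counts[page], page))

print(rank(LINKS))
```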

Database Search Engines

Searching for text-based content in databases presents a few special challenges, from which a number of specialized search engines have developed. Databases are slow when solving complex queries (with multiple logical or string matching arguments). Databases allow pseudo-logical queries, which full-text searches do not use. No crawling is necessary for a database since the data is already structured. However, it is often necessary to index the data in a more compact form designed to allow a faster search.
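One way to picture indexing database content in a form that is quicker to search is a separate table of terms with an ordinary database index on it, so a keyword lookup becomes an exact match instead of a string scan over every row. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `products` and `terms` tables and their contents are invented for the example and do not describe any particular search product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (id, name) VALUES (?, ?)",
    [(1, "Red cotton shirt"), (2, "Blue cotton jeans"), (3, "Red wool scarf")],
)

# Search structure: one row per (term, product), with an index on the term.
conn.execute("CREATE TABLE terms (term TEXT, product_id INTEGER)")
conn.execute("CREATE INDEX idx_terms_term ON terms (term)")
for product_id, name in conn.execute("SELECT id, name FROM products").fetchall():
    conn.executemany(
        "INSERT INTO terms (term, product_id) VALUES (?, ?)",
        [(word, product_id) for word in set(name.lower().split())],
    )

def find(term):
    """Return ids of products whose name contains the term as a whole word."""
    rows = conn.execute(
        "SELECT product_id FROM terms WHERE term = ? ORDER BY product_id",
        (term.lower(),),
    )
    return [row[0] for row in rows]

print(find("red"))
```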

Mixed Search Engines

Sometimes, searched data contains both database content and web pages or documents. Search engine technology has developed to respond to both sets of requirements. Most mixed search engines are large Web search engines, like Google. They search through both structured and unstructured data sources. Pages and documents are crawled and indexed in a separate index. Databases from various sources are also indexed. Search results are then generated for users by querying these multiple indices in parallel and combining the results according to "rules".
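Querying several indices in parallel and merging the results by rule might look roughly like the sketch below. Everything in it is hypothetical: the two search functions return canned results in place of real indices, and the single "rule" is a per-source boost (`SOURCE_BOOST`) applied to each score before sorting.

```python
from concurrent.futures import ThreadPoolExecutor

def search_documents(query):
    """Hypothetical document index: returns (title, score) pairs."""
    data = {"shirt": [("Shirt care guide", 0.6), ("Shirt sizing FAQ", 0.4)]}
    return data.get(query, [])

def search_database(query):
    """Hypothetical database index: returns (title, score) pairs."""
    data = {"shirt": [("Red cotton shirt (product record)", 0.5)]}
    return data.get(query, [])

# Example merge rule (an assumption): favor structured records over documents.
SOURCE_BOOST = {"documents": 1.0, "database": 1.5}

def mixed_search(query):
    """Query both indices in parallel and merge results by boosted score."""
    sources = {"documents": search_documents, "database": search_database}
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in sources.items()}
        merged = []
        for name, future in futures.items():
            for title, score in future.result():
                merged.append((score * SOURCE_BOOST[name], name, title))
    merged.sort(key=lambda item: (-item[0], item[2]))
    return [(title, name) for _, name, title in merged]

for title, source in mixed_search("shirt"):
    print(source, "-", title)
```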

Tuesday, April 30, 2013

Big Data and Content Management

There has been a lot of talk lately about big data. What is big data?

Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand commonly used software tools or traditional data processing applications. The challenges include capture, governance, storage, search, sharing, transfer, analysis, and visualization.

What is considered "big data" varies depending on the capabilities of the organization managing the data set, and on the capabilities of the applications that are traditionally used to process and analyze the data set in its domain.

Big data sizes are a constantly moving target, as of 2012 ranging from a few dozen terabytes to many petabytes of data in a single data set. Because of this difficulty, new platforms of "big data" tools are being developed to handle various aspects of large quantities of data.

Every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the world today has been created in the last two years alone. How does it apply to us and what we do in content management?

The sheer numbers, covered in most enterprise content management (ECM) analyst reports, also extend to all aspects of the information technology sector, prompting developers to create a new generation of software and technology or distributed computing frameworks in an effort to cope with this scalability phenomenon.

Content growth is everywhere. From traditional data warehouses to new consolidated big data stores, IT infrastructure must be ready for this continuing scale; it impacts the entire IT industry, especially ECM.

Content is getting bigger. Applications are growing more complex, challenging IT as never before. How will these changes impact content management technologies? It's difficult to predict exactly, but there are insights to be found and used to plan for the future.

ECM technology is evolving toward a platform-based approach, enabling organizations to make their own content-centric and content-driven applications smarter. Analysts, vendors and users all agree: The time for "out-of-the-box" CMS applications has passed. Now each project can meet specific needs and individual requirements.

Content and data, more often than not, come with embedded intelligence, whether through added custom metadata and in-text information or through attached media and binary files, and that intelligence can be utilized whether the content is structured or unstructured.

This can be observed on many different levels across various domains. One instance is the arrival of what some have started to call "Web 3.0": the semantic Web and the related technology that derives intelligence from raw content through advancements like semantic text analysis, automated relations and categorization, sentiment analysis, etc. -- effectively, giving meaning to data.

More traditional ECM components, such as workflows, content lifecycle management and flexibility, demonstrate much of the same. Smart content architecture, along with intelligent, adaptive workflow and process, or deep integration with the core applications within information systems, are all making enterprise content-centric applications smarter and are refining the way intelligence is brought to content.

In short, content is getting smarter on the inside as much as on the outside.

In fact, such disruptive phenomena as Big Data or the new semantic technology on the scene are huge opportunities for enterprise content management solutions. They are bringing new solutions and possibilities in business intelligence, semantic text analysis, data warehousing and caching that require integration into existing content-centric applications, all without rewriting them.

As a result, Big Data and smart content will push more of enterprise content management toward technical features such as software interoperability, extensibility and integration capabilities.

These developments will also demand a clean and adaptive architecture that is flexible enough to evolve as new standards arise to bridge CMS and semantic technologies, as well as connectors to back-end storage systems and to text-analysis solutions.

This underscores the advancements made in the development of modular and extensible platforms for content-centric applications. Taking the traditional approach of employing large enterprise content management suites that rely on older software architecture will make it harder to leverage these new and nimble opportunities.

In order to get the most value out of smart content and refine methods of dealing with Big Data, enterprise content management architects must incorporate a modern and well designed content management platform upon which to build, one that not only looks at end-user features but stays true to the development side. Enterprise content management will not be reinvented; Big Data and smart content are evolutions, not revolutions, in the industry.

I will continue on this subject in my future posts.

Sunday, April 28, 2013

Optimize Web Experience Management

Leading enterprises strive to achieve higher levels of customer engagement through online channels, and this means they must easily, quickly and cost effectively provide fresh, personal, relevant content anytime, anywhere, on any device, through a consistent and dynamic user experience.

Traditional web content management system (CMS) solutions are no longer sufficient, and a richer and broader range of capabilities that enable web experience management - managing and optimizing the site visitor experience across the web, mobile apps, social networks and more - must now be leveraged in this new era of engagement.

The Need for Web Experience Management

Over the last few years, the Internet has undergone a tremendous amount of fundamental change in its landscape - social, personal and mobile.

1. Social - The Web is becoming increasingly more social and much less anonymous. The power of sharing can enhance or destroy brands in seconds.

2. Personal - While the Internet is continuously expanding in terms of ubiquity, at the same time it's becoming much more local and much more personal in terms of user experience.

3. Mobile - Mobile access to the Internet is growing rapidly, to the point where access from tablets and phones will soon exceed that from desktops and laptops.

The very way we communicate with customers is changing, and when fundamental change like this occurs, those who recognize the change and move quickly to adapt will benefit the most.

A New Era of Engagement

Each of these trends reinforces the others and fuels further adoption and innovation. It is these technologies, and the behaviors and capabilities they foster, that have brought us to a new era, which Forrester calls the "era of engagement."

Driving these trends are people - our friends, leads, customers, critics, and fans. This is our audience and the other half of the conversation, and in today's age of engagement, they want to participate and expect us to engage them on their terms, on their schedule, in the context of their location, in their language and optimized for their device. To effectively tackle this challenge of serving a mass audience with limited resources, enterprises require strategy and effective tools to help get the job done.

Web experience management (WEM) provides us with the tools to take on this otherwise daunting task. The capabilities of WEM allow you to create, manage and deliver dynamic, targeted and consistent content across various online channels including your website, social media, marketing campaign sites, mobile applications, etc. It takes a lot more than a traditional Web CMS to meet these new demands.

Key Principles of Web Experience Management

To effectively implement WEM, enterprises must start with their business strategy and goals which should drive their messaging and engagement strategy and which in turn should drive their content strategy. In other words, the strengths, weaknesses, threats and opportunities that businesses face should be considered first and foremost.

Too often organizations fail to do this by jumping straight into a technology selection without due consideration of the business drivers. Around this foundation, we wrap the fundamentals of basic Web content management. It is important to remember that content is still king. Business users and marketers need easy to use, yet powerful, content authoring and publishing capabilities.

They need rich content models that allow them to create engaging visitor experiences, to easily create new content assets, to quickly find and re-purpose existing content, and to preview content and the site visitor experience for all online channels.

Upon this foundation, an effective WEM solution provides a comprehensive collection of capabilities that allow organizations to create, manage and deliver dynamic, targeted and consistent content and visitor experiences across multiple touch points - corporate website, dedicated marketing campaign sites, mobile applications, social media sites, etc.

While WEM requirements are going to vary from organization to organization, some of the most critical features needed by essentially all enterprises include content targeting and personalization, mobile device support, faceted search and navigation, multi-channel publishing, integrated Web analytics, and campaign management.

Tuesday, April 23, 2013

Knowledge Management Applications - Coveo for Service and Support

In my last two posts about Coveo products, I described Coveo search applications - Coveo for advanced web search and Coveo for advanced enterprise search. Today, I will finish describing Coveo products with the Coveo knowledge management application - Coveo for service and support.

With Coveo, knowledge required to solve cases faster can be found wherever it resides, within and beyond the knowledge base. Many companies are challenged with the proliferation of data, in multiple systems, communities, on-premise and in the cloud. Knowledge is everywhere and hard to manage.

Coveo solves this challenge by placing information from anywhere, related to the agent’s context, directly in front of them. Coveo technology automatically "reads" case information, establishes context, and instantly shows contextually relevant content and experts directly within the CRM such as Salesforce, or within a separate Insight Console. Coveo creates information mash-ups regardless of where the information resides, combined with advanced enterprise search and navigation abilities that bring your entire knowledge ecosystem to your agents.

Such knowledge availability decreases case resolution time, increases first contact resolution, and empowers lower level agents to become productive faster and to solve more complex cases. The results show dramatic impact on contact center capacity and customer satisfaction.

Features

Solutions and experts from anywhere - Coveo automatically presents 360° views of customer, case, or product information and communications, as well as experts who can help. Using advanced data enrichment, solutions and customer insight can stem from multiple sources, across enterprise, community, and social content.

Advanced enterprise search and navigation - expanded views enable deep, broad, knowledge exploration for cases, securely, across any enterprise content.

Unified indexing - Coveo federates searches and mash-ups from cloud, enterprise, and social data securely and in real time, regardless of format or source. It indexes source data from Salesforce, SharePoint, databases, file shares, Exchange, Dropbox, Lithium, Gmail, etc.

Expertise finding - dynamically, through context and topics, from internal colleagues to external experts, Coveo locates people with experience relevant to each case and customer.

Customer is in the center - Coveo cuts across departmental and system silos and enriches cases with sales or engineering content, thus providing richer and more relevant customer interactions. Conversely, other departments benefit from information generated by agents to inform product development and sales.

Virtual interaction - consolidates all customer and prospect communication and interactions from any channel, bringing together opportunities, cases, transactions, e-mails, events, calls, tweets, etc.

Customization - The intuitive admin interface enables customization of any objects and combinations of information, including custom fields.