Tuesday, December 30, 2014

Latest Applications in Enterprise Search

In my previous post, I described the future of enterprise search. In this post, I will describe a few new search applications that could be interesting.

Concept Searching

Founded in 2002, Concept Searching provides software products that deliver automatic semantic metadata generation, auto-classification, and powerful taxonomy management tools. Concept Searching is the only platform-independent statistical metadata generation and classification software company in the world that uses concept extraction and compound term processing to significantly improve access to unstructured information. The Concept Searching Microsoft suite of technologies runs in all versions of SharePoint, Office 365, and OneDrive for Business.

The technologies are being used to improve search outcomes, deploy an enterprise metadata repository, enable effective records management, identify and secure sensitive information, improve governance and compliance, support social tagging, collaboration, and text analytics, facilitate eDiscovery, and drive intelligent migration.

Concept Searching, developer of the Smart Content Framework™, provides organizations with a method to mitigate risk, automate processes, manage information, protect privacy, and address compliance issues. This infrastructure framework utilizes a set of technologies that encompasses the entire portfolio of unstructured information assets, resulting in increased organizational performance and agility.

Lexalytics, Inc.

Lexalytics provides enterprise and hosted text analytics software to transform unstructured text into structured data. The software extracts entities (people, places, companies, products, etc.), sentiment, quotes, opinions, and themes (generally noun phrases) from text. Text is considered unstructured data, which comprises somewhere between 31% and 85% of what is stored in any given enterprise.

Lexalytics is an OEM vendor of text analytics and sentiment analysis technology for social media monitoring, brand management, and voice-of-customer industries. The software uses natural language processing technology to extract the above-mentioned items from social media and forums; the voice of the customer in surveys, emails, and call-center feedback; traditional media; pharmaceutical research and development; internal enterprise documents; and other sources.

Lexalytics provides a text mining engine that a number of search partners, such as Coveo, Playence, and Oracle, use to add metadata to their search. This is additional intelligence around "just what do those words actually mean?" In other words, the engine boosts the value of search by putting more information into the index. This enables other applications and helps search be "smarter".


MaxxCAT

MaxxCAT provides enterprise search solutions for corporate intranets, web sites, databases, file systems and applications, and other environments that require rapid document retrieval from multiple data sources. The flagship products offered by MaxxCAT are the SB-250 series and the EX-5000 series network search appliances. Also available is a series of cloud-enabled storage appliances.

Basis Technology

Founded in 1995, this software company specializes in applying artificial intelligence techniques to understanding documents written in different languages. Their software enhances parsing tools by classifying the role of words and provides metadata on the role of words to other algorithms. Software from Basis Technology will, for instance, identify the language of an incoming stream of characters and then identify the parts of each sentence like the subject or the direct object.

The company is best known for its Rosette Linguistics Platform, which uses natural language processing techniques to improve information retrieval, text mining, search engines, and other applications. Major search engines and translators use the tool to create normalized forms of text. Basis Technology software is also used by forensic analysts to search through files for words, tokens, phrases, or numbers that may be important to investigators.


Founded in 1991, this company specializes in text retrieval software. Its current range of software includes products for enterprise desktop search, Intranet/Internet spidering and search, and search engines for developers (SDK) to integrate into other software applications.

LTU technologies

Founded in 1999, this company is in the field of image recognition for commercial and government customers. The company provides technologies for image matching, similarity, and color search for integration into applications for mobile, media intelligence and advertisement tracking, ecommerce and stock photography, brand and copyright protection, law enforcement, and more.

Sematext Group, Inc.

This company's product, Site Search Analytics (SSA), continuously monitors, measures, and improves the search experience. It identifies top queries, problematic zero-hit queries, common misspellings, etc. It measures and compares search relevance and improves conversion rates. It is available on-premises and in the cloud.
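To make the metrics above concrete, here is a minimal sketch of how top queries and zero-hit queries could be computed from a search log. It is not Sematext's implementation; the log format (pairs of query text and hit count), the function name `summarize_search_log`, and the sample data are all assumptions made for illustration.

```python
from collections import Counter


def summarize_search_log(log, top_n=3):
    """Summarize a search log given as (query, hit_count) pairs.

    The log format is hypothetical; a real product would read it
    from the search engine's own query logs.
    """
    query_counts = Counter()
    zero_hit_counts = Counter()
    for query, hits in log:
        # Fold case and collapse whitespace so trivial variants count together.
        normalized = " ".join(query.lower().split())
        query_counts[normalized] += 1
        if hits == 0:
            zero_hit_counts[normalized] += 1
    total = sum(query_counts.values())
    zero_total = sum(zero_hit_counts.values())
    return {
        "top_queries": query_counts.most_common(top_n),
        "zero_hit_queries": zero_hit_counts.most_common(top_n),
        "zero_hit_rate": zero_total / total if total else 0.0,
    }


# Made-up sample data: "vacaton policy" is a misspelling that returns nothing.
sample_log = [
    ("vacation policy", 12),
    ("Vacation  Policy", 12),
    ("vacation policy", 12),
    ("expense report", 7),
    ("vacaton policy", 0),
    ("vacaton policy", 0),
    ("org chart", 3),
]
report = summarize_search_log(sample_log)
```

Zero-hit queries that closely resemble a popular query, like the misspelling in the sample data, are the usual candidates for synonyms or spelling suggestions.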


This is a privately held software company which was founded in 2000 in Konstanz, Germany, with an additional office in the United Kingdom (Bristol). The company develops intelligent software for search and analysis of structured and semi-structured data.

Their product MatchMaker is the leading error-tolerant search & match platform for huge master data volumes. The multiple award-winning software technology thinks, searches and finds like a human – but dramatically faster, in much more complex configurations and with no serious data restriction using keys or similar methods. It is available on-premises and in the cloud.

Federal authorities, insurance agencies, ICT firms, and others use this software for identity resolution in diverse, data-intensive business processes such as input management, enterprise search, and data quality. It is easy to customize and integrate.
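As a rough illustration of what error-tolerant matching against master data means, the sketch below scores a misspelled query against a small list of records using Python's standard `difflib.SequenceMatcher`. This is a generic similarity technique, not the MatchMaker algorithm; the `best_match` helper, the 0.8 threshold, and the sample names are assumptions.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace before comparing."""
    return " ".join(text.lower().split())


def best_match(query, records, threshold=0.8):
    """Return (record, score) for the most similar record, or None.

    The score is SequenceMatcher's ratio in [0, 1]; the threshold is an
    arbitrary choice for this sketch.
    """
    if not records:
        return None
    q = normalize(query)
    scored = [(SequenceMatcher(None, q, normalize(r)).ratio(), r) for r in records]
    score, record = max(scored)
    return (record, score) if score >= threshold else None


# Made-up master data and a query containing a typo.
master_data = ["Jonathan Meyer", "Joanna Mayer", "John Miller"]
match = best_match("Jonathon Meyer", master_data)
no_match = best_match("Zebra Crossing", master_data)
```

Comparing the query against every record, as done here, only works for small lists; platforms built for huge data volumes rely on specialized index structures instead.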


Founded in 2005, this company provides enterprise semantic search technology based on artificial intelligence and natural language processing. It offers intuitive search solutions and intelligent content support for websites and corporate intranets.

Content Analyst Company

This is a privately held software company which develops concept-aware text analytics software called CAAT, which is licensed to software product companies for use in eDiscovery. In 2013, five CAAT-powered products were named in the Gartner eDiscovery Magic Quadrant Report, and the analyst firm 451 Group referred to CAAT as The Hottest Product in eDiscovery.

Content Analyst's CAAT analytics software is a machine learning system based on latent semantic indexing technology. CAAT provides several text analytics capabilities using both supervised and unsupervised learning methods, including concept search, categorization, conceptual clustering, email conversation threading, language identification, near-duplicate identification, auto summarization, and difference highlighting.
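Of the capabilities listed above, near-duplicate identification is the easiest to illustrate in a few lines. The sketch below uses word shingles and Jaccard similarity, a common textbook approach; it is not CAAT's method, which is based on latent semantic indexing. The function names, the shingle size, and the sample texts are assumptions.

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


# Made-up documents: the first two differ by a single word.
doc_a = "the quarterly report is due on friday at noon"
doc_b = "the quarterly report is due on monday at noon"
doc_c = "please book a meeting room for the design review"

similar = jaccard(doc_a, doc_b)
unrelated = jaccard(doc_a, doc_c)
```

A higher score means more shared phrasing; pairs above a chosen cutoff would be flagged as near-duplicates for review.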


SearchYourCloud

With SearchYourCloud and its patented, federated search technology, a single search request in Outlook simultaneously and transparently searches your email, desktop and all of your cloud storage sources and delivers highly targeted results. You get exactly the information you need with just one query.


Docurated

Docurated aggregates all your documents in one place, turning them into a searchable and customizable database. Docurated will now provide Dropbox integration as well. It accelerates sales in companies looking for fast growth by making the best marketing content readily available to Sales around the world. Docurated works with your existing content stores and uses machine learning to enable your team to find and re-use the most effective content with no manual tagging or uploading.

Docurated is a next-generation visual knowledge management platform that solves the information retrieval problem for leading companies like Clorox, Omnicom, Netflix, Weather Channel, and many others. It enables sales, marketing, and technology teams to surface and use the exact chart or slide they need, no matter where it is stored, without slogging through folders and files. Docurated seamlessly integrates with existing folder-based repositories.


Apache Lucene

Apache Lucene is a free, open-source information retrieval software library. It is supported by the Apache Software Foundation and is released under the Apache License. While suitable for any application that requires full-text indexing and searching capability, Lucene has been widely recognized for its utility in the implementation of Internet search engines and local, single-site searching.

At the core of Lucene's logical architecture is the idea of a document containing fields of text. This flexibility allows Lucene's API to be independent of the file format. Text from PDFs, HTML, Microsoft Word, and OpenDocument documents, as well as many others (except images), can all be indexed as long as their textual information can be extracted.
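The idea of a document as a set of named text fields can be sketched without Lucene itself. The example below is a toy inverted index in Python, not Lucene's Java API; the `TinyIndex` class, its methods, and the sample documents are hypothetical. It shows why the file format does not matter: once the text has been extracted, the index only sees field names and terms.

```python
from collections import defaultdict


class TinyIndex:
    """A toy inverted index over documents made of named text fields."""

    def __init__(self):
        # Maps (field, term) to the set of ids of documents containing it.
        self.postings = defaultdict(set)
        self.documents = []

    def add(self, fields):
        """Index a document given as {field_name: extracted_text}."""
        doc_id = len(self.documents)
        self.documents.append(fields)
        for field, text in fields.items():
            for term in text.lower().split():
                self.postings[(field, term)].add(doc_id)
        return doc_id

    def search(self, field, term):
        """Return sorted ids of documents whose field contains term."""
        return sorted(self.postings.get((field, term.lower()), set()))


# The text could have come from a PDF, an HTML page, or a Word file;
# the index does not need to know.
index = TinyIndex()
index.add({"title": "Annual Report", "body": "Revenue grew in 2014"})
index.add({"title": "Meeting Notes", "body": "Discuss the annual budget"})

title_hits = index.search("title", "annual")
body_hits = index.search("body", "annual")
```

Searching per field lets the same word match in a title but not in a body, or the other way around, which is the flexibility the field model provides.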

These are just a few of the search applications currently on the market. There are many others. Choosing the right application depends on your organization's requirements.
