Tuesday, March 25, 2014

Search Applications - Vivisimo

Vivisimo was a privately held technology company that developed search engines. Its product, Velocity, provided federated search and document clustering. Vivisimo's public web search engine, Clusty, was a metasearch engine with document clustering; it was sold to Yippy, Inc. in 2010.

The company was acquired by IBM in 2012, and the Vivisimo Velocity Platform is now IBM InfoSphere Data Explorer. The product stays true to its heritage of providing federated navigation, discovery and search over a broad range of enterprise content. It covers a broad range of data sources and types, both inside and outside an organization.

In addition to the core indexing, discovery, navigation and search engine, the software includes a framework for developing information-rich applications. These applications deliver a comprehensive, contextually relevant view of any topic to business users, data scientists, and a variety of targeted business functions.

InfoSphere Data Explorer solutions improve return on all types of information, including structured data in databases and data warehouses, unstructured content such as documents and web pages, and semi-structured information such as XML.

InfoSphere Data Explorer provides analytics on text and metadata that can be accessed through its search capabilities. Its focus on scalable but secure search is part of why it became one of the leaders in enterprise search. The software’s security features are critical, because organizations do not want better search to make it easier for unauthorized users to reach information.

Also key is the platform’s flexibility in integrating sources across the enterprise. It supports mobile devices such as smartphones as well, which makes it simpler to access information from any platform.

Features and benefits

1. Secure, federated discovery, navigation and search over a broad range of applications, data sources and data formats.
  • Provides access to data stored in a wide variety of applications and data sources, both inside and outside the enterprise, including content management, customer relationship management, supply chain management, email, relational database management systems, web pages, networked file systems, data warehouses, Hadoop-based data stores, columnar databases, cloud and external web services.
  • Includes federated access to non-indexed systems such as premium information services, supplier or partner portals and legacy applications through the InfoSphere Data Explorer Query Routing feature.
  • Relevance model accommodates diverse document sizes and formats while delivering more consistent search and navigation results. Relevance parameters can be tuned by the system administrator.
  • Security framework provides user authentication and observes and enforces the access permissions of each item at the document, section, row and field level to ensure that users can only view information they are authorized to view in the source systems.
  • Provides rich analytics and natural language processing capabilities such as clustering, categorization, entity and metadata extraction, faceted navigation, conceptual search, name matching and document de-duplication.
2. Rapid development and deployment framework to enable creation of information-rich applications that deliver a comprehensive view of any topic.
  • InfoSphere Data Explorer Application Builder enables rapid deployment of information-centric applications that combine information and analytics from multiple sources for a comprehensive, contextually-relevant view of any topic, such as a customer, product or physical asset.
  • Widget-based framework enables users to select the information sources and create a personalized view of information needed to perform their jobs.
  • Entity pages enable presentation of information and analytics about people, customers, products and any other topic or entity from multiple sources in a single view.
  • Activity Feed enables users to "follow" any topic, such as a person, company or subject, and receive the most current information, as well as post comments and view comments posted by other users.
  • Comprehensive set of Application Programming Interfaces (APIs) enables programmatic access to key capabilities as well as rapid application development and deployment options.
3. Distributed, highly scalable architecture to support large-scale deployments and big data projects.
  • Compact, position-based index structure includes features such as rapid refresh, real-time searching and field-level updates.
  • Updates can be written to indices without taking them offline or re-writing the entire index, and are instantly available for searching.
  • Provides highly elastic, fault-tolerant, vertical and horizontal scalability, master-master replication and "shared nothing" deployment.
4. Flexible data fusion capabilities to enable presentation of information from multiple sources.
  • Information from multiple sources can be combined into "virtual documents" that present it as a single item.
  • Large documents can be automatically divided into separate objects or sub-documents that remain related to a master document for easier navigation and comprehension by users.
  • Enables creation of dynamic "entity pages" that allow users to browse a comprehensive, 360-degree view of a customer, product or other item.
5. Collaboration features to support information-sharing and improved re-use of information throughout the organization.
  • Users can tag, rate and comment on information.
  • Tags, comments and ratings can be used in searching, navigation and relevance ranking to help users find the most relevant and important information.
  • Users can create virtual folders to organize content for future use and optionally share folders with other users.
  • Navigation and search results can return pointers to people to enable location of expertise within an organization and encourage collaboration.
  • Shared Spaces allow users to collaborate about items and topics that appear in their individualized views.
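
To make the first feature group more concrete, here is a minimal sketch of federated search with document-level security trimming and de-duplication. It is not the InfoSphere Data Explorer API: the `Document` class, the `federated_search` function, the group names and the sample data are all hypothetical, and the scoring is a simple term count rather than a real relevance model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    source: str
    text: str
    allowed_groups: frozenset  # groups allowed to read it in the source system


def search_source(docs, query):
    """Naive per-source search: score = number of query terms found in the text."""
    terms = query.lower().split()
    hits = []
    for doc in docs:
        words = doc.text.lower().split()
        score = sum(1 for term in terms if term in words)
        if score > 0:
            hits.append((score, doc))
    return hits


def federated_search(sources, query, user_groups):
    """Query every source, drop what the user may not read, merge by score."""
    merged = []
    seen_texts = set()
    for docs in sources.values():
        for score, doc in search_source(docs, query):
            if not (doc.allowed_groups & user_groups):
                continue  # security trimming: user has no matching group
            if doc.text in seen_texts:
                continue  # crude de-duplication on exact text
            seen_texts.add(doc.text)
            merged.append((score, doc))
    # Highest score first; ties broken by document id for a stable order.
    merged.sort(key=lambda pair: (-pair[0], pair[1].doc_id))
    return [doc.doc_id for _, doc in merged]


# Hypothetical sample data: two sources holding overlapping content.
SOURCES = {
    "crm": [
        Document("crm-1", "crm", "acme renewal contract notes", frozenset({"sales"})),
    ],
    "files": [
        Document("file-1", "files", "acme contract draft", frozenset({"legal"})),
        Document("file-2", "files", "acme renewal contract notes",
                 frozenset({"sales", "legal"})),
    ],
}

if __name__ == "__main__":
    print(federated_search(SOURCES, "acme contract", frozenset({"sales"})))
    print(federated_search(SOURCES, "acme contract", frozenset({"legal"})))
```

Permissions are checked before de-duplication on purpose, so a copy the user cannot see never hides a copy the user is allowed to see.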

Thursday, March 13, 2014

Compliance With Privacy Regulations

Recently, high-profile cases involving breaches of privacy revealed the ongoing need to ensure that personal information is properly protected. The issue is multidimensional, involving regulations, corporate policies, reputation concerns, and technology development.

Organizations often have an uneasy truce with privacy regulations, viewing them as an obstacle to the free use of information that might help the organization in some way.

But like many compliance and governance issues, managing privacy will offer benefits, protecting organizations from breaches that violate laws and damage an organization's reputation. Sometimes the biggest risks in privacy compliance arise from the failure to take some basic steps. A holistic view is beneficial.

Privacy Compliance Components

Rather than being in conflict with business objectives, privacy should be fully integrated with them. Privacy management should be part of the knowledge management program.

An effective privacy management program has three major components: establishing clear policies and procedures, following those procedures so that the organization's operations comply with the policies, and providing oversight to ensure accountability. Questions to consider include: Is data being shared with third parties? Why is the information being collected? What is being done with it?

Expertise about privacy compliance varies widely across industries, corresponding to some degree with the size of an organization. Although large companies are far from immune to privacy violations, they might at least be aware and knowledgeable about the issue.

The biggest mistake that organizations make in handling privacy is to collect data without a clear purpose. You should know not just how you are protecting personal information but also why you are collecting it. It is important for organizations to identify and properly classify all their data.

International Considerations

Increasingly, organizations must consider the different regulations that apply in countries throughout the world, as well as the fact that the regulations are changing. For example, on March 12, 2014, the Australian Privacy Principles (APPs) replaced the previous National Privacy Principles and Information Privacy Principles.

The new principles apply to all organizations, whether public or private, and contain a variety of requirements, including open and transparent management of personal information. Of particular relevance to global companies are the principles on the use and disclosure of personal information for direct marketing, and on cross-border disclosure of personal information.

It is important to consider the regulations of every country in which an organization has operations.

Technology Role

The market for privacy management software products is still relatively small, but it is expected to grow rapidly over the coming years. The current reform process for data protection has created a need for privacy management technology.

Products from companies such as Compliance 360 automate the process of testing the risk of data breaches, which is required for the audits mandated by the American Recovery and Reinvestment Act of 2009 (the economic stimulus act). This act expanded the requirements of the Health Insurance Portability and Accountability Act (HIPAA) of 1996 through its Health Information Technology for Economic and Clinical Health (HITECH) provisions.

These provisions include increased requirements for patient confidentiality and new levels of enforcement and penalties. In the absence of suitable software products, organizations must carry out the required internal audits and other processes manually, which is time-consuming and error-prone.

Enterprise content management (ECM), business process management (BPM) and business intelligence (BI) technologies have an important role in privacy compliance because content, processes, and reporting are critical aspects of managing sensitive information.

As generic platforms, they can be customized, which has both advantages and disadvantages. They have a broad reach throughout the enterprise and can be used for many applications beyond privacy compliance. However, they are generally higher priced and require development work before they can support privacy compliance.

Privacy in the Cloud

Cloud applications and data storage have raised concerns about security in general, and about personally identifiable information (PII) in particular. Although many customers of cloud services have concluded that cloud security is as good as or better than the security they provide in-house, the idea that personally identifiable information could be "out there" is unsettling.

PerspecSys offers a solution for handling sensitive data used in cloud-based applications that allows storage in the cloud while filtering out personal information and replacing it with an indecipherable token or encrypted value.

The token or encrypted value takes the place of the sensitive data in the cloud-based application. When that token or encrypted value comes back from the cloud, the "real" data is looked up in local storage. Thus, even though the application is in the cloud, the sensitive information is neither stored in the cloud nor viewable there. It physically resides behind the firewall and can only be seen from there.
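
The general idea can be sketched in a few lines of Python. This is only an illustration of the token variant of the technique, not PerspecSys's implementation: the `TokenVault` class, the `to_cloud` and `from_cloud` helpers, and the choice of sensitive fields are assumptions made for the example, and a real deployment would need persistent, access-controlled storage for the vault.

```python
import secrets


class TokenVault:
    """Keeps real values on-premises and hands out opaque tokens in their place."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value):
        # Reuse the token for a repeated value so cloud-side matching still works.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)  # random, reveals nothing about value
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token):
        # Strings that are not known tokens are returned unchanged.
        return self._token_to_value.get(token, token)


# Hypothetical list of fields that must never leave the local network.
SENSITIVE_FIELDS = {"name", "email"}


def to_cloud(record, vault):
    """Replace sensitive fields with tokens before the record is sent out."""
    return {key: vault.tokenize(value) if key in SENSITIVE_FIELDS else value
            for key, value in record.items()}


def from_cloud(record, vault):
    """Swap tokens back for the real values once the record is behind the firewall."""
    return {key: vault.detokenize(value) if key in SENSITIVE_FIELDS else value
            for key, value in record.items()}


if __name__ == "__main__":
    vault = TokenVault()
    customer = {"name": "Jane Doe", "email": "jane@example.com", "plan": "gold"}
    stored_in_cloud = to_cloud(customer, vault)
    print(stored_in_cloud)                     # name and email are tokens
    print(from_cloud(stored_in_cloud, vault))  # original record restored locally
```

Because the tokens are random rather than derived from the data, anyone who sees only the cloud copy has nothing to decrypt; the mapping exists solely in the local vault.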

This feature is especially useful in an international context where data residency and sovereignty requirements often specify that data needs to stay within a specific geographic area.

Challenges for Small Organizations

Small to medium-sized organizations generally do not have a dedicated compliance or privacy officer, and may be at a loss as to where to start.

Information Shield provides a set of best practices including a policy library with prewritten policies, detailed information on U.S. and international privacy laws, checklists and templates, as well as a discussion of the Organization for Economic Co-operation and Development (OECD) Fair Information Principles. Those resources are aimed at companies that may not have privacy policies in place but need to do so to provide services to larger healthcare or financial services organizations.

Among the resources is a list of core privacy principles based on OECD principles. Each principle has a question, brief discussion and suggested policy. For example, the purpose specification principle states, "The purposes for which personal information is collected should be specified no later than the time of data collection, and the subsequent use should be limited to fulfilling those purposes or such others that are specified to the individuals at the time of the change of purpose." The discussion includes comments on international laws and a citation of several related rulings.
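
For teams that want to enforce the purpose specification principle in their own systems, one possible starting point is to record the disclosed purposes next to each data item and check every use against them. The sketch below is only an illustration and is not taken from the Information Shield material; the `DataItem` class and the purpose names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataItem:
    name: str
    # Purposes disclosed to the individual at collection time.
    purposes: set = field(default_factory=set)

    def add_purpose(self, purpose, individual_notified):
        """Record a new purpose only if the individual was told about the change."""
        if not individual_notified:
            raise ValueError(
                f"cannot add purpose {purpose!r} without notifying the individual"
            )
        self.purposes.add(purpose)

    def may_use_for(self, purpose):
        """Use is allowed only for purposes that have been specified."""
        return purpose in self.purposes


if __name__ == "__main__":
    email = DataItem("email", {"order_confirmation"})
    print(email.may_use_for("order_confirmation"))  # specified at collection
    print(email.may_use_for("direct_marketing"))    # never specified
    email.add_purpose("direct_marketing", individual_notified=True)
    print(email.may_use_for("direct_marketing"))    # allowed after notification
```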

Plans for the Future

Business users and consumers alike have become accustomed to the efficiency and speed of digital data. However, stricter regulations are inevitable. Organizations should become more aware of the need to prevent privacy breaches and make sure they have the systems in place to do so. Companies should also be concerned about reputation damage, which can severely affect business. Along with reliable technology, the best way forward is to follow best practices with respect to data privacy. Technology is essential, but it also has to be supported by people and processes.