A few of the ongoing issues include big data, master data management (MDM) and how to deal with unstructured data and records in unusual formats such as graph databases.
Records are kept for e-discovery, for compliance purposes, for their business value, and sometimes because no process has been implemented for systematically removing them. This can be a double-edged sword: getting rid of data makes IT nervous, but there are times when records should be dispositioned.
Data stored in data lakes is largely uncontrolled and typically has not had data cleanup processes applied to it. Data quality measures for big data repositories are usually not applied until someone actually wants to use the data.
Quality assurance might include making sure that duplicate records are dealt with appropriately, that inaccurate information is excluded or annotated and that data from multiple sources is being mapped accurately to the destination database or record. In traditional data warehouses, data is typically extracted, transformed and loaded (ETL). With a data lake, data is extracted (or acquired), loaded and then not transformed until required for a specific need (ELT).
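The difference between the two loading patterns can be sketched in a few lines of Python. This is a minimal illustration, not a real pipeline: the record fields, the `normalize` and `deduplicate` helpers, and the `DataLake` class are all hypothetical names chosen for the example, and duplicates are assumed to be identified by email address.

```python
Record = dict[str, str]

def normalize(record: Record) -> Record:
    """Trim whitespace and lowercase the email so duplicates compare equal."""
    return {
        "name": record["name"].strip(),
        "email": record["email"].strip().lower(),
    }

def deduplicate(records: list[Record]) -> list[Record]:
    """Keep the first record seen for each email address."""
    seen: set[str] = set()
    unique: list[Record] = []
    for record in records:
        if record["email"] not in seen:
            seen.add(record["email"])
            unique.append(record)
    return unique

def etl(source: list[Record]) -> list[Record]:
    """ETL: transform before loading, so the warehouse holds only clean rows."""
    return deduplicate([normalize(r) for r in source])

class DataLake:
    """ELT: load raw rows as-is and transform only when someone reads them."""

    def __init__(self) -> None:
        self.raw: list[Record] = []

    def load(self, source: list[Record]) -> None:
        self.raw.extend(source)  # no cleanup at load time

    def read_clean(self) -> list[Record]:
        return deduplicate([normalize(r) for r in self.raw])

# Hypothetical source data containing one duplicate customer.
source = [
    {"name": "Ana Ruiz ", "email": "ANA@example.com"},
    {"name": "Ana Ruiz", "email": "ana@example.com "},
    {"name": "Bo Chen", "email": "bo@example.com"},
]
warehouse = etl(source)
lake = DataLake()
lake.load(source)
```

In this sketch the warehouse never holds the duplicate, while the lake keeps all three raw rows and pays the cleanup cost only when `read_clean()` is called.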
MDM is a method for improving data quality by reconciling inconsistencies across multiple data sources to create a single, consistent and comprehensive view of critical business data. The master file is recognized as the best that is available and ideally is used enterprise-wide for analytics and decision making. But from a records management perspective, questions arise, such as what would happen if the original source data reached the end of its retention schedule.
As a practical matter, a record is information that is used to make a business decision, and it can be either an original set of data or a derivative record based on master data. Therefore the “golden record” that constitutes the best and most accurate information can become a persistent piece of data within a records management system.
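As a rough illustration of how a golden record might be assembled, the Python sketch below merges two source records for the same customer. The survivorship rule used here (take the newest non-empty value for each field) is only one assumed policy among many, and the source names, field names and `build_golden_record` function are hypothetical.

```python
from datetime import date

# Hypothetical source records for one customer; field names are illustrative.
crm = {"source": "crm", "updated": date(2023, 5, 1),
       "name": "Acme Corp.", "phone": "", "city": "Boston"}
billing = {"source": "billing", "updated": date(2024, 2, 10),
           "name": "ACME Corporation", "phone": "555-0100", "city": ""}

def build_golden_record(records, fields):
    """Pick, per field, the newest non-empty value and remember its source."""
    newest_first = sorted(records, key=lambda r: r["updated"], reverse=True)
    golden, lineage = {}, {}
    for field in fields:
        for record in newest_first:
            if record[field]:  # skip empty values
                golden[field] = record[field]
                lineage[field] = record["source"]
                break
    return golden, lineage

golden, lineage = build_golden_record([crm, billing], ["name", "phone", "city"])
```

Keeping the `lineage` mapping alongside the golden record shows which source systems each value depends on, which is the information needed when asking what happens once a source reaches the end of its retention schedule.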
Unstructured data challenge
A large percentage of records management efforts are oriented toward being ready for e-discovery.
This is more of a problem in the case of unstructured data than in MDM. MDM has gone well beyond the narrow structure of relational databases and is entering the realm of big data, but its roots are still in the world of structured databases with well-defined metadata classifications, which makes records management for such records a more straightforward process.
The challenge with unstructured data is to build out the semantics so that the content management or records management and data management components can work together. In the case of a contract, for example, the document might have many pieces of master data. It contains transactional data with certain values, such as product or customer information, and a specialist data steward or data librarian might be needed to tag and classify what data values are represented within that contract.
With both the content and the data classified using consistent semantics, it would be much simpler to bring intelligent parsing into the picture to bridge the gap between unstructured and structured data. Auto-classification of records can assist, although human intervention remains an essential element.
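A very simple form of such tagging can be sketched in Python as follows. This is an assumption-laden illustration rather than how any particular product works: the `MASTER_DATA` lists, the `tag_document` and `needs_human_review` functions, and the rule that a document missing any category goes to a data steward are all hypothetical.

```python
import re

# Hypothetical master data: customer and product names known to the organization.
MASTER_DATA = {
    "customer": ["Acme Corporation", "Globex"],
    "product": ["Widget Pro", "Gadget Mini"],
}

def tag_document(text):
    """Return the master data values found in the text, grouped by category."""
    tags = {}
    for category, values in MASTER_DATA.items():
        found = [v for v in values
                 if re.search(re.escape(v), text, flags=re.IGNORECASE)]
        if found:
            tags[category] = found
    return tags

def needs_human_review(tags):
    """Flag documents where any category is missing, so a steward can check them."""
    return any(category not in tags for category in MASTER_DATA)

contract = ("Agreement between Acme Corporation and the supplier "
            "for 500 units of Widget Pro.")
memo = "Memo regarding globex pricing."

contract_tags = tag_document(contract)
memo_tags = tag_document(memo)
```

Here the contract is tagged automatically with both a customer and a product, while the memo matches only a customer and would be routed to a person for review.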
Redundant, obsolete and trivial information constitutes a large portion, up to 80%, of the information stored in many organizations. The information generated by organizations needs to be under control whether it consists of official records or non-record documents with business value. Otherwise, it will accumulate and become completely unmanageable. On the other hand, if organizations aggressively delete documents, they run the risk of employees creating underground archives of information they don’t want to relinquish, which can pose significant risks. Companies need to approach this with a well-thought-out strategy.
The system should allow employees to easily save documents using built-in classification instead of a lot of manual tagging. It is important to make the system intuitive enough for any employee to use with just a few seconds of time and a few clicks of the mouse.
The value of good records management needs to be communicated in such a way that employees understand that it can actually help them with their work rather than being a burden. A well-designed system hides the complexity from users and puts it in the back end.
Studies of records management consistently show that only a minority of organizations have a retention schedule in place that would be considered legally acceptable and that some organizations have no retention schedule at all. Even if a schedule is in place, compliance is often poor.
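To make the idea of a retention schedule concrete, the Python sketch below checks whether a record is due for disposition. The retention periods, record types and the `disposition_due` function are hypothetical; real schedules are set by legal and compliance requirements, and the legal hold flag is included only as an assumed safeguard for e-discovery.

```python
from datetime import date

# Hypothetical retention periods in years, keyed by record type.
RETENTION_YEARS = {"contract": 7, "invoice": 5, "memo": 2}

def disposition_due(record_type, created, today, on_legal_hold=False):
    """Return True when a record has passed its retention period and is not on hold."""
    if on_legal_hold:
        return False  # records under legal hold are never dispositioned
    years = RETENTION_YEARS[record_type]
    try:
        expires = created.replace(year=created.year + years)
    except ValueError:
        # Created on Feb 29 and the expiry year is not a leap year.
        expires = created.replace(year=created.year + years, day=28)
    return today >= expires

today = date(2024, 6, 1)
memo_due = disposition_due("memo", date(2022, 1, 15), today)
contract_due = disposition_due("contract", date(2020, 1, 1), today)
held_memo_due = disposition_due("memo", date(2022, 1, 15), today, on_legal_hold=True)
```

A check like this only reports what the schedule says; whether records are actually removed still depends on the compliance practices described above.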
A strategy should be developed to reconcile the dilemma between keeping everything forever in order to extract business value from it and using records and information management to get rid of as much information as possible, as soon as possible.
From a business perspective, the potential upside of retaining corporate records so they can be used to gain insights into customer behavior, for example, may outweigh the apparent risks that result from non-compliance.
The highest value lies in a records management framework for understanding and classifying information so that its business value can be utilized.
If organizations view records management as a resource rather than a burden, it can contribute to their success. In many respects, the management of enterprise information is already becoming more integrated and less siloed. For example, most enterprise content management (ECM) systems now have records management functionality. The same classification technology used for e-discovery is also used for classification of enterprise content. Seeing records management as part of that environment, and recognizing its ability to enrich the understanding of business content as well as to ensure compliance, can support that integration.
Governance can be a unifying technique that provides a framework to encompass any type of information as it is created and managed. Governance is a set of multidisciplinary structures, policies and procedures to manage enterprise information in a way that supports an organization’s short-term and long-term operational and legal requirements. It is important to consider the impact of all forms of information, from big data to graph data. Within a comprehensive governance strategy, records management can succeed.