Imagine using faulty information in creating a building design or developing a product or running a political campaign or formulating a new drug. That's exactly what can happen – with devastating results – when hackers or other malfeasants infiltrate an organization and corrupt its data.
To ensure the highest degree of data security, the integrity of electronic records must be reliable across their entire lifecycle – from the time a file is generated through transference and its storage in databases or archives. Complying with the alphabet soup of regulatory requirements regarding data use and protection, while a thorn in the side of most organizations, can go a long way in safeguarding critical information from malicious threats. But mushrooming data volumes, confusion over how to execute a compliance plan, and uncertainty over who's responsible can make compliance an overwhelming task for any organization.
Access to data is no longer restricted to, and controlled by, a few departments of an organization, says Ravi Rao, senior vice president, pre-sales at Infogix. The volume and variety of data that is increasingly available, as well as the need for rapid decision-making in this data-aware world, have necessitated broad access to, and dissemination of, data and information – not only within the organization but to the consumer as well, he explains.
"This has placed a bigger emphasis on data integrity than ever before," Rao says. "Organizations are being forced to be more accountable for the data content, mainly because there are so many more eyes examining and analyzing the vast amounts of data."
This means, he says, that it is no longer sufficient to have a reactive approach to data integrity. "It will negatively and tangibly impact the business. In addition, it may be too expensive or worse, not even feasible to try to fix data integrity issues after the fact." So, he adds, organizations are seeing the value of building in data integrity and quality as a consideration in the design/build/deploy phase.
"If we agree that data integrity means, in essence, data value trustworthiness (e.g., I entered the address on Tuesday, and I can trust that two years from now, that address will still be exactly as I entered it), then I would say we have a long way to go," says John Avellanet, managing director and principal, Cerulean Associates, a Williamsburg, Va.-based consultancy and the author of several books on compliance.
Michael Angelo, chief security architect, Micro Focus
John Avellanet, managing director and principal, Cerulean Associates
Lucas Moody, CISO, Palo Alto Networks
Ravi Rao, SVP pre-sales, Infogix
Josh Shaul, vice president of web security, Akamai
Oliver Tavakoli, chief technology officer, Vectra Networks
Michael Taylor, applications and product development lead, Rook Security
Part of that challenge is technological transformation, he says. "It's never very clear, if one moves data from System A to System B, whether the data – and its associated metadata – will be complete, accurate and consistent," says Avellanet. He points out that he is not referring here to PC to PC, but more like SQL to Oracle or Mac to PC, or SQL 2005 to SQL 2008. "That cannot be a certainty today and must have some level of testing associated with it to ensure that the characteristics that make up trustworthy data are still present."
The data and metadata must all still be present and available, complete, accurate and attributable in these instances, he says. "Let's face it, you can't throw a stick without hitting a story about lost or truncated data during a data move or migration, so this is still a very common issue across all industries."
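The kind of post-migration testing Avellanet describes can be as simple as comparing row counts and content fingerprints between the source and target systems. The sketch below is only illustrative – it uses in-memory SQLite databases as stand-ins for the two systems, and the `customers` table and its columns are hypothetical:

```python
import hashlib
import sqlite3

def table_fingerprint(conn, table):
    """Row count plus an order-independent digest of every row,
    so two tables can be compared without assuming row order."""
    rows = conn.execute(f"SELECT * FROM {table}").fetchall()
    digest = 0
    for row in rows:
        h = hashlib.sha256(repr(row).encode("utf-8")).hexdigest()
        digest ^= int(h, 16)  # XOR makes the combined digest order-independent
    return len(rows), digest

# Simulate a migration between two systems with SQLite stand-ins.
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
for conn in (src, dst):
    conn.execute("CREATE TABLE customers (id INTEGER, address TEXT)")
src.executemany("INSERT INTO customers VALUES (?, ?)",
                [(1, "12 Elm St"), (2, "9 Oak Ave")])
# Copy rows to the target, as a migration tool would.
dst.executemany("INSERT INTO customers VALUES (?, ?)",
                src.execute("SELECT * FROM customers").fetchall())

# The fingerprints should match; a truncated or altered row would not.
assert table_fingerprint(src, "customers") == table_fingerprint(dst, "customers")
```

A real migration between, say, SQL Server and Oracle would also need type-mapping checks, but the principle – independently fingerprint both sides and compare – is the same.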
Part of that challenge is technological obsolescence, he says, as we don't know that a PDF or a Word doc or a .TIF or an MPEG-2 or -4 file is going to be readable 10 years from now, much less 30 years from now. And yet, he adds, many regulations and statutes around the world call for data retention for years – in some cases as long as 25 years.
"If I wrote a report in the most popular word processing software in 1994, WordPerfect, and went to open it today in the most popular word processing software today, Microsoft Word, could I open it?," he asks. "Likely no. Microsoft stopped supporting the WordPerfect file format years ago. And yet millions of companies have WordPerfect files they are required to retain sitting in their archives or on their networks today."
The challenge, Avellanet says, is how would one produce that record in court or in a government investigation. "Well, without significant cost, you wouldn't. And can you even guarantee that the file hasn't corrupted over the decades? So there is a disconnect between what our regulators expect/require and what we can actually produce."
When it comes to safeguarding databases or email or cloud implementations, Avellanet believes it boils down to at least four elements:
Technological or automated controls – This is everything from periodic data integrity checks run by systems to checksums, to security controls, to good disaster recovery backup procedures, to good long-term electronic data archival processes (and making sure not to confuse short-term, disaster recovery backups with long-term data archives), and so on.
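One of the automated controls mentioned above – a periodic checksum check on archived files – can be sketched in a few lines. This is a minimal illustration, not any specific product's implementation; the file name and contents are hypothetical:

```python
import hashlib
import tempfile
from pathlib import Path

def file_digest(path):
    """SHA-256 of a file, read in chunks so large archives are fine."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# Record a digest when the file enters the archive...
archived = Path(tempfile.mkdtemp()) / "report.pdf"
archived.write_bytes(b"original report contents")
recorded = file_digest(archived)

# ...then a periodic job recomputes and compares it.
assert file_digest(archived) == recorded        # still intact
archived.write_bytes(b"silently corrupted")
assert file_digest(archived) != recorded        # bit rot or tampering detected
```

The recorded digests would themselves need to be stored somewhere tamper-evident, which is why this control works alongside, not instead of, the procedural and contractual controls below.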
Manual or procedural controls – You have to have policies and procedures, everything from a good data integrity practices policy to a procedure on sampling long-term data archives for stability and integrity, a procedure on scanning paper documents to turn them into certified or true copy digital documents, etc.
Contractual controls – What happens to my data if the hosting company goes bankrupt? If they get bought and the purchaser of the hosting company decides to shut off the servers and doesn't realize my data is on them? Is the hosting provider allowed to sub-contract my data off to another sub-contracted hosting company? Is my data allowed to leave the EU or North America? Am I allowed to audit the hosting site? Frankly, this is not an area of expertise or much knowledge for most corporate lawyers, so it's imperative to get your legal department involved sooner rather than later because they'll likely need to go to outside counsel for advice and expertise, and that takes time.
Ongoing monitoring and trending – There is obviously a lot that goes on here. The key is understanding that when you outsource your data to someone else, you still retain the original risk and you add risk, because now you need ongoing oversight and monitoring into how your data is being managed by a firm that might be on the other side of the globe, that might not speak your language, etc.
Let's assume big data can be broken into two areas, PII and technical, says Michael Angelo, chief security architect, Micro Focus. "The current attack focus is not about data integrity as much as being about data access. The direct target (currently) is theft for financial gain. If we look at theft, it has yet to evolve to a point where your data is modified as an attack."
This having been said, imagine PII being modified for impersonation or as an offensive – say to disallow you access to your identity, Angelo says. "Or, what would happen if the data specification for a product were to be modified? Could it be made to not function or simply fail? What happens if the decisions, derived from big data analysis, were derived from faulty data? They would be incorrect and any subsequent actions would not necessarily be correct," he says.
We might be naturally inclined to assume that each time data changes hands, it loses some of its original integrity, says Lucas Moody, CISO at Palo Alto Networks, a Santa Clara, Calif.-based network and enterprise security company. "Many of us grew up playing the game “telephone,” where a message is relayed around a circle until it returns to the original source, where it is validated or debunked. Usually, someone along the sequence hears the message wrong, or intentionally changes it to make the outcome of the game more enjoyable for all. That's people."
In the arena of data and information systems, Moody says we're talking about machines that rely on the accuracy and structure of data to perform their intended functions to achieve the desired outcome. "Machines only change data if we tell them to in support of a desired outcome, or if someone has intentionally manipulated the data to change the outcome."
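The intentional-manipulation case Moody raises has a standard machine-world countermeasure: attach a keyed message authentication code (MAC) to data before it changes hands, so the receiver can detect any alteration. A minimal sketch – the shared key and message are purely illustrative; real deployments would use managed keys rather than a hard-coded secret:

```python
import hashlib
import hmac

secret = b"shared-key"  # illustrative only; use a managed key in practice
message = b'{"address": "12 Elm St"}'
tag = hmac.new(secret, message, hashlib.sha256).hexdigest()

# The receiver recomputes the tag; an unmodified message verifies...
ok = hmac.compare_digest(
    tag, hmac.new(secret, message, hashlib.sha256).hexdigest())

# ...while even a one-character change breaks verification.
tampered = b'{"address": "13 Elm St"}'
bad = hmac.compare_digest(
    tag, hmac.new(secret, tampered, hashlib.sha256).hexdigest())

assert ok and not bad
```

Unlike the game of telephone, the relay points here cannot silently "hear the message wrong" – any change, accidental or malicious, is caught at verification time.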
Rao at Infogix agrees that the increasing availability of data is creating never-ending headaches for those responsible for securing the data. "As we have seen in the last few years, systems are being hacked and compromised with alarming frequency," he says. "Cyber, network, system and data security will continue to be enduring challenges. There are established approaches and standards that one can and must employ to safeguard databases, emails and cloud implementations. As these evolve over time, it is important to stay updated in security policies."
Expertise is widely available to help create, deploy and maintain such policies, Rao says. However, he adds, an often overlooked aspect of safeguarding data, whether in the cloud or not, is being able to organize it in the right way, both logically and physically, to prevent unnecessary overlap of data across necessary boundaries. "For example, having multi-tenant capability on top of “clean,” verified data allows for separation of data such that one group/department does not access data that they do not need access to, without requiring the unnecessary cost overheads and other inefficiencies of physical separation of data," Rao says. "Again, such approaches to safeguarding data are highly dependent on the integrity of the data."
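The multi-tenant separation Rao describes can be pictured as a row-level filter on shared storage: every read path tags rows with a tenant (here, a department) and filters on it, so no group sees another's data without physically separate stores. This toy sketch is only an illustration of the idea – the record layout and department names are hypothetical:

```python
# Hypothetical shared store where every row is tagged with its tenant.
RECORDS = [
    {"tenant": "claims",  "id": 1, "payload": "claim A"},
    {"tenant": "billing", "id": 2, "payload": "invoice B"},
    {"tenant": "claims",  "id": 3, "payload": "claim C"},
]

def records_for(tenant):
    """Every read goes through this filter, so one department never
    sees another's rows even though the storage is shared."""
    return [r for r in RECORDS if r["tenant"] == tenant]

# Claims sees only its own rows; billing sees only its own.
assert [r["id"] for r in records_for("claims")] == [1, 3]
assert all(r["tenant"] == "billing" for r in records_for("billing"))
```

As Rao notes, this kind of logical separation is only as trustworthy as the tenant tags themselves – which is exactly why it depends on the integrity of the underlying data.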