Data at the average enterprise grows 40 to 60 percent per year, meaning more email, contracts, spreadsheets and miscellaneous unstructured documents on top of years, if not decades, of unmanaged unstructured data.
Mixed in with the lunch menus, vacation photos and now-useless work communications are legal documents, contracts, ambiguous email and probably some personally identifiable information, all prime data for breaches, lawsuits and regulatory repercussions.
Finding each sensitive document and email in a massively growing data center is near impossible, so dealing with all of them requires a simpler strategy: stop keeping everything forever, and properly manage what you do need to keep.
Less data means less risk, so to mitigate data risk, simply reduce Big Data. IT spends a significant part of its budget storing legacy backup tapes in offsite vaults, replicating content in archives, increasing server capacity and managing desktop storage to ensure everything is available and secure.
Nothing is ever deleted, and as regulations, employees and technology change, this aged unstructured data becomes a liability.
Consider companies with sales or customer service departments. Chances are that somewhere in their email accounts, a customer has sent a credit card or Social Security number, or an employee has emailed a customer's information to a colleague for processing or to make a correction. This is not PCI compliant, and if that email server or backup tape were breached, the unmanaged data could lead to identity theft.
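To make the email scenario above concrete, here is a minimal Python sketch of how a scan might flag messages containing card or Social Security numbers. The patterns, the function names (`luhn_valid`, `find_sensitive`) and the decision to report only the last four digits are illustrative assumptions, not a description of any particular product, and a real scanner would need far more careful matching.

```python
import re

# Assumed, simplified patterns: 13-19 digits with optional spaces or hyphens,
# and the common NNN-NN-NNNN Social Security number layout.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")
SSN_CANDIDATE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive(text: str) -> list[tuple[str, str]]:
    """Return (kind, last four digits) for each likely card number or SSN."""
    findings = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        # The Luhn check filters out most order numbers and other digit runs.
        if luhn_valid(digits):
            findings.append(("card", digits[-4:]))
    for match in SSN_CANDIDATE.finditer(text):
        findings.append(("ssn", match.group()[-4:]))
    return findings
```

Running `find_sensitive` over each message body in a mailbox export would give a rough list of messages to secure or delete. Pattern matching of this kind misses some cases and flags others wrongly, which is part of why reducing the volume of retained data matters.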
Email containing personally identifiable information is not the only liability; email can also be misinterpreted, especially messages between employees who have since left the company. What was once an inside joke or venting session between employees can emerge later, be taken out of context as harassment, an admission or misconduct, and lead to lawsuits.
Then, of course, there are the ever-changing compliance regulations. Every sector has different rules governing the retention of documents and written communications. Regulations such as healthcare's Health Insurance Portability and Accountability Act of 1996 (HIPAA) and the financial industry's Truth in Lending Act (TILA) also govern what should be included in or excluded from these communications. Keeping files that predate regulations as-is, or keeping them past their retention periods, leaves the door open for compliance violations.
All of those risks could be greatly reduced if information that no longer holds business value didn't exist, and information that belongs in a proper archive was properly managed and secured. Solving sensitive data issues is as simple as IT changing its storage policy from “store everything” to “store and preserve only what you need.”
Start the process by classifying data into categories: sensitive, abandoned, duplicate, personal or active. Classifying data enables companies to understand what exists and what to do with it. Sensitive data needs to be secured or deleted according to requirements; abandoned or unaccessed data can be purged, likely along with duplicate and personal data. Less data is easier and less expensive to manage, track and account for. It's time for IT to protect its data by remediating legacy data on backup tapes, cleaning out user share servers and purging what it can. Changing the “keep everything forever” storage strategy can permanently reduce Big Data risks.
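As a rough illustration of the classification step described above, the following Python sketch sorts file records into the five categories. Everything specific in it is an assumption made for the example: the `FileRecord` fields, the three-year abandonment threshold, the list of "personal" file extensions, and the use of a Social Security number pattern as the only sensitivity test. An organization would substitute its own retention rules and detection methods.

```python
import os
import re
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical rules, chosen only for illustration.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
ABANDONED_AFTER = timedelta(days=3 * 365)
PERSONAL_EXTENSIONS = {".jpg", ".jpeg", ".png", ".mp3", ".mp4"}


@dataclass
class FileRecord:
    path: str
    content_hash: str          # e.g. a SHA-256 digest of the file contents
    last_accessed: datetime
    text: str = ""             # extracted text, if any


def classify(record: FileRecord, seen_hashes: set[str], now: datetime) -> str:
    """Assign one of: sensitive, duplicate, personal, abandoned, active."""
    # Sensitive content is checked first so it is never purged by a later rule.
    if SSN_PATTERN.search(record.text):
        return "sensitive"
    # A hash already seen means this is a copy of an earlier file.
    if record.content_hash in seen_hashes:
        return "duplicate"
    seen_hashes.add(record.content_hash)
    extension = os.path.splitext(record.path)[1].lower()
    if extension in PERSONAL_EXTENSIONS:
        return "personal"
    if now - record.last_accessed > ABANDONED_AFTER:
        return "abandoned"
    return "active"
```

The order of the checks is a design choice: testing for sensitive content before anything else means a file holding personal information is routed to "secure or delete according to requirements" even if it is also old or duplicated.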