Big Data and security analytics collide
Conrad Constantine, research team engineer, AlienVault
Big Data - an easily digestible name for the emergence of commodity software designed to allow synchronous N-Dimensional Analytics - is quite the mouthful to anyone without a background in the data sciences. Data has always been big; an intrinsic side-effect of Moore's law can be expressed as "Utilization will always expand to fill capacity." No, the real nature of Big Data is big queries - the ability to ask questions of our data that were computationally infeasible before.
Ask anyone working in frontline security operations and analysis and they will tell you we've had Big Data for years - terabytes of logs to sift through in search of the single entry that delivers the smoking gun - and we'll regale you with stories of waiting hours, days even, for a search to return results. If Big Data were nothing more than a leap in query speed that finally kept pace with the volume of our vast data repositories, the average security analyst would be quite happy with that.
Big Data will become "The next big thing" - a critical re-evaluation and re-tooling of our analytical abilities. This is not about being able to query more data, but about being able to query all data. Beyond being able to 'grep' through log data faster lies the ability to distil everything we have ever recorded from our information systems into information pictures that no single human mind could perceive from the undistilled source material.
The convergence of data science with security analytics was not an overnight event, not least because data science was not a creation of the information security world to begin with. The path of convergence first ran through an overlapping field - fraud detection and investigation - where data analytics has for many years been a key driver in identifying what constitutes normal and abnormal patterns of activity. Anyone who has ever found their debit card locked out after a transaction they consider normal has seen data analytics in action, running into an edge case. These algorithms are refined over time, iteration by iteration, and their designers learn to ask ever more elegant questions of their datasets.
Big Data can achieve nothing by itself; it is merely an engine that enables the asking of better questions - questions that arise only through experience with real-world data. Expressing those questions programmatically to Big Data systems requires a set of technical skills that are covered only hastily in the current educational tracks for information security. If Big Data security is going to do more than keep buzzword-pace with the rest of the technology world, it will inevitably draw upon prior expertise from other fields. True, practitioners from those fields will have to acquire some of the experience and domain knowledge of the security field - a task I suspect will be far less challenging for people with a background in data science than the reverse will be for our current crop of security graduates.
The hubris of the information security field, in believing that it deals with entirely unique and unsolvable problems, may finally be seen in a new light as other domains of expertise come to accept that security is everyone's problem. Information security has matured - after two decades of relevance we should expect nothing less - but are we maturing along with it? Big Data was not our creation, and far more talent for asking the right questions of data exists outside our field. If this is our new normal, the core technology that drives all workflow and action, how are we going to address that in education, training and certification? Information security expertise requires experience and competence across a wide variety of information technology domains. How, then, will we address the incursion of a skill in which so few of us have more than cursory familiarity, before we find ourselves exclaiming "Help, a data scientist took my security job!"?