The dark side of AI

Could we one day see the benevolent AIs of the world matching wits with malicious machines?
Here’s what experts had to say… 

Remember those late-night cram sessions you suffered through in college whenever a big test was approaching? As stressful as those times were, they pale in comparison to the mental workout that IBM’s famous artificial intelligence engine Watson has been put through.

With the assistance of eight universities across the U.S., Watson last May enrolled in “cybersecurity school,” digesting thousands upon thousands of documents each month, collecting data that will eventually allow it to pass the ultimate test – helping security experts better comprehend emerging cyberthreats and stop them in their tracks.

Last November, Watson even began an “internship,” a beta program involving approximately 40 companies that are leveraging its capabilities in the real world. As part of the program, Watson will determine if an attack on a participant’s network involves a known piece of malware; if it does, Watson will offer actionable background on the threat.

“After that, Watson will be ready to graduate completely,” said Diana Kelley, global executive security adviser at IBM, in an interview with SC Media. (Update: Indeed, on Feb. 13, 2017, IBM announced the successful conclusion of its beta test and the general availability of its AI offering.)

When IBM originally trained Watson in medicine, the promise was clear: doctors could one day rely on the Jeopardy! champion to parse through millions of documents that medical professionals would never have time to review themselves. Watson could then use the data within to accurately diagnose patients based on their medical profiles and symptoms, and then recommend customized treatments.

The question is: what is the equivalent to this medical breakthrough in the cybersecurity world? Not just as it relates to Watson, but also to an array of other cutting-edge AI technologies breaking onto the scene.

The stakes are serious, as CISOs come to grips with the reality that human intelligence is drowning in threat data, log files and alerts. To stem the tide, manpower and machine power might just need to work together.

“You look at the skills shortage that we have right now… If you can have an analyst who was spending days trying to get educated about what a particular attack meant to the organization, and can now in hours or less get that at their fingertips, that’s really very powerful,” said Kelley.

“Humans make mistakes. They can become alert blind, and often have pressures other than being ‘right’ that inform their decisions,” said Ryan Permeh, founder and chief cyber scientist at Cylance, whose advanced threat protection offerings are built on an AI engine. “The scope of decisions that need to be made has long been past where we can find enough qualified humans to make them.”

“Having a machine that can consistently and correctly make decisions on behalf of an operator, and do so in real-time at scale, is imperative for the next generation of defenses,” Permeh continued, speaking with SC Media.

cognitive, with a little “c”

In some corners, AI may still seem like a far-flung concept born out of science fiction. But in cybersecurity it’s already a reality, even if it has a long way to go to reach its full potential.

Machine-learning tools are already being used to replace traditional threat detection software that no longer adequately defends against dynamic cyberattacks, whose patterns and indicators of compromise evolve too quickly for signature-based tools to keep up.

Rather than focus on attack signatures, these AI solutions look for anomalous network behavior, flagging when a machine goes rogue or if user activity or traffic patterns appear unusual. “A really simple example is someone with high privilege who attempts to get onto a system at a time of day or night that they never normally log in and potentially from a geolocation or a machine that they don’t log in from,” said Kelley.

Another example would be a “really rapid transfer of a lot of data,” especially if that data consists of the “corporate crown jewels.”

Such red flags allow admins to quickly catch high-priority malware infections and network compromises before they can cause irreparable damage.
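To make Kelley’s login example concrete, here is a minimal sketch of how such a check might look. It is not any vendor’s implementation: the `LoginProfile` record, the function names and the two-deviation escalation rule are all hypothetical, chosen only to illustrate comparing a new login against a user’s past habits.

```python
from dataclasses import dataclass, field


@dataclass
class LoginProfile:
    """Hypothetical record of one user's previously observed login habits."""
    usual_hours: set = field(default_factory=set)      # hours of day (0-23) seen before
    usual_locations: set = field(default_factory=set)  # geolocations seen before
    usual_machines: set = field(default_factory=set)   # source machines seen before


def login_red_flags(profile, hour, location, machine):
    """Return the ways in which a login departs from the user's usual habits."""
    reasons = []
    if hour not in profile.usual_hours:
        reasons.append("unusual hour")
    if location not in profile.usual_locations:
        reasons.append("unfamiliar geolocation")
    if machine not in profile.usual_machines:
        reasons.append("unfamiliar machine")
    return reasons


def is_high_priority(reasons, privileged):
    """Illustrative policy (an assumption): escalate when a privileged
    account deviates from its habits in two or more ways."""
    return privileged and len(reasons) >= 2


# A made-up administrator who normally works 8 a.m. to 6 p.m. from one laptop.
admin = LoginProfile(
    usual_hours=set(range(8, 19)),
    usual_locations={"US"},
    usual_machines={"laptop-17"},
)

flags = login_red_flags(admin, hour=3, location="RO", machine="kiosk-02")
print(flags, is_high_priority(flags, privileged=True))
```

A real product would learn these habits from traffic rather than have them typed in, and would weigh them statistically instead of applying a fixed rule.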

IBM calls this kind of machine learning “cognitive with a little ‘c’” – which the company was already practicing prior to Watson. Despite its diminutive designation, “little c” can have some big benefits for one’s network.

“A network, really in its simplest form, is a data set,” one that changes with every millisecond, said Justin Fier, director of cyber intelligence and analysis at U.K.-based cybersecurity company Darktrace, whose network threat detection solution was created by mathematicians and machine-learning specialists from the University of Cambridge. “With… machine learning, we can analyze that data in a more efficient way.”

“We’re not looking for malicious behavior, we’re looking for anomalous behavior,” Fier continued, in an interview with SC Media. “And that can sometimes turn into malicious behavior and intent, or it can turn into configuration errors or it could just be vulnerable protocols. But we’re looking for the things that just stand out.”

An advantage of these kinds of AI solutions is that they often run on unsupervised learning models – meaning they do not need to be fed scores of data in advance to help their algorithms define what constitutes a true threat. Rather, they tend to self-learn through observation, making note of which machines are defying typical patterns – a process that Fier said is the AI determining its own “sense of self” on the network.
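The idea of learning a baseline purely from observation can be sketched in a few lines. The example below is a deliberately simple statistical stand-in, not Darktrace’s method: it assumes each device’s hourly traffic volume is the only signal, and the device names, numbers and three-standard-deviation threshold are all hypothetical.

```python
from statistics import mean, stdev


def learn_baseline(history):
    """Summarize each device's observed traffic as (mean, standard deviation).
    No labels are needed: the baseline comes only from what was observed."""
    return {device: (mean(values), stdev(values)) for device, values in history.items()}


def is_anomalous(baseline, device, observed, threshold=3.0):
    """Flag a reading more than `threshold` standard deviations
    away from that device's own mean."""
    if device not in baseline:
        return True  # a never-before-seen device is itself unusual
    avg, spread = baseline[device]
    if spread == 0:
        return observed != avg
    return abs(observed - avg) / spread > threshold


# Hypothetical megabytes sent per hour during an observation period.
history = {
    "fingerprint-scanner": [1.0, 1.2, 0.9, 1.1, 1.0],
    "file-server": [400.0, 520.0, 480.0, 450.0, 510.0],
}
baseline = learn_baseline(history)

print(is_anomalous(baseline, "file-server", 470.0))          # typical for this device
print(is_anomalous(baseline, "fingerprint-scanner", 470.0))  # far outside its own norm
```

The same reading is judged differently for each device, which is the point of a per-machine baseline: nothing here says what an attack looks like, only what is out of character.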

While Fier said that basic compliance failures are the most commonly detected issue, he recalled one particular client that used biometric fingerprint scanners for security access, only to discover through anomaly detection that one of these devices had been connected to the Internet and subsequently breached.

To cover up his activity, the perpetrator modified and deleted various log files, but this unusual behavior was discovered as well. The solution even found irregularities in the network server that suggested the culprit moved fingerprint data from the biometric device to a company database, perhaps to establish an alibi. “My belief is that somebody on the inside was probably getting help from somebody on the outside,” said Fier, noting that it was a significant find because “insider threats are one of the hardest things to catch.”

Another client, Catholic Charities of Santa Clara County, an affiliate of Catholic Charities USA that helps 54,000 local clients per year, used anomaly detection to thwart an attempted ransomware attack only weeks after commencing a test of the technology. The solution immediately flagged the event after a receptionist opened a malicious email with a fake invoice attachment. “I was able to respond right away, and disconnected the targeted device to prevent any further encryption or financial cost,” said Will Bailey, director of IT at the social services organization.

Little “c’s” benefits extend beyond the network as well. Kelley cited the advent of application scanning tools that seek out problematic lines of code in websites and mobile software that could result in exploitation. And Fier noted a current Darktrace endeavor called Project Turing, whereby researchers are using AI to model how security analysts and investigators work in order to make their jobs more efficient.

From “cognitive” to “Cognitive”

Some technologists believe that to truly fulfill its promise, AI solutions must graduate from little “c” to what IBM calls “Cognitive with a big ‘C.’”

Such solutions will be able to comprehend a mix of both structured data (e.g. data plugged into relational databases and spreadsheets) and unstructured data, including text-heavy reports written in natural language, in order to make informed recommendations, diagnoses and even predictions.

Of course, this involves supervised training – a painstaking process during which AIs like Watson must process thousands of documents and essentially be taught how to contextualize terms as a human would. For instance, said Kelley, the term IP could mean “Internet Protocol” in one natural-language document and “intellectual property” in another – and Watson needs to tell the difference.

“Training correctly requires a very focused team dedicated to finding the ‘truth’ on a specific problem,” said Cylance’s Permeh. “Each problem has different approaches, but a few key elements are necessary. Having enough realistic data to train is very important, as is having enough to test. Having a deep enough understanding of your data in a way to create effective representations is necessary as well.”

With that said, training your AI to solve a problem won’t be very effective if you haven’t properly defined the problem in the first place. “Overly generalized or fuzzy problems get weak answers,” Permeh cautioned. “Complex real world problems rarely fit into simple models, and so AI systems that are overly simplistic fail in undefined ways.”

Derek Manky, global security strategist at Fortinet, similarly cited the need for ample, high-quality intelligence as a key challenge for AI programmers.

“Cyberthreat intelligence today is highly prone to false positives due to the volatile nature of the Internet of Things,” Manky told SC Media. “Threats can change within seconds, a machine can be clean one second, infected the next, and back to clean again full cycle in very low latency. Enhancing the quality of threat intelligence is critically important as we pass more control to artificial intelligence to do the work that humans otherwise would do.”

Despite the laborious prep work that building an AI platform entails, many believe the end result is worth the Herculean effort.

Indeed, a study published in December 2016 by Recorded Future offered a tantalizing glimpse of AI’s future. The threat intelligence firm developed a supervised machine learning model that is able to predict future cybercriminal activity on certain IP addresses by combining historical data from threat lists and other technical intelligence with current-day information gleaned from open-source intelligence (OSINT) sources, including reports of neighboring IP addresses that exhibit malicious behavior.

In a 2016 trial of this cognitive learning technology – also known as a support vector network – more than 25 percent of 500 previously unseen IP addresses that the AI flagged as risky ultimately turned up reported by open-source intelligence as malicious within seven days.

For instance, Recorded Future’s predictive model flagged the IP address 88.249.184.71 with a high-risk score on Oct. 4. It took until Oct. 14 – a full 10 days later – before that address finally appeared on a threat list as the host of a command-and-control server linked to the DarkComet remote access trojan.

A second study that looked at historical IP address data covering the entire IPv4 space was able to predict 74 percent of future threat-listed IPs while maintaining a 99 percent precision rate.
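The two percentages in that result measure different things, and a short calculation shows how they relate. The sketch below assumes the “74 percent” figure is what statisticians call recall (the share of eventually threat-listed IPs that the model caught) and uses made-up counts chosen only to land near the reported figures; they are not Recorded Future’s data.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision: of everything flagged, how much was really bad.
    Recall: of everything really bad, how much was flagged."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


# Hypothetical scenario: 10,000 IPs later end up on threat lists.
# The model flagged 7,400 of them in advance, missed 2,600,
# and also flagged 75 addresses that never turned out to be malicious.
precision, recall = precision_recall(
    true_positives=7400, false_positives=75, false_negatives=2600
)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

High precision is what makes automatic blocking plausible: with few false alarms, acting on a flag rarely cuts off a legitimate address, even though roughly a quarter of future threats still slip past.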

“The predictions we make are good enough that… you may want to use that information to automatically block those addresses in your firewall,” said Staffan Truve, co-founder and CTO of Recorded Future, in an interview with SC Media.

If such foreknowledge is possible, then one can’t help but wonder what other exciting breakthroughs AI is capable of in the cybersecurity space.

To that end, Truve did some predicting of his own, claiming that as the quality and quantity of historical datasets increase, AI will one day be able to prognosticate which cyber targets are most likely to be attacked, and what vulnerabilities are likely to be exploited.

“Predictive is something that CISOs really, really want to get to as we [develop] more advanced analytics,” said Kelley. “Not just getting an alert… but to be able to predict, ‘Hey, this employee may be about to go rogue,’ two weeks before they go rogue.”

Truve can also foresee AI helping create self-healing systems that don’t just recognize that an anomaly has occurred, but also know how to repair themselves. “The systems of the future need to be able to diagnose themselves and understand if they have been manipulated,” so they can choose the best course of action, he said.

Of course, in a simple sense, self-healing technology is already here: When a machine on a network is acting abnormally, some threat detection solutions are programmed to automatically perform mitigation through limited, preapproved actions. Rather than immobilize an entire company server, it might shut down the one troublesome endpoint, stopping a potential malware infection without impacting network productivity.
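The “limited, preapproved actions” idea amounts to an allowlist: the system may act alone only where someone has signed off in advance, and hands everything else to a person. The following is a minimal sketch under that assumption; the asset types, action names and `choose_response` function are hypothetical and not drawn from any product.

```python
# Hypothetical allowlist: what the system may do without asking a human.
# Servers are deliberately absent, so they are never shut down automatically.
PREAPPROVED_ACTIONS = {
    "endpoint": "isolate_endpoint",
}


def choose_response(asset_type, asset_name):
    """Return the automatic action for an abnormal asset, or escalate
    to a human analyst when no action has been preapproved."""
    action = PREAPPROVED_ACTIONS.get(asset_type)
    if action is None:
        return ("escalate_to_analyst", asset_name)
    return (action, asset_name)


print(choose_response("endpoint", "pc-114"))  # handled automatically
print(choose_response("server", "db-01"))     # left for a person to decide
```

Keeping the allowlist short is the design choice that matters here: the one troublesome endpoint can be cut off in seconds, while anything that could disrupt the wider business still waits for a human.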

But as “Big Cognitive” evolves and becomes more reliable, will users be willing to remove the reins and let networks fully defend themselves?

The 2016 Def Con conference in Las Vegas offered the world a sneak preview of this scenario when it hosted the DARPA Cyber Grand Challenge, where the winning programmers from team ForAllSecure created a fully automated cybersecurity defense system capable of reverse-engineering an unknown binary.

“In the future, AI in cybersecurity will constantly adapt to the growing attack surface,” said Manky. “Today, we are connecting the dots, sharing data, and applying that data to systems,” but eventually, “a mature AI system could be capable of making decisions on its own. Complex decisions.”

Still, Manky cautioned that 100 percent automation is not an attainable aspiration. “Humans and machines must co-exist,” he noted. Indeed, many experts prefer to let human security analysts have the final word.

Finally, it would be almost impossible to examine the future of machine learning without taking a moment to ponder perhaps the biggest cybersecurity holy grail of all: attribution. SC Media asked the experts in this feature if they could see a future in which AI provides investigators with the helping hand desperately needed to unearth hidden clues in code and confirm, with near certainty, the culpable hacking group’s identity.

Perhaps Truve answered best: “Algorithms should be able to do attribution as good as humans are doing it,” he said. Then again, he laughed, that’s not a very high bar.