10 Cyber Attacks Machine Learning Can Help Prevent

Not even Cersei Lannister's scheming or Ser Jorah's father-like protectiveness could have prevented attackers from breaching HBO's network and stealing 1.5 terabytes of data (including unreleased Game of Thrones episodes). Machine learning, however, might have offered a sounder defense of HBO's virtual fortress.

Artificial intelligence (AI) and machine learning (ML) are the topics of much debate, especially within the cybersecurity community. Is machine learning the next big security frontier? Is AI ready to take on machine learning-driven attacks? Is AI even ready for use, generally speaking? No matter your conviction about whether machine learning is cybersecurity's savior, two things remain true: There is a place for analytics in security, and there are specific use cases where machine learning represents the best answer we have today.

Despite reports of hackers' “sophisticated” intrusion methods, it is likely that the hacker or group of hackers who oh so cleverly go by the moniker “little.finger66” used a commonly seen attack vector to penetrate the small-screen giant's system.

The following use cases are not exhaustive. They represent common security threats that affect every business. Machine learning may or may not be cybersecurity's panacea, but it can certainly help in these scenarios.

Use Case 1: Spear Phishing

Phishing campaigns are some of the most common and successful attack vectors today. These attacks take advantage of individuals' familiarity with communication tools like social media and email to send unwitting recipients malicious content via attachment or link. The effectiveness of this attack relies on the ability of attackers to mislead end users into clicking links or downloading malicious payloads, thereby bypassing internal controls. The recent addition of destructive and ransomware payloads makes this attack even more serious.

Organizations can detect these threats by capturing metadata from emails, without compromising users' privacy. By looking at email headers and a subsampling of body data, machine learning algorithms can learn to identify patterns that reveal malicious senders' emails. By extracting and labeling these micro behaviors, we can train our models to detect if a phishing attempt has occurred. Machine learning tools can build, over time, a graph based on the trustworthiness of senders.
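
As a rough illustration of the sender-trust idea, here is a minimal Python sketch. It assumes emails have already been labeled benign or malicious (by analysts or another model), and it keeps a smoothed trust score per sender. The class name SenderTrust, its methods, and the prior values are hypothetical choices made for this example, not part of any particular product.

```python
from collections import defaultdict


class SenderTrust:
    """Keeps a smoothed trust score per sender from labeled email metadata.

    Hypothetical helper for illustration only. The priors act like
    "virtual" observations so that unknown senders start at 0.5.
    """

    def __init__(self, prior_good=1.0, prior_bad=1.0):
        self.prior_good = prior_good
        self.prior_bad = prior_bad
        # sender address -> [benign count, malicious count]
        self.counts = defaultdict(lambda: [0, 0])

    def observe(self, sender, malicious):
        """Record one labeled email from this sender."""
        self.counts[sender][1 if malicious else 0] += 1

    def trust(self, sender):
        """Return a score in (0, 1); higher means more trustworthy."""
        good, bad = self.counts.get(sender, (0, 0))
        total = good + bad + self.prior_good + self.prior_bad
        return (good + self.prior_good) / total
```

A real deployment would feed in far richer features (header anomalies, reply-to mismatches, link domains), but the same pattern of accumulating evidence per sender over time applies.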

Use Case 2: Watering Hole

Built similarly to phishing attacks, watering holes appear to be legitimate websites or web applications. However, these sites or apps are either real and have been compromised, or they are fake sites or apps designed to lure unsuspecting visitors into entering personal information. This attack also relies in part on the ability of attackers to mislead users and to effectively serve exploits.

Machine learning can help organizations benchmark web application services by analyzing data like path/directory traversal statistics. Algorithms that learn over time can identify interactions that are characteristic of attackers or of malicious websites and apps. Machine learning can also monitor for rare or unusual redirect patterns to and from the site's host, as well as unusual referrer chains, all of which are typical risk indicators.

Use Case 3: Lateral Movement

Rather than one specific type of attack, lateral movement attack vectors represent an attacker's movement across a network as they look for vulnerabilities and apply different techniques to exploit those vulnerabilities. Lateral movement is particularly indicative of risk escalation along the kill chain — an attacker's movement from reconnaissance to data extraction — especially when attackers move from low-level users' machines to those of more important personnel (who have access to valuable data).

Network traffic input logs can tell you a lot about visitors' interactions with a website. Machine learning-informed contextualization of this data can offer a dynamic view of normal traffic data. With a better understanding of typical traffic flow, algorithms can perform change-point detection (i.e., they can identify instances when the probability distribution of a given traffic pattern changes and becomes unlikely based on “normal” traffic activity) to detect potential threats.
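
To make change-point detection concrete, here is a minimal sketch using a one-sided CUSUM detector, one common technique among several. It assumes traffic has been summarized as a numeric series (for example, connections per minute) and that the first baseline_len points represent "normal" activity. The function name and the slack and threshold defaults are illustrative assumptions, not tuned values.

```python
from statistics import mean, stdev


def cusum_change_point(series, baseline_len, slack=0.5, threshold=5.0):
    """Return the index of the first detected upward shift, or None.

    The first `baseline_len` points define "normal" traffic. Later points
    are standardized against that baseline, and deviations above `slack`
    accumulate until the running sum exceeds `threshold`.
    """
    baseline = series[:baseline_len]
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        sigma = 1.0  # assumption: treat a flat baseline as unit variance

    running = 0.0
    for i in range(baseline_len, len(series)):
        z = (series[i] - mu) / sigma
        # Accumulate only evidence of an increase; never drop below zero.
        running = max(0.0, running + z - slack)
        if running > threshold:
            return i
    return None
```

This version only flags increases. Detecting drops in traffic would need a second, mirrored accumulator, and production systems typically update the baseline as normal behavior drifts.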

Use Case 4: Covert Channel Detection

Attackers using covert channels transfer information via channels that are not intended for communication. Using covert channels allows attackers to maintain control of compromised assets and to use tactics that allow for the execution of attacks over time, undetected.

Attacks using covert channels often depend on visibility of all domains across a given network. Machine learning technology can ingest and analyze statistics about rare domains. With this information, security operations teams can more easily work to obscure attackers' visibility. Without a holistic view of the network they intend to attack, it becomes more difficult for cybercriminals to keep their attacks moving forward along the kill chain.
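
One simple rare-domain statistic is the number of distinct internal hosts that contact each domain. The sketch below assumes DNS or proxy logs have been reduced to (host, domain) pairs; the function name rare_domains and the max_hosts cutoff are hypothetical. Rarity alone does not prove a covert channel, so results like these are best treated as leads for an analyst or as one feature in a larger model.

```python
from collections import defaultdict


def rare_domains(dns_log, max_hosts=1):
    """Return domains contacted by at most `max_hosts` distinct hosts.

    `dns_log` is an iterable of (host, domain) pairs. Domains are
    lowercased so that case differences do not split the counts.
    """
    hosts_by_domain = defaultdict(set)
    for host, domain in dns_log:
        hosts_by_domain[domain.lower()].add(host)

    return sorted(
        domain
        for domain, hosts in hosts_by_domain.items()
        if len(hosts) <= max_hosts
    )
```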

Use Case 5: Ransomware

Ransomware acts the way the name sounds. It is malware that encrypts (or, in some variants, wipes) drives and holds infected devices and machines for ransom in exchange for the decryption key. This form of cyberattack either withholds the information until the user pays for the key or, in some cases, threatens to publish the user's personal information if they do not pay the ransom.

Ransomware presents a challenging use case because the attacks often leave network activity logs with a dearth of evidence. Machine learning technology can help security analysts track micro-behaviors associated with ransomware, such as entropy statistics or processes that interact with the entire file system in question. Organizations can focus machine learning algorithms on the initial infection payload in an attempt to identify these shards of evidence.
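
The entropy statistic mentioned above can be computed directly. This sketch calculates the Shannon entropy of a byte string in bits per byte: encrypted data tends to sit close to the maximum of 8, while ordinary text is much lower. The 7.5 cutoff in looks_encrypted is an assumption for illustration; compressed files also score high, so entropy is only one signal among several.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum(
        (count / total) * math.log2(count / total)
        for count in counts.values()
    )


def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
    """Flag data whose entropy is at or above `threshold` (hypothetical cutoff)."""
    return shannon_entropy(data) >= threshold
```

A process that rewrites many files in quick succession, each moving from low to high entropy, is the kind of micro-behavior a model could learn to treat as suspicious.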

Use Case 6: Injection Attacks

The Open Web Application Security Project (OWASP) lists injection attacks as the number one Most Critical Web Application Security Risk. (Note: The current version of the OWASP Top-10 has been rejected, and the organization has reopened a data call and survey for security professionals.) Injection attacks allow attackers to supply malicious input to a program. For instance, attackers will enter a line of code into an input field that, when executed by the database, modifies data on a website.

Database logs are another source of information that can help identify potential attacks. Organizations can implement machine learning algorithms to build statistical profiles of groups of database users. Over time, the algorithms learn how these groups access individual applications in the enterprise and learn to spot abnormalities in those access patterns.
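
A statistical profile of database user groups can start as simply as counting which tables each group touches. The sketch below assumes database logs have been reduced to (group, table) pairs. The function names and the min_share cutoff are hypothetical, and a real system would likely also profile query types, volumes, and timing.

```python
from collections import Counter, defaultdict


def build_profiles(history):
    """Count table accesses per user group from (group, table) pairs."""
    profiles = defaultdict(Counter)
    for group, table in history:
        profiles[group][table] += 1
    return profiles


def is_unusual(profiles, group, table, min_share=0.05):
    """Return True if `table` makes up less than `min_share` of the
    group's historical accesses, or if the group has no history at all."""
    counts = profiles.get(group)
    if not counts:
        return True
    share = counts[table] / sum(counts.values())
    return share < min_share
```

For example, if a marketing group's history consists almost entirely of reads against a campaigns table, a sudden query against a payroll table would fall below the share cutoff and be flagged for review.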

Use Case 7: Reconnaissance

Before launching an attack, hackers perform extensive reconnaissance on a target or group of targets. Reconnaissance includes probing networks for vulnerabilities. Attackers will conduct recon at the perimeter of a network or within the local-area network (LAN). Typical reconnaissance detection involves signature-matching technology that hunts through network activity logs for repeated patterns that might represent malicious behavior. However, signature-based detection will often set off a string of noisy, false alarms.

Machine learning can be a proverbial compass for the topology of network data. Trained algorithms can develop a graph of this topology to identify the spread of new patterns more quickly than signature-based methods. Implementing machine learning can also reduce the number of false positives, allowing security analysts to spend time addressing the alarms that actually matter.

Use Case 8: Webshell

Webshells, as defined by the United States Computer Emergency Readiness Team (US-CERT), are “script[s] that can be uploaded to a web server to enable remote administration of the machine.” Via remote administration, attackers can initiate processes like database data dumps, file transfers, and malicious software installation.

Attackers who use webshells often target backend eCommerce platforms, through which they go after shoppers' personal information. Machine learning algorithms can focus on statistics of normal shopping cart activity and then help identify outliers or behaviors that shouldn't be occurring with such frequency.

Use Case 9: Credential Theft

A few high-profile attacks, including virtual private network (VPN) compromises, have been the result of credential theft. Credential theft is often accomplished using tactics such as phishing or watering holes, whereby attackers extract login credentials from victims in an attempt to access sensitive information an organization maintains.

Internet users (consumers) often leave behind login patterns. Websites and applications can track locations and login times. Machine learning technology can track those patterns and the data that make them up to learn which user behavior is normal and which represents potentially harmful activity.
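
As a toy version of login-pattern tracking, the sketch below compares a new login against a user's history of (country, hour) pairs. It flags logins from a country the user has never logged in from, or at an hour far from any previous login, measuring hours around the clock so that 23:00 and 00:00 count as adjacent. The function names and the two-hour tolerance are assumptions for illustration; a learned model would weigh these signals rather than apply hard rules.

```python
def hour_distance(a, b):
    """Distance between two hours of the day on a 24-hour circle."""
    diff = abs(a - b) % 24
    return min(diff, 24 - diff)


def is_suspicious_login(history, country, hour, hour_tolerance=2):
    """Return True if the login does not match the user's past behavior.

    `history` is a list of (country, hour) pairs from earlier logins.
    A user with no history is treated as suspicious by default.
    """
    if not history:
        return True
    known_countries = {past_country for past_country, _ in history}
    near_usual_hour = any(
        hour_distance(hour, past_hour) <= hour_tolerance
        for _, past_hour in history
    )
    return country not in known_countries or not near_usual_hour
```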

Use Case 10: Remote Exploitation

Finally, many attack patterns utilize remote exploitation. These attacks often operate via a series of malicious events that target a system to identify vulnerabilities then deliver a payload (like malicious code) to exploit the vulnerability. Once the attack drops the payload, it executes code within the system.

Machine learning can analyze system behavior and identify instances in which sequential behavior does not correlate with typical network behavior. Algorithms that have learned over time can then alert security analysts about the likely delivery of an exploitation payload.
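
One simple way to model sequential behavior is to learn which event-to-event transitions occur during normal operation and then flag transitions that have never been seen. The sketch below assumes system activity has been reduced to ordered lists of event names; the event names and function names are hypothetical. It is a first-order transition model, a deliberately basic stand-in for the richer sequence models a production system might use.

```python
from collections import Counter, defaultdict


def learn_transitions(sequences):
    """Count how often each event is followed by each other event."""
    transitions = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            transitions[prev][nxt] += 1
    return transitions


def unseen_transitions(transitions, seq):
    """Return the (prev, next) pairs in `seq` never observed in training."""
    return [
        (prev, nxt)
        for prev, nxt in zip(seq, seq[1:])
        if transitions.get(prev, {}).get(nxt, 0) == 0
    ]
```

If normal sessions always go from "login" to "read_file" to "logout", a session that jumps from "login" to "spawn_shell" would surface as an unseen transition worth an analyst's attention.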

The Discussion Doesn't End With Machine Learning, but It Should Start With It

Accurate cybersecurity analytics systems must be the cornerstone of modern security operations centers. But developing accurate analytics is impossible without data samples. Security teams that adopt a machine learning mentality, as well as implement machine learning technology, can more quickly address the various types of attacks discussed above. Machine learning, or any technology for that matter, will never be the be-all and end-all of any industry. It does, however, offer an alternative, open-source philosophy for identifying and dealing with cyberattacks that can improve upon many of the methods currently in use.