Machine learning can protect companies from phishing, mobile threats, and plant breakdowns

Scientists have been studying the capabilities of computer intelligence since the 1950s. Over the last 70 years, machine learning (ML) has developed from a theory into a technology used in the wild that lets programs make decisions autonomously, reducing the amount of manual work needed.

Security pros use machine learning to boost and automate malware detection, among other things. We also use machine learning against advanced email phishing. Sophisticated, carefully prepared phishing emails are an effective way to trick a specific organization or user. Attackers disguise these messages as emails from new online services and exploit popular events, even the pandemic. In the first quarter of 2020, many emails circulated asking recipients to transfer money to help combat COVID-19. With the business email compromise technique, criminals gain employees’ trust through email correspondence. They pose as a third party, a contractor, or even a colleague and get targets to do what they want.

To protect users from such attacks, a security solution should quickly analyze all parameters of the email, including its content and technical characteristics, to determine whether it is malicious. Machine learning can take care of this.

In this case, we should use two machine learning models. The first model automatically analyzes the technical parameters of emails, such as technical headers. It draws on hundreds of millions of metadata records from real emails and learns to recognize the combinations of technical traces that indicate an email is malicious. On its own, however, this is not enough information to reach a verdict.

The second model detects the malicious nature of an email based on its content. To achieve the desired emotional effect, attackers use emotive language and a clear call to action in their text, something like: “your parcel couldn't be delivered, update your data here.” The model recognizes the words and phrases typical of phishing emails.

The results of the two models are then correlated to make the final decision on whether or not it’s a phishing email, saving the user from opening it.
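The two-model approach described above can be sketched roughly as follows. This is a toy illustration, not the production system: the header signals, the phrase list, the equal weights, and the 0.6 threshold are all assumptions chosen for the example, and simple rules stand in for the trained models.

```python
# Toy stand-ins for the two models; real models would be trained on data.
# All signals, phrases, weights and thresholds here are assumptions.

SUSPICIOUS_PHRASES = (
    "update your data",
    "couldn't be delivered",
    "transfer money",
)


def header_score(headers: dict) -> float:
    """Score technical metadata in [0, 1] using two assumed signals."""
    score = 0.0
    reply_to = headers.get("Reply-To")
    # Assumed signal: replies are routed to a different address than the sender.
    if reply_to and reply_to != headers.get("From"):
        score += 0.5
    # Assumed signal: the sender failed an SPF check.
    if "spf=fail" in headers.get("Authentication-Results", "").lower():
        score += 0.5
    return score


def content_score(body: str) -> float:
    """Score the text in [0, 1] by counting phrases typical of phishing."""
    text = body.lower()
    hits = sum(phrase in text for phrase in SUSPICIOUS_PHRASES)
    return min(1.0, hits / 2)


def is_phishing(headers: dict, body: str, threshold: float = 0.6) -> bool:
    """Correlate both scores; neither one alone can reach the threshold."""
    combined = 0.5 * header_score(headers) + 0.5 * content_score(body)
    return combined >= threshold
```

With equal weights and a threshold above 0.5, an email with suspicious headers but harmless text is not flagged, which mirrors the point that the metadata alone is not enough for a verdict.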

Machine learning against mobile threats for Android

In 2020, Kaspersky researchers detected 2 million more mobile threats than in the previous year, bringing the total to more than 5 million. One of the key tasks in mobile protection is to guard against unknown malicious objects that have only recently appeared in the wild.

On iOS devices, it’s only possible to install apps for a wide audience from the Apple App Store. On Android devices, users can install apps from a variety of sources and app markets. Unfortunately, cybercriminals sometimes exploit this by distributing malware disguised as games, useful software, porn, and so on. To detect these threats quickly and effectively, security teams need machine learning.

A machine learning agent on a user’s device scans every app as it’s downloaded for specific features, such as the access permissions it requires or the number and size of its internal structures. This metadata is sent to a cloud-based ML model, which decides whether this set of parameters marks the app as malicious. The model then sends back its verdict, and the protection product on the device decides whether to block the app’s download and installation.

This type of machine learning analysis requires a lot of computing resources, much more than a mobile device has available, and that’s why security teams leverage the cloud for this process.
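As a rough sketch of this split between device and cloud, the example below extracts a small metadata record on the device and passes it to a function that stands in for the cloud model. Everything named here is hypothetical: the AppMetadata fields, the list of risky permissions, and the rule inside cloud_verdict are assumptions for illustration, and a real deployment would send the record over the network to a trained model.

```python
from dataclasses import dataclass, asdict


@dataclass
class AppMetadata:
    """Hypothetical features collected by the on-device agent."""
    permissions: tuple
    internal_structure_count: int
    total_size_bytes: int


# Assumed examples of permissions that are often abused by mobile malware.
RISKY_PERMISSIONS = {"SEND_SMS", "READ_SMS", "BIND_ACCESSIBILITY_SERVICE"}


def cloud_verdict(metadata: dict) -> str:
    """Stand-in for the cloud-based model: a simple rule, not a trained model."""
    risky = RISKY_PERMISSIONS.intersection(metadata["permissions"])
    return "malicious" if len(risky) >= 2 else "clean"


def should_block(app: AppMetadata) -> bool:
    """Device side: send only metadata, then act on the returned verdict."""
    verdict = cloud_verdict(asdict(app))
    return verdict == "malicious"
```

Only the lightweight metadata leaves the device, so the heavy classification work stays in the cloud while the blocking decision stays with the product on the device.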

Machine learning can prevent plant breakdowns

Equipment malfunctions, misconfigurations, human error, or hacker attacks can all cause the breakdown of industrial machinery. If any of these happens, it’s better to detect the deviation in production processes as soon as possible; otherwise, an incident can spiral out of control.

The early symptoms of an incident are virtually impossible to detect by threshold monitoring or human operators. When thousands of telemetry readings come in every second, even an experienced operator can focus on only a few patterns and will overlook the rest.

Here’s where machine learning for anomaly detection (MLAD) comes in. A neural network can analyze a massive amount of telemetry data, absorb all aspects of the machine’s operation, and thoroughly learn how the machine behaves under normal conditions, such as how the signals change over time and how they correlate with each other.

Once the ML model is trained, it switches to anomaly detection mode. It then receives telemetry in real time, and if the divergence between what the model expects and what it observes rises above a certain threshold, the software deems the machine’s behavior anomalous and raises an alarm. The model gives an early warning of attacks, malfunctions, or mismanagement before any other instrument can spot the problem. This way, it helps to minimize damage and prevent a plant’s breakdown.
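The train-then-detect cycle can be illustrated with a much simpler model than a neural network. The sketch below swaps in a per-signal statistical baseline (mean and standard deviation learned from normal telemetry) and measures divergence as the largest z-score across signals. The class name, the signal names in the usage example, and the threshold of 4.0 are assumptions, and unlike the neural network described above, this baseline does not capture how signals correlate with each other.

```python
from statistics import mean, stdev


class BaselineDetector:
    """Learns normal behavior per signal, then flags large deviations."""

    def __init__(self, threshold: float = 4.0):
        self.threshold = threshold  # assumed alarm threshold, in standard deviations
        self.stats = {}             # signal name -> (mean, standard deviation)

    def fit(self, normal_readings: list) -> None:
        """Training mode: learn each signal's mean and spread from normal data."""
        for name in normal_readings[0]:
            values = [reading[name] for reading in normal_readings]
            # Guard against a zero spread for constant signals.
            self.stats[name] = (mean(values), max(stdev(values), 1e-9))

    def divergence(self, reading: dict) -> float:
        """Largest deviation from the learned baseline across all signals."""
        return max(
            abs(reading[name] - mu) / sigma
            for name, (mu, sigma) in self.stats.items()
        )

    def is_anomalous(self, reading: dict) -> bool:
        """Detection mode: raise an alarm when divergence exceeds the threshold."""
        return self.divergence(reading) > self.threshold


# Usage with hypothetical temperature and pressure signals.
detector = BaselineDetector()
detector.fit([
    {"temp": 70.0, "pressure": 1.00},
    {"temp": 71.0, "pressure": 1.10},
    {"temp": 69.0, "pressure": 0.90},
    {"temp": 70.0, "pressure": 1.00},
    {"temp": 72.0, "pressure": 1.05},
    {"temp": 68.0, "pressure": 0.95},
])
normal_now = detector.is_anomalous({"temp": 70.5, "pressure": 1.02})
overheating = detector.is_anomalous({"temp": 95.0, "pressure": 1.02})
```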

Machine learning against advanced cyberattacks

In some cases, security teams can use machine learning techniques to complement human intelligence against advanced threats, such as in managed detection and response (MDR) services.

Within an MDR service, an external security operations center (SOC) helps business customers respond to advanced cyberattacks. It receives alerts from the customer’s endpoints and investigates them to find traces of attacks, which it then reports back to the customer along with response actions. SOC experts analyze some threat samples manually, but given the scale, they physically cannot look at each and every alert.

Machine learning can take on this burden. It automatically filters out alerts of no interest to SOC analysts, sets alert importance levels, and gives hints for analysis, saving analysts’ capacity and minimizing response times.

In training mode, the model analyzes alerts and scores them. The higher the score, the greater the probability that the alert should be reviewed by experts. Alerts with scores above a certain threshold are sent to SOC analysts, who label them manually and so enrich the training data for the ML model.

In combat mode, the model resolves some alerts and prioritizes the rest for manual processing: those with the highest scores are put at the head of the queue. This queue strategy reduces the average processing time of alerts and helps the service deliver on its service-level agreement (SLA).
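The queue strategy can be sketched with a priority queue. In this minimal example the scores are assumed to come from an already trained model, and the cut-off below which an alert is resolved automatically (0.2) is an illustrative assumption, as are the function and parameter names.

```python
import heapq


def triage(alerts, auto_resolve_below=0.2):
    """Split scored alerts into auto-resolved ones and a prioritized queue.

    alerts: iterable of (alert_id, score) pairs, score in [0, 1].
    Returns (resolved_ids, queue_order), where queue_order lists the
    remaining alert ids with the highest score first.
    """
    resolved = []
    heap = []
    for alert_id, score in alerts:
        if score < auto_resolve_below:
            # Low-scoring alerts are closed without analyst involvement.
            resolved.append(alert_id)
        else:
            # heapq is a min-heap, so negate the score to pop the highest first.
            heapq.heappush(heap, (-score, alert_id))
    queue_order = [heapq.heappop(heap)[1] for _ in range(len(heap))]
    return resolved, queue_order


resolved, queue_order = triage([("a", 0.10), ("b", 0.90), ("c", 0.50), ("d", 0.05)])
```

Here alerts "a" and "d" are resolved automatically, while "b" reaches analysts ahead of "c" because of its higher score.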

These are just a few interesting cases of how machine learning serves cybersecurity goals, and we believe the field will continue to expand. Developing ML techniques in products can make cyber protection more intelligent, faster, and more efficient.

Randy Richard, vice president of enterprise sales, Kaspersky