Artificial intelligence and machine learning often appear in the news these days. Several articles have already been devoted to the fact that machine learning models can be deceived, to the limitations of deep learning, and to ways of hacking neural networks. This article looks at machine learning from the point of view of computer security.
1. What Does Computer Security Have to Do with Machine Learning?
By machine learning, we usually mean algorithms and mathematical models that are capable of learning and acting without human intervention and of progressively improving their performance. In computer security, various machine learning methods have long been used in spam filtering, traffic analysis, fraud prevention, and malware detection.
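To make the first of these applications concrete, here is a minimal, purely illustrative spam-filter sketch in Python using scikit-learn. The messages and labels are invented for the example; a real filter would be trained on large labeled mail corpora.

```python
# A minimal sketch of ML in a security task: a tiny spam filter built with
# scikit-learn. The example messages and labels are made up for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data: 1 = spam, 0 = legitimate mail (hypothetical examples)
messages = [
    "Win a free prize now, click here",
    "Cheap pills, limited offer",
    "Meeting moved to 3pm, see agenda attached",
    "Your build passed, logs are in the CI dashboard",
]
labels = [1, 1, 0, 0]

# Bag-of-words features plus a naive Bayes classifier
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)

# The model generalizes to messages it has never seen before
print(model.predict(["Click here for a free offer"]))    # likely [1]
print(model.predict(["Agenda for tomorrow's meeting"]))  # likely [0]
```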
In a sense, this is a game in which every move you make provokes a reaction from the adversary. Therefore, while playing this game, one has to constantly update and correct models, feed them new data, or even replace them completely.
For example, antiviruses rely on signature analysis, heuristics, and manual rules that are difficult to maintain and extend; the security industry still argues about the real benefit of antivirus, and many consider it a dead product. Attackers bypass all of these defense mechanisms, for example, by means of obfuscation and polymorphism.
As a result, preference will increasingly be given to tools that use more intelligent techniques, such as machine learning methods. These can identify new features automatically, process large amounts of information quickly, generalize, and make fast and correct decisions.
It is important to note that, on the one hand, machine learning can be used for protection, and on the other hand, it can also be used for more intelligent attacks.
2. Vulnerabilities of Machine Learning Models
For any algorithm, it is very important to select all the necessary parameters carefully. For a machine learning algorithm, the data on which it is trained becomes crucial too. Ideally, there should be enough data for good training, but this rarely happens in real life.
When people say that a system or service is secure, they usually mean that it is impossible to violate the security policy within a given threat model. Today, a huge number of services operate on the basis of data analysis algorithms, so the risks are hidden not only in vulnerable functionality but also in the data itself, on the basis of which the system can make wrong decisions.
The quality of a trained model is usually assessed by the accuracy with which it can classify data that the model has not “seen” before. For now, most quality assessments do not take into account harmful conditions (adversarial settings), which often go beyond the expected set of input data.
The overall system accuracy can be assessed based on the average number of correct and wrong decisions, while a security assessment should take into account the worst results. From the security assessment point of view, we are interested in the cases where the system is mistaken. Accordingly, our task is to find as many input vectors as possible that produce incorrect results.
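As a rough illustration of the difference between these two views, here is a small sketch that contrasts average accuracy with the subset of inputs an attacker actually cares about. The data is synthetic and the classifier is a simple scikit-learn model, both chosen only for the example.

```python
# A sketch contrasting the usual average-accuracy view with a worst-case
# (security-oriented) view of the same classifier. Data here is synthetic.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Average-case quality: overall accuracy on held-out data
print("average accuracy:", clf.score(X_test, y_test))

# Security-oriented view: an attacker only needs the inputs the model gets
# wrong. Here we simply collect the misclassified test points; a real attacker
# would search for (or deliberately craft) such inputs.
pred = clf.predict(X_test)
mistakes = X_test[pred != y_test]
print("misclassified test points an attacker could reuse:", len(mistakes))
# On this subset the model's accuracy is 0% by construction; a security
# evaluation cares about how easy such inputs are to find or craft.
```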
3. Is it Possible to Manipulate a Machine Learning Model to Conduct a Targeted Attack?
If you understand how the system learns, you can plan an attack and feed the system pre-prepared data. Search engine analysis and optimization is a good example of such an approach. SEO specialists test and learn how intelligent search engine algorithms work and manipulate their website's data to rank higher in search results. The security problem, in this case, appears when web surfers who trust Google or other search engines start clicking on the first result, which can be a phishing website stealing sensitive information.
For another example, let's see how biometric systems can be deceived. Such systems need to gradually update their parameters as small changes in a person's appearance take place (natural aging). In this case, that is absolutely natural and necessary functionality. Using this property of the system, attackers can prepare false data beforehand and submit it to the biometric system, nudging the model until its parameters match a completely different person.
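A toy sketch of this drift is shown below. The two-dimensional "embeddings", the threshold, and the update rule are all invented stand-ins for a real biometric pipeline; the point is only to show how small accepted updates can accumulate.

```python
# A toy sketch of how incremental template updates in a biometric system can
# be abused. Embeddings are stand-in 2-D vectors; a real system would use
# high-dimensional face or voice embeddings. All values are invented.
import numpy as np

legitimate_user = np.array([1.0, 1.0])   # enrolled template
attacker = np.array([5.0, 5.0])          # target the attacker wants to reach
template = legitimate_user.copy()

ALPHA = 0.2        # how strongly each accepted sample updates the template
THRESHOLD = 1.5    # maximum distance at which a sample is accepted

def try_authenticate(sample, template):
    return np.linalg.norm(sample - template) < THRESHOLD

# The attacker submits samples that are always "just similar enough" to the
# current template, dragging it step by step toward the attacker's biometrics.
for step in range(20):
    direction = attacker - template
    crafted = template + 0.9 * THRESHOLD * direction / np.linalg.norm(direction)
    if try_authenticate(crafted, template):
        template = (1 - ALPHA) * template + ALPHA * crafted  # incremental update

print("final template:", template)
print("attacker now accepted:", try_authenticate(attacker, template))
```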
4. Is it Possible for an Attacker to Substantially Degrade System Performance?
It is important to note that a data set on which the model shows its worst result always exists. The task of an attacker is to find such data. This problem arises quite naturally from the fact that a machine learning model is often tested in a fairly static environment, and its quality is assessed based on the data on which it was trained.
Let's take a look at a system that often produces both type I and type II errors. For example, the antivirus blocked your file because it considered it malicious (although it is not), or the antivirus missed a file that really was malicious. In both cases, the user considers the system ineffective and may simply turn it off, although it is quite likely that a specific set of data caused the system to make a wrong decision.
The cost of type I and type II errors can differ for each particular system. A type I error can be cheaper for an antivirus because it is better to be safe and declare the file malicious. If the user disables the system and the file really turns out to be malicious, then we can say that the antivirus "warned" the user and the responsibility lies with the user. If we take, for example, a system designed for medical diagnostics, then both types of errors are quite expensive, because in either case the patient is at risk of improper treatment or even death.
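One simple way to express such asymmetric costs is to weight the two error types differently when evaluating a detector. In the sketch below the confusion counts and the cost values are invented purely for illustration; the real ratio is a business or safety decision.

```python
# A sketch of why the two error types have different costs. The labels,
# predictions, and cost values below are invented for illustration.
import numpy as np

y_true = np.array([0, 0, 0, 0, 1, 1, 0, 1, 0, 0])   # 1 = malicious file
y_pred = np.array([0, 1, 0, 0, 1, 0, 0, 1, 0, 0])   # detector's verdict

false_positives = np.sum((y_pred == 1) & (y_true == 0))  # type I: clean file blocked
false_negatives = np.sum((y_pred == 0) & (y_true == 1))  # type II: malware missed

# For an antivirus, a missed infection is usually far more expensive than an
# annoyed user; for medical diagnostics both weights would be high.
COST_FP = 1.0
COST_FN = 50.0
print("type I errors :", false_positives)
print("type II errors:", false_negatives)
print("total cost    :", false_positives * COST_FP + false_negatives * COST_FN)
```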
5. How to Exploit the Limitations of Machine Learning Models?
Let's see how attackers can use the properties of machine learning methods to disrupt a system without interfering with its learning process. It may seem that machine learning systems are protected from human intervention in the selection of specific data features.
One can say that there is no human factor involved in the decisions the model makes. The beauty of deep learning is that it is enough to feed the model practically "raw" data, and the model itself, through multiple layers of transformations, selects the features it considers most significant for making a decision. But is it really that good?
For example, researchers experimented with misleading deep learning models using traffic signs. For a positive result, it is enough for attackers to find areas on the object that knock the classifier off course and make it err. The experiments were carried out on a "STOP" sign, which, after small changes, was classified by the model as a "SPEED LIMIT 45" sign. The researchers tested their approach on other road signs and got positive results. These are quite common real-life situations: road signs can be covered with mud, dust, or snow.
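One widely known way to craft such small perturbations in the digital domain is the fast gradient sign method (FGSM). The sketch below uses a toy PyTorch model and a random image as placeholders; the traffic-sign research mentioned above used its own, physically realizable attack, so this is only meant to convey the general idea.

```python
# A minimal FGSM sketch: nudge every pixel a small step in the direction that
# increases the classification loss. Model and image are placeholders.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 43))  # toy 43-class sign classifier
model.eval()

image = torch.rand(1, 3, 32, 32)   # stand-in for a photo of a STOP sign
true_label = torch.tensor([0])     # assume class 0 is "STOP"
epsilon = 0.05                     # perturbation budget

image.requires_grad_(True)
loss = nn.functional.cross_entropy(model(image), true_label)
loss.backward()

# Perturb the image along the sign of the gradient and keep pixels in range
adversarial = (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

print("original prediction   :", model(image).argmax(dim=1).item())
print("adversarial prediction:", model(adversarial).argmax(dim=1).item())
```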
6. Classification of Existing Attacks
By Influence Type:
• Causative attacks affect the learning of the model through interference with the training data sample (a toy example is sketched after this list).
• Exploratory attacks use classifier vulnerabilities without affecting the training data set.
By Security Violation:
• Integrity attacks compromise the system through type II errors.
• Availability attacks cause the system to be shut down, usually based on type I and type II errors.
By Specificity:
• A targeted attack is aimed at changing the prediction of the classifier when working with a particular class.
• An indiscriminate attack is aimed at changing the decision of the classifier to any class except the correct one.
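As a toy illustration of a causative (poisoning) attack from this taxonomy, the sketch below flips a fraction of the training labels and compares the resulting model with one trained on clean data. The data is synthetic, and real poisoning attacks are usually far more subtle than label flipping.

```python
# A toy causative (poisoning) attack: flipping part of the training labels
# degrades the learned model. Data is synthetic and chosen for illustration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Attacker controls part of the training pipeline and flips 30% of the labels
rng = np.random.default_rng(1)
poisoned = y_train.copy()
idx = rng.choice(len(poisoned), size=int(0.3 * len(poisoned)), replace=False)
poisoned[idx] = 1 - poisoned[idx]
poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, poisoned)

print("clean accuracy   :", clean_model.score(X_test, y_test))
print("poisoned accuracy:", poisoned_model.score(X_test, y_test))
```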
7. Attack Prevention and Protection
In the context of security objectives, the goal of machine learning models is to identify malicious events and prevent them from interfering with the system.
Today, there is no reliable way to make machine learning models work with 100% accuracy, but there are several ways to make these models more resistant to malicious actors. Here is the main one: if it is possible not to use machine learning models in a sensitive environment, it is better not to use them, at least until better protection methods appear.
If the system is associated with performing important functions, for example, diagnosing diseases, detecting attacks on industrial facilities or driving an unmanned vehicle, the consequences of compromising the security of such a system can be catastrophic.
The problem of misclassifying malicious data samples is huge. The model did not see such samples in its training data set, so it will often be wrong about them. It is good practice to supplement your training data set with all the malicious data samples that are currently available, so that at least those cannot deceive you. But it is unlikely that you will be able to generate all possible malicious data samples.
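In code, this augmentation idea might look like the following sketch. The helper collect_known_malicious_samples and all the data are hypothetical placeholders standing in for whatever malicious samples you actually manage to gather.

```python
# A sketch of the data-augmentation idea above: retrain the model with the
# malicious samples you do have, so it at least stops being fooled by those.
# collect_known_malicious_samples is a hypothetical helper, not a real API.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def collect_known_malicious_samples():
    # Placeholder: in practice this data would come from malware feeds,
    # honeypots, red-team exercises, or adversarial example generators.
    return np.random.rand(200, 30), np.ones(200)

X_benign, y_benign = np.random.rand(1000, 30), np.zeros(1000)  # toy benign data
X_mal, y_mal = collect_known_malicious_samples()

# Augment the original training set with every malicious sample available
X_train = np.vstack([X_benign, X_mal])
y_train = np.concatenate([y_benign, y_mal])

model = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)
```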
You can also use a generative adversarial network (GAN), which consists of two neural networks: a generative one and a discriminative one. The task of the discriminative model is to learn to distinguish fake data from real data, and the task of the generative model is to learn to generate data that deceives the first model.
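A minimal GAN skeleton in PyTorch, matching the two-network setup just described, is sketched below. The layer sizes, learning rates, and the "real" data are placeholders chosen only to keep the example short.

```python
# A minimal GAN skeleton: a discriminator learns to tell real data from fakes,
# while a generator learns to produce fakes that fool it. Data is a placeholder.
import torch
import torch.nn as nn

LATENT, DATA_DIM = 16, 64

generator = nn.Sequential(nn.Linear(LATENT, 128), nn.ReLU(), nn.Linear(128, DATA_DIM))
discriminator = nn.Sequential(nn.Linear(DATA_DIM, 128), nn.ReLU(), nn.Linear(128, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
loss_fn = nn.BCEWithLogitsLoss()

real_data = torch.rand(512, DATA_DIM)  # stand-in for real samples

for step in range(100):
    real = real_data[torch.randint(0, len(real_data), (32,))]
    fake = generator(torch.randn(32, LATENT))

    # Discriminator step: label real samples 1, generated samples 0
    d_loss = loss_fn(discriminator(real), torch.ones(32, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make the discriminator output 1 for generated samples
    g_loss = loss_fn(discriminator(fake), torch.ones(32, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```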
8. Possible Consequences
So, what are the potential security implications of using machine learning? There have been debates on who should be responsible for errors of machine learning models. There are three groups of people who may influence the final result:
• Those who develop the algorithm.
• Those who provide the data.
• Those who use the system, being its owners.
At first, it seems that system developers have a huge impact on the final result. They select all basic parameters, create the algorithm, and do the final testing. But in fact, developers only make a software solution that must meet the predefined requirements.
As soon as the model begins to comply with the requirements and successfully passes several stages of tests, the work of the developer usually ends, and the model enters the operation stage, where some "bugs" may appear.
Developers usually do not have the entire data set at the training stage. This is one possible source of future bugs; in fact, there could be numerous reasons why bugs or errors appear. A very vivid example is the Twitter chatbot created by Microsoft, which was trained on real data and began to post racist tweets. The algorithm learned from the data it got and naturally began to imitate it.
It would seem that this was a great achievement of the developers, and exactly the behavior everyone wanted and expected. But the training data turned out to be inappropriate from a moral point of view, and the chatbot turned out to be unusable, simply because it learned everything extremely well.