Fending off attacks on AI-supported software – Secure by Design

Securing AI-supported software requires considering input, output and data processing separately in order to protect the product and its users against manipulation and deepfakes.

The omnipresence of artificial intelligence in the media has increased the pressure on the software industry to offer products with AI elements. Development teams are faced with the challenge of adding AI components to their products while protecting both their product and end users from attacks. While regular measures to increase IT security are already a hurdle for many companies, this is even more difficult for software with AI, as AI experts are only slowly beginning to systematically work on security issues and disseminate their findings.

In the development industry, with its frequently agile processes, IT security tends to lag behind feature development. OWASP, the NIST AI RMF, Google's CoSAI and others are trying to change this, but implementation is a challenge, especially for small and medium-sized companies. Even large corporations find it difficult to explain to their customers that an early and extensive investment in secure AI is a long-term competitive advantage. However, corporations are more likely to have the option of giving their developers further training or buying in external expertise.

Conversely, many customers see their demand for intelligent behavior as a simple feature request, since AI can easily be integrated via an API from external operators. But just because an API is easy to use does not mean that the end product is automatically secure.

The AI Act supports risk assessment

The EU AI Act helps security managers with risk assessment within the EU. It addresses multimodal AI models and specifically regulates data fusion and the associated data protection issues. It also gives consultants a strong argument for convincing clients to plan an AI project carefully: the act applies to all providers of AI systems on the European market and provides for penalties of up to forty million euros or seven percent of annual global turnover. Whether and to what extent a given system falls under the EU AI Act can be checked online. This tool also provides information on special requirements, such as an obligation to label AI-generated results in a machine-readable way.

The AI Act also defines which AI applications are permitted in the European Economic Area. A company should scrutinize this critically when dealing with customer requirements; otherwise it risks a consulting error for which the consulting company is liable.

As AI regulations are still evolving, developers should ensure that all AI functions can be switched off quickly without breaking the product.

Risk assessment – Technical and legal aspects

There are basically two ways to add artificial intelligence to a software product: as an in-house development or via an external AI provider. What both have in common is that the model works as a black box: although it is known in principle how AI models work, the exact decision-making process remains hidden. The results are not error-free, and the reasons for a particular result are not always comprehensible.

Security architects view AI – in very simplified terms – as a module with its own input, output and data processing (see Figure 1). Data security is relevant to all three aspects; in the case of AI, it also covers the underlying model, the training data and the physical storage. Another key aspect is the availability of the AI. Although this is the responsibility of the DevOps team, developers should bear in mind that a failure can lead to uncontrolled software behavior, which attackers can exploit. The teams should develop methods for the event that the AI no longer generates usable output. They should not rely solely on the error handling of the AI itself or that of the AI provider, as that error handling could also have failed or been manipulated.
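
A minimal sketch of such a safeguard in Python: call_model() is a placeholder for the in-house model or the provider's SDK, and the plausibility check and fallback answer are assumptions that would have to be defined per product.

    import logging

    AI_TIMEOUT_SECONDS = 5  # assumed time budget before falling back
    FALLBACK_ANSWER = "The assistant is currently unavailable."  # product-specific fallback

    def call_model(prompt: str, timeout: float) -> str:
        """Placeholder for the in-house model or the external provider's SDK."""
        raise NotImplementedError("wire up the real AI backend here")

    def is_plausible(answer: str) -> bool:
        """Own sanity check, independent of the provider's error handling."""
        return 0 < len(answer) < 10_000

    def ask_ai_with_fallback(prompt: str) -> str:
        """Query the AI module, but never let its failure break the product."""
        try:
            answer = call_model(prompt, timeout=AI_TIMEOUT_SECONDS)
        except Exception as exc:  # network errors, timeouts, provider outages
            logging.warning("AI call failed, falling back: %s", exc)
            return FALLBACK_ANSWER
        if not is_plausible(answer):
            logging.warning("AI returned unusable output, falling back")
            return FALLBACK_ANSWER
        return answer

The point of the wrapper is that the product always receives a defined answer, regardless of what the AI module does.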

In addition to the technical level, project teams must also consider the legal consequences, especially if user or company data is sent to external providers, possibly abroad. These providers may store data on their servers for the long term, including for AI training purposes. Only recently, cases became known in which data protection was violated on a large scale. When making requests to an AI, the system should therefore always check the transferred data. 

Minimizing data also conserves resources

When integrating an external AI, teams must consider additional aspects, for example the case that an attacker has gained control of its input and output. This requires automatic sanity checks, among other things. For teams that develop their AI themselves, the security aspects of data processing are more extensive, as security is entirely their own responsibility, including the training data sets and the model itself. In addition to securing the physical storage and the servers, it is also important to protect the data from theft and manipulation. So far, reports of theft have primarily involved internal attackers, but external attackers are also conceivable in principle.

The AI must also be protected against incorrect and malicious user input. First, AIs are fundamentally sensitive to the data they receive. Second, they are usually multi-tenant systems in which one user could try to access the data of others. Both require sanity checks on the input coming from the user as well as from the AI itself.

Tools now exist to help with risk assessment, such as the OWASP Top 10 for LLM Applications & Generative AI. A guide from the UK government's Department for Science, Innovation and Technology is somewhat more comprehensive and is supplemented with many practical instructions. The National Institute of Standards and Technology (NIST) provides guidelines for implementing governance tasks.

Many vulnerabilities in AI systems exploit problems that have long been known in their own right: a lack of adequate access controls and sufficient encryption, inadequate authentication of all communication participants including software modules, and missing definition, validation and sanitization of inputs and outputs. Attackers often do not target the AI modules themselves, but the systems or web portals that provide them. Risks emanating from the AI itself and its black box are currently much rarer. These are mostly attacks that manipulate the quality function of the AI, i.e. that influence how the AI makes decisions. Such attacks primarily attempt to manipulate training data sets or, in the case of adaptive systems, to feed in massive amounts of corrupting data to steer the system in a desired direction [1]. All protective measures should be applied permanently as part of AI lifecycle management and linked to the software release cycle.

Requests to the AI should always have a clearly defined structure, including requirements for the expected number of requests, the request rate and the characters used. Some AI providers check the content of requests themselves, but depending on the scenario it can also be useful to do this within the actual software product, both to prevent users from entering sensitive information and to fend off prompt hijacking or code injections. The aim is to identify requests that can manipulate the AI or even take control of it or its memory areas. Due to the diversity of applications, there is no universal recipe for recognizing dangerous requests. Development teams should consider how attackers could manipulate the AI in their product and how this can be prevented; bad user stories, for example, are suitable for this. Ideally, an external AI provider will have already implemented such checks.
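
What such a check looks like is product-specific. The following Python sketch uses made-up limits and purely illustrative injection patterns; it combines a length and character allowlist with a simple per-user rate limit.

    import re
    import time
    from collections import defaultdict, deque

    MAX_PROMPT_LENGTH = 2000                    # assumed product-specific limit
    MAX_REQUESTS_PER_MINUTE = 20                # assumed rate limit per user
    ALLOWED_CHARS = re.compile(r"^[\w\s.,;:!?()'\-]+$")   # example character allowlist
    SUSPICIOUS_PATTERNS = [                     # illustrative prompt-injection indicators only
        re.compile(r"ignore (all|previous) instructions", re.IGNORECASE),
        re.compile(r"system prompt", re.IGNORECASE),
    ]

    _request_times: dict[str, deque] = defaultdict(deque)

    def validate_request(user_id: str, prompt: str) -> bool:
        """Accept only requests that match the defined input structure."""
        now = time.monotonic()
        window = _request_times[user_id]
        while window and now - window[0] > 60:  # sliding one-minute window
            window.popleft()
        if len(window) >= MAX_REQUESTS_PER_MINUTE:
            return False                        # too many or too fast
        if not prompt or len(prompt) > MAX_PROMPT_LENGTH:
            return False                        # unexpected length
        if not ALLOWED_CHARS.match(prompt):
            return False                        # unexpected characters
        if any(p.search(prompt) for p in SUSPICIOUS_PATTERNS):
            return False                        # possible prompt hijacking attempt
        window.append(now)
        return True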

Attacks on an AI system target not only the AI itself, but also the infrastructure and data (Fig. 2).

(Image: OWASP AI Exchange/iX)

The development team should also be aware that external providers receive, process and often store or reuse end user data. In the event of data leaks, it may be relevant to prove whether this data has been leaked by the in-house software or the AI provider. The team must also consider encrypting the data transfer and authenticating the parties involved. Unfortunately, it has been shown several times in the past that not all AI companies comply with data protection regulations and use data for AI training purposes without authorization.

Multi-tenancy is associated with special security risks, particularly with regard to the isolation of clients. Although security is not normally one of the developers’ tasks, they should be aware of the risks and be able to take countermeasures (e.g. authentication or encryption).
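 
As an illustration, a hypothetical retrieval function could take the tenant identity from the authenticated session rather than from the prompt; the store.search() call below is a placeholder for whatever backend is actually used.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class TenantContext:
        """Identity established during authentication, never taken from the prompt."""
        tenant_id: str
        user_id: str

    def retrieve_context(store, ctx: TenantContext, query: str, limit: int = 5):
        """Fetch documents for the AI, always filtered by the caller's tenant.

        'store' stands for whatever retrieval backend is used; the essential
        point is that the tenant filter comes from the authenticated session
        and not from user-controlled input such as the prompt.
        """
        return store.search(query=query, filter={"tenant_id": ctx.tenant_id}, limit=limit)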

Secure results with output handling

In line with input control, the development teams should clearly define the output of the AI and check it for compliance with this specification. There should also be fail-safe routines that intervene if the output does not meet the specifications. As with the input, the definition of the expected output is specific and the teams must define it in accordance with the customer’s specifications. Developers should consider how attackers can cause the AI to produce unforeseen output and what poses a risk to the product and end users. Routines should intercept such behavior. This can also be evaluated in the team, for example through bad user stories.

Known, generic risks arise, for example, from AI responses that are too long, too numerous or too fast: the classic denial of service. Developers should also always intercept responses containing special characters that do not match the expected pattern. This protects both the end user and the product itself if it continues to work with the responses: possible dangers include code injections, prompt hijacking, logic bombs and more.
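
A sketch of such an output check in Python, again with illustrative limits, patterns and a product-specific fallback:

    import re

    MAX_ANSWER_LENGTH = 4000                      # assumed product-specific limit
    LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)
    FORBIDDEN_PATTERNS = [                        # illustrative examples only
        re.compile(r"<script", re.IGNORECASE),    # HTML/JS injection into the UI
        re.compile(r"\bdrop\s+table\b", re.IGNORECASE),  # SQL fragments in downstream processing
    ]
    FALLBACK_ANSWER = "The answer could not be displayed."   # product-specific

    def check_ai_output(answer: str, allow_links: bool = False) -> str:
        """Pass the answer on only if it meets the defined output specification."""
        if not answer or len(answer) > MAX_ANSWER_LENGTH:
            return FALLBACK_ANSWER
        if any(p.search(answer) for p in FORBIDDEN_PATTERNS):
            return FALLBACK_ANSWER
        if not allow_links and LINK_PATTERN.search(answer):
            return FALLBACK_ANSWER                # links to external sources need a separate check
        return answer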

In principle, teams should limit automated use cases according to the available resources. A simple example of misuse is the automatic mass sending of AI-generated emails. For end users, output that encourages interaction with external sources, such as clicking external links or paying for something, is a potential risk. Development teams that have such use cases must control the AI output with additional routines that are independent of the AI.

The EU AI Act stipulates a further requirement for the output: AI-generated content must be labeled as such to make it more difficult for deepfakes to spread [2]. Attackers can also distribute deepfakes for disinformation campaigns via vulnerabilities in the AI.

Communication and data security

One principle of IT security is that whenever communication takes place, it should be secured, which also applies to data exchange between software components. This is particularly relevant when data is not only processed locally. Development teams should pay particular attention to this aspect when integrating an external AI. It is very likely that an attacker will first try to access or manipulate an insecure data stream before attacking secure servers or the AI itself.
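
As a sketch, a call to a hypothetical internal AI endpoint via the requests library could enforce an encrypted, mutually authenticated channel; the URL and all file names are placeholders.

    import requests

    AI_ENDPOINT = "https://ai.internal.example/v1/generate"   # hypothetical internal endpoint

    def query_ai_service(prompt: str) -> dict:
        """Send requests over an encrypted, mutually authenticated channel only."""
        response = requests.post(
            AI_ENDPOINT,
            json={"prompt": prompt},
            cert=("client.crt", "client.key"),  # client certificate: the service authenticates us
            verify="internal-ca.pem",           # we in turn verify the service's certificate
            timeout=10,
        )
        response.raise_for_status()
        return response.json()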

Adaptive AIs can learn from user input, meaning that sensitive data can enter the AI's knowledge base and be passed on to other users. Developers can only counter this through strict data minimization. Users should be made aware that the AI system processes and stores the data they enter. Once such data has entered the AI, removing it completely from the knowledge base and the model is a challenge.
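
A minimal sketch of such data minimization, with deliberately simple and purely illustrative redaction patterns (no substitute for a real data loss prevention solution):

    import re

    # Purely illustrative patterns; real products need rules matched to their own data.
    REDACTION_RULES = [
        (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[email]"),
        (re.compile(r"\b(?:\d[ -]?){13,19}\b"), "[card-number]"),
        (re.compile(r"\b\+?\d[\d /-]{7,}\d\b"), "[phone]"),
    ]

    def minimize(prompt: str) -> str:
        """Strip obviously sensitive data before it leaves the product."""
        for pattern, placeholder in REDACTION_RULES:
            prompt = pattern.sub(placeholder, prompt)
        return prompt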

Internal data transfers should likewise move only the information that is strictly necessary; this also applies to technical data. Corresponding concepts should be incorporated into the requirements specifications and reviewed with each release.

The topic of data security for AI applications is very broad. Development teams should clearly define which aspects of data security they can cover themselves and which are not within their remit. In the latter case, they should determine who is responsible instead (for example, the service provider) [3]. Clarifying responsibilities and the ownership of domains is an essential part of the security and risk analysis. It is also important to visualize the data flow and to check, as part of regular analyses, whether attackers can access or manipulate data. An understanding of the data flow also helps in complying with the required transparency towards end users.

Machine-readable labeling of AI-generated data shows which content was created by real users and which by the AI. This is important in order to be able to identify incorrect AI-generated content and clean up the system. It also enables the identification of AI-generated deepfakes and allows the AI data flow to be traced. This makes it easier to analyze data leaks, manipulation and disinformation campaigns.
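
How such a label is implemented depends on the product and the applicable standards. As a simple Python sketch, AI-generated content could be wrapped in a provenance record; the field names are illustrative.

    import hashlib
    import json
    from datetime import datetime, timezone

    def label_ai_content(text: str, model_name: str) -> str:
        """Wrap AI-generated content in a machine-readable provenance record.

        The field names are illustrative; the essential point is that origin,
        model and creation time stay attached to the content itself.
        """
        record = {
            "content": text,
            "generated_by_ai": True,
            "model": model_name,
            "created_at": datetime.now(timezone.utc).isoformat(),
            "content_sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        }
        return json.dumps(record, ensure_ascii=False)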

Researchers have been criticizing the inadequacies of testing AI systems for some time [4]. Companies often need dedicated software testers with AI experience, who are not easy to find. The main goals of AI testing are to ensure that the end product reacts predictably even if the AI misbehaves, that it is not corrupted, and that unacceptable behavior and responses from the AI are intercepted.
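
As a sketch, such tests can simulate a misbehaving or failing AI backend. The example below assumes the ask_ai_with_fallback() wrapper sketched earlier lives in a hypothetical module product.ai_module.

    from unittest.mock import patch

    import pytest

    # Hypothetical module containing the ask_ai_with_fallback() wrapper sketched earlier.
    from product.ai_module import FALLBACK_ANSWER, ask_ai_with_fallback

    @pytest.mark.parametrize("bad_output", ["", "x" * 100_000])
    def test_unusable_ai_output_is_intercepted(bad_output):
        # The product must fall back instead of passing unusable output on.
        with patch("product.ai_module.call_model", return_value=bad_output):
            assert ask_ai_with_fallback("harmless question") == FALLBACK_ANSWER

    def test_ai_failure_does_not_break_the_product():
        # A provider outage or timeout must not leave the product in an undefined state.
        with patch("product.ai_module.call_model", side_effect=TimeoutError):
            assert ask_ai_with_fallback("harmless question") == FALLBACK_ANSWER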

Weighing up potential damage sensibly

Development teams faced with the challenge of integrating AI into a product should proceed as they would when integrating a new framework. It is not just a simple feature request, but a much more complex undertaking involving considerable reorganization and restructuring. The team is responsible for the implementation and should therefore insist on opportunities for further training. A careless implementation ultimately reflects on the entire company and can result in severe fines under the EU AI Act.

It is a particular challenge to prevent the improper generation and distribution of deepfakes. Supposedly inconspicuous software products can end up in the crosshairs of politically motivated actors, who use AI applications primarily to disguise the origin of their disinformation campaigns. Even smaller applications suddenly face security requirements they would never have been exposed to in the past. However, awareness of this is not yet widespread.

The entire system must not be allowed to fall into an uncontrolled or unexpected state as a result of AI. In addition to clear specifications for input and output, the AI should be isolated from the software product. The product must function at all times and deliver usable output even if the AI fails or produces output that violates the guidelines.

IT security is always a question of probabilities. It is impossible to achieve a system that is 100% secure at all times. The teams should weigh up which attack scenarios are most likely for the product and what damage could occur. Together with the customer, the team must develop a sensible solution that takes resources into account.

Sources

  • [1] K. Hartmann and C. Steup; Hacking the AI - the Next Generation of Hijacked Systems; in: Proceedings of the 12th International Conference on Cyber Conflict (CyCon), IEEE; Tallinn, Estonia, 2020.
  • [2] K. Giles, K. Hartmann and M. Mustaffa; The Role of Deepfakes in Malign Influence Campaigns; NATO Strategic Communications Centre of Excellence; Riga, 2021.
  • [3] K. Huang, J. Huang and D. Catteddu; GenAI Data Security; in: Generative AI Security: Theories and Practices; Cham, 2024, pp. 133-162.
  • [4] A. Aleti; Software Testing of Generative AI Systems: Challenges and Opportunities; in: 2023 IEEE/ACM International Conference on Software Engineering: Future of Software Engineering (ICSE-FoSE); Melbourne, 2023.

About the Author

Kim Hartmann is Director of Cyber and Information Technology at the Conflict Studies Research Centre in the UK and an EU peer reviewer for projects in IT risk assessment, cyber and software security.

NOTE: This article was originally published in heise online.