Probably the most fundamental cybersecurity concern when it comes to AI is whether the technology will ultimately be a force for good or for evil; whether it will be more useful to cyber attackers or cyber defenders.
Digital defenders are, of course, deploying AI to defend against cyber attacks. But just as firms are using GenAI to enhance productivity, so too are hackers. Attacks once required considerable resources – perpetrators had to identify high-value targets, study patterns of communication and research company documents, for instance. But machines can now complete this prep work in a fraction of the time.
How to detect a deepfake
AI is both changing the scale at which traditional attacks can be launched and leading to the emergence of new threats.
For instance, the use of deepfakes – false, AI-generated images or videos of real people – is on the rise. A 2024 Ofcom report found that 60% of people in the UK have encountered at least one deepfake. By 2026, 30% of organisations will consider their current authentication or digital ID tooling inadequate to fight deepfakes, according to Gartner, a research consultancy.
This is bad news for businesses, which are already being targeted in customised phishing attacks that use the technology. Examples of successful deepfake attacks have made headlines. An employee in Hong Kong, for instance, transferred about £20m to cyber attackers after being bamboozled by a deepfake posing as a senior executive.
Is AI a blessing or a curse for cybersecurity?
Many experts worry that AI could be a massively damaging development in cybersecurity. With AI tools, malicious actors can engineer attacks much faster than before
Deepfakes once came with tell-tale signs that users were speaking with a digital impostor – say, glitching speech or a nose floating uncannily out of place. But as the technology improves, deepfakes are becoming nearly impossible to spot. So says Dr Andrew Newell, chief scientific officer at iProov, a digital authentication firm.
Marco Pereira, global head of cybersecurity at Capgemini, agrees. “If you have someone on a video call that looks like the CEO, sounds like the CEO, has the right background – all it takes to fool you is them saying, ‘Oh, my camera is not working well’,” he explains.
Some cybersecurity experts note that there are still some tell-tale signs to look out for – although these might not exist for much longer.
Simon Newman is CEO of the Cyber Resilience Centre for London, a government-funded not-for-profit body helping businesses and charities to improve their defences. He advises looking for details on the face of the person that appear unnatural – perhaps unusual lip colours, facial expressions, strange shadows or blurring inside the mouth.
Ask yourself: Do the lips appear to be moving as they would with the matching words? Do the surrounding facial expressions look natural?
Still, detecting the technical flaws in deepfakes will become increasingly difficult as the tools continue to develop. Maintaining high contextual alertness may therefore be the most effective way to counter the risk.
Instead of focusing on people’s features, consider instead the stated purpose of the call and how the participants are interacting. Does anything seem out of context?
As deepfakes become more sophisticated, then, spotting them may become less about trusting your eyes, and more about trusting your gut.
New versions of old attack methods
Democratised AI has also led to the creation of new, ‘smarter’ viruses and worms. An example is the Morris II computer worm, which uses GenAI to clone itself. Researchers at Intuit, Cornell Tech and the Technion Israel Institute of Technology conducted an experiment in which they used Morris II to break the defences of GenAI-powered email assistants using so-called poison prompts.
Emails stuffed with these prompts caused the assistants to comply with their commands, compelling the bots to send spam to other recipients and exfiltrate personal data from their targets. They then cloned themselves to other AI assistant clients, which mounted similar attacks.
The researchers hope this proof-of-concept worm will serve as a warning that might prevent the appearance of similar species in the wild. They have alerted the developers of the three GenAI models they’d successfully targeted, which are working to patch the flaws exposed by Morris II.
Attackers are also using AI to supercharge traditional threats – using ChatGPT to create more bespoke, targeted and grammatically correct phishing emails, for instance.
Moreover, GenAI could help criminals to make sense of metadata – the data about data. The content of a text message is data. Metadata includes information such as when the message was sent, where it was sent from, who sent it and to whom.
One piece of metadata on its own is pretty much worthless. But when volumes of metadata are analysed by machines, patterns emerge that are sometimes more revealing than the contents of the messages alone.
Christine Gadsby, chief information security officer at BlackBerry, says that because metadata is the language of machines, computing tools are very good at gathering and making sense of it. “AI will enable attackers to link this data to individuals,” she warns. “What would have taken a human two years to analyse will take two minutes with AI.”
But it’s not only smarter threats that security leaders must worry about. AI systems themselves can also present significant risks.
Securing data inputs
Foundation models, the bedrock of GenAI, are data-hungry. If businesses want to differentiate themselves, they must feed these models with proprietary information, including customer and corporate data. But doing so can expose this sensitive material to the outside world – and the bad actors operating in it – potentially contravening the General Data Protection Regulation (GDPR) in the process.
Dr Sharon Richardson, technical director and AI lead at engineering firm Hoare Lea, sums up the situation: “From day one, these models were a very different beast from a security standpoint. It’s hard to bake security into the neural network itself because its strength comes from hoovering up millions of documents. This is not a problem we’ve solved.
”The Open Worldwide Application Security Project, a not-for-profit foundation working to improve cybersecurity, cites data leakage as one of the most significant threats to the LLMs on which most GenAI tech is based. This risk drew considerable public attention in 2023 when employees at Samsung accidentally released sensitive corporate information via ChatGPT.
What is shadow AI?
Employees’ use of unvetted GenAI tools in the workplace can be a nightmare for data security
The task of safeguarding the data being used takes on a new meaning with the latest GenAI tools, since it’s hard to control how the information is processed. Training data can get exposed as these systems work to organise unstructured material. It’s why some businesses are focusing their efforts on securing inputs. Swiss menswear company TBô, for instance, carefully labels and anonymises information on customers before feeding this into its model.
Smart organisations are taking a multi-pronged approach to managing the risk. One measure is permissions-based access for specific GenAI tools, under which only certain people are authorised to view classified data outputs. Another control is differential privacy, a statistical technique that allows the sharing of aggregated data while protecting individual privacy. And then there is the feeding of pseudonymised, encrypted or synthetic data into models, with tools that can randomise data sets effectively.
Data minimisation is vital, stresses Pete Ansell, chief technology officer at IT consultancy Privacy Culture.
“Never push more data into the large language model than you need to,” he advises. “If you don’t have really mature data-management processes, you won’t know what you’re sending to the model.”
Understanding the attack surface that an LLM might expose is also important, which is why retrieval-augmented generation (RAG) is growing in popularity. This is a process in which LLMs reference authoritative data that sits outside the training sources before generating a response.
RAG users don’t share vast amounts of raw data with the model itself. Access is via a secure vector database – a specialised storage system for multi-dimensional data. A RAG system will retrieve sensitive information only when it’s relevant to a query; it won’t hoover up countless data points.
“RAG is really good from the perspectives of both data security and intellectual property protection, since the business retains the data and the library of information the LLM is referencing,” Ansell says.
But he adds that “best practice around identifiable personal information and cybersecurity should also apply to business-level data”.
Such techniques don’t just protect sensitive material from cybercriminals. They also enable businesses to lift and shift learning from one LLM to another since, in practice, it’s not possible to trace the data back to its original source.
Businesses can also improve their AI-related data security by creating a multi-disciplinary steering group, conducting impact assessments, providing AI awareness training and keeping humans in the loop on all aspects of model development.
One of the biggest challenges facing the sector is that sensitive corporate data still has to leave localised servers and be processed in the cloud at data centres owned by one of the tech giants, which control most of the popular AI tools.
“For a brief moment, data can be sitting on a server outside your control, which is a potential security breach,” Richardson says.
Open-source models are therefore becoming increasingly popular, as they enable IT teams to externally audit LLMs, spot security flaws and have them rectified by a developer community.
Bharat Mistry, field CTO at Trend Micro, an IT security company, says cyber attackers will soon begin targeting AI models themselves, if they are not already doing so.
For example, cybercriminals could infiltrate an organisation and corrupt its AI systems with dodgy data. After a brief period of havoc, the criminals would inform the organisation that they were responsible for the attack and demand a ransom to restore operations.
An over-reliance on AI could exacerbate the impact of such an attack, Mistry says. Even with powerful ransomware attacks, businesses were able to make last-ditch, paper-based contingency plans to stay operational. But operating on analogue, even temporarily, will be almost impossible as organisations become dependent on AI.
Attackers could also add an ‘extra layer’ to GenAI tools, enabling them access to all of the data entered into the system. In this case, the model would appear to operate normally; users would have no reason to distrust the tool and might upload all sorts of confidential information. But if a malicious actor has added a ‘man-in-the-middle’ on the user’s device, all the data fed into it will pass into the hands of the attacker. Employees working remotely are especially vulnerable to this type of breach.
AI’s inherent vulnerabilities
But how would an AI system be corrupted in the first place? “Prompt injection is currently the most common form of attack observed against LLMs,” explains Kevin Breen, director of cyber threat research at Immersive Labs. “The focus is on tricking the model into revealing its underlying instructions or to trick the model into generating content it should not be allowed to create.”
Another potential weakness stems from AI’s inability to access data and information that is more current than the system’s most recent training update. To counter this limitation, LLMs have an added capability to incorporate functions into the AI context through a process known as function calling.
What’s a prompt injection?
These new attack models threaten to turn AI’s capabilities against itself
Breen explains that accessing up-to-date weather information is a common example of such an operation. “Asking an application what the weather is like in London, for instance, will prompt the AI to tell the application what function to use and what data to send. As these functions are sent to the AI, they become part of the context.”
Malicious users can modify the context with a prompt injection and force the AI to list all of its functions, signatures and parameters, warns Breen. “If developers aren’t properly sanitising these results, this can lead to attacks like SQL injection or even code execution, if some functions are able to run code.”
And, since LLMs are used to pass data to third-party applications and services, the UK’s National Cyber Security Centre has warned that malicious prompt injection will become a greater source of risk in the near term.
For this reason, any business training LLMs on sensitive data such as customer records or financial information must be especially vigilant, explains Dr Peter Garraghan, a professor of computer science at Lancaster University. He adds that the risks of improperly secured AI extend beyond data leakage.
“Malicious actors can potentially exploit vulnerabilities to manipulate model outputs, leading to incorrect decisions or biased results. This could have severe consequences in high-stakes applications like credit scoring or content moderation.”