AI Hallucinations: A Growing Cybersecurity Threat?

From existential risks to misinformation, the emergence and rapid improvement of generative AI come with many concerns. With powerful generative models like ChatGPT available to almost anyone online, the flaws in these models get amplified and can turn nefarious in the wrong hands. This article focuses on the problem of AI hallucinations in large language models (LLMs) and the potential cybersecurity threats they might pose.

What are AI Hallucinations? 

While it might seem like they’re really smart, generative language models work by using vast swaths of training data from the internet to essentially predict the next word in a sentence. The process usually works seamlessly for the latest models, like GPT-4, the model behind the latest version of ChatGPT, which produces consistent (if rather repetitive) answers to almost any question.
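
To make that mechanism concrete, here’s a minimal sketch of next-word prediction using the open-source Hugging Face transformers library and the small GPT-2 model (both are assumptions chosen for illustration; the hosted models behind ChatGPT are far larger, but the principle is the same, and a backend such as PyTorch needs to be installed):

    # Toy demonstration of next-word prediction, the mechanism described above.
    # Assumes the "transformers" library plus a backend such as PyTorch.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    prompt = "The capital of Australia is"
    result = generator(prompt, max_new_tokens=8)

    # The model continues the text with whatever words it judges most likely,
    # whether or not those words happen to be true.
    print(result[0]["generated_text"])

A small model like GPT-2 will happily produce a fluent but possibly wrong continuation here, which is hallucination in miniature.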

However, there are still cases where these models spit out inaccurate results. These hallucinations are falsehoods produced by the underlying algorithm due to a range of possible factors, such as: 

  • Inherent Noisiness—Training data for generative language models comes from the internet, which has both accurate and inaccurate information. A model might learn and reproduce falsehoods based on this noise. 
  • Overgeneralization—Given the vast amount of information, the model might make overgeneralized statements in an attempt to provide a relevant answer. 
  • Model Complexity—Extremely large models have billions, or even trillions, of parameters. It’s challenging to fully understand and control how such a model will behave in every scenario. 
  • Overfitting—The model fits its training data too closely and doesn’t generalize well to new inputs, producing confident answers that don’t hold up outside the examples it memorized.

A high-profile example of AI hallucinations emerged when Google’s Bard chatbot gave a false answer about the James Webb Space Telescope at the bot’s inaugural demo (incidentally, parent company Alphabet lost $100 billion in market value after the mishap).

How Could AI Hallucinations Cause Cyber Risks? 

While these hallucinations are annoying (and sometimes costly) for the companies that release generative language models, it’s perhaps not immediately clear how the falsehoods they sometimes produce could increase cybersecurity risks.  

A pertinent real-world example clarifies one type of risk: malicious code packages. Research from June 2023 highlighted a proof-of-concept attack that leveraged AI hallucinations to spread malicious code into developer environments.  

In the attack, researchers asked ChatGPT to recommend a package that could solve a coding problem. The model replied with recommendations for a package that didn’t exist (in other words, the response was hallucinated).  

A threat actor could then create malicious code, publish it to a public repository such as PyPI or npm under the name of that previously non-existent package, and wait. When other users ask a similar question and ChatGPT recommends the same non-existent package, they may end up installing the malicious code in their own development environments.

While this so-called AI package hallucination sounds like a convoluted scenario, it is indeed quite plausible. A recent survey found that 92 percent of programmers use AI tools like ChatGPT in their work. So, it’s clearly not beyond the realm of possibility that hundreds of programmers ask these tools for library or other package recommendations each day.  

In fact, the methodology for this research was to collect the most frequently asked coding questions on Stack Overflow, put them to ChatGPT, and see which responses involved hallucinated code packages. As developers move to generative AI models, they’ll likely ask them the same questions they would previously have posted on Stack Overflow.
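
One practical defense is to verify that a recommended package actually exists (and is reputable) before installing it. The following is a minimal sketch, assuming the Python ecosystem, the requests library, and PyPI’s public JSON API at https://pypi.org/pypi/<name>/json; the package name used is hypothetical:

    # Minimal sanity check before installing a package suggested by a chatbot.
    # Assumes the "requests" library; PyPI returns HTTP 404 for packages that
    # don't exist, which is exactly what a hallucinated name would produce.
    import requests

    def package_exists_on_pypi(name: str) -> bool:
        """Return True if the package is published on PyPI."""
        response = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
        return response.status_code == 200

    suggested = "useful-sounding-package"  # hypothetical chatbot recommendation
    if package_exists_on_pypi(suggested):
        print(f"{suggested} exists on PyPI, but still vet it before installing.")
    else:
        print(f"{suggested} is not on PyPI; the recommendation may be hallucinated.")

Of course, an existence check only catches part of the problem: once an attacker registers the hallucinated name, the package will exist and still be malicious, so maintainer history, release dates, and the code itself deserve a look too.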

AI hallucinations don’t necessarily require malice on the part of threat actors to cause cybersecurity risks. Remember that the coding capability of generative language models comes from training data drawn from public online repositories and other unverified open source code. This means developers who ask generative models to solve coding problems could adopt insecure coding practices or introduce vulnerabilities into their codebases based on the answers given by ChatGPT and similar tools; the sketch below shows the kind of pattern a careful reviewer would catch.
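
As a purely hypothetical illustration (not taken from any specific model output), the first function below shows the kind of SQL-injection-prone pattern that is common in older public code a model may have learned from, while the second shows the safer parameterized equivalent:

    # Hypothetical example of insecure code a model might echo from its
    # training data, alongside the safe, parameterized alternative.
    import sqlite3

    def find_user_insecure(conn: sqlite3.Connection, username: str):
        # String formatting lets a crafted username rewrite the query (SQL injection).
        query = f"SELECT id, email FROM users WHERE username = '{username}'"
        return conn.execute(query).fetchall()

    def find_user_safe(conn: sqlite3.Connection, username: str):
        # Parameter binding keeps user input as data, never as executable SQL.
        query = "SELECT id, email FROM users WHERE username = ?"
        return conn.execute(query, (username,)).fetchall()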

It’s not just developers who use these tools, though, and the cybersecurity risks of hallucinations don’t relate only to code. Consider the problem of compliance with data privacy or cybersecurity regulations. Someone involved in helping to maintain or achieve compliance could ask a generative AI model a question about a specific regulation.

There could be cases where the model produces a hallucinated response simply because it doesn’t understand that specific regulation well enough. If the person who prompted the response then applies this advice to your IT environment, the result could be non-compliance. The risk here is misinformation, with users relying on these models for important information and getting confident but inaccurate answers.

Tips to Reduce AI Hallucination Risks 

The initial inclination might be to ban these tools on your organization’s network, but that will lead to frustrated employees, workarounds and perhaps even a loss of competitiveness (because employees at other companies are likely using them). Instead, here are some tips to use AI more safely and avoid falling prey to risks from hallucinated responses: 

  • Stringently and regularly review code using both automated tools (e.g., static application security testing tools) and skilled developers acting as manual quality checkers; a minimal sketch of automating such a check appears after this list. The extra time and effort to vet human- or AI-generated code is well worth it.
  • Update security awareness training to advise about AI hallucination risks. In general, recommend double-checking any information presented as factual by a generative model.  
  • Set boundaries by clearly defining the scope within which AI tools can be used at your company. Update your IT security policy in line with these boundaries. For tasks where precision and accuracy are critical, consider limiting the AI’s role or using it only as a secondary source of information.  
  • Alert users to the fact that their human judgment still takes precedence over the output of seemingly amazing generative AI models. Communicate how employees might use AI to enhance productivity and creativity without becoming overly reliant on it.
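
To make the first tip concrete, here is a minimal sketch of an automated review gate. It assumes the open-source Bandit scanner for Python and that its JSON report lists findings under a "results" key with an "issue_severity" field; swap in whichever SAST tool fits your stack:

    # Minimal sketch of an automated code-review gate.
    # Assumes the Bandit scanner is installed (pip install bandit).
    import json
    import subprocess
    import sys

    def high_severity_findings(source_dir: str) -> list:
        # Run Bandit recursively over the source tree and request JSON output.
        completed = subprocess.run(
            ["bandit", "-r", source_dir, "-f", "json"],
            capture_output=True,
            text=True,
        )
        # Assumption: the report lists findings under "results", each carrying
        # an "issue_severity" of LOW, MEDIUM or HIGH.
        report = json.loads(completed.stdout or "{}")
        return [r for r in report.get("results", []) if r.get("issue_severity") == "HIGH"]

    if __name__ == "__main__":
        findings = high_severity_findings("src")
        for finding in findings:
            print(f'{finding.get("filename")}:{finding.get("line_number")} {finding.get("issue_text")}')
        # Fail the build (or block the merge) if anything high-severity turns up.
        sys.exit(1 if findings else 0)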

AI generative models can boost productivity, offer insights, and automate repetitive tasks. However, the key to safely harnessing their potential lies in understanding their limitations and using them as a complementary tool rather than a definitive authority.
