Risk Management: The Human Manipulation Problem
April 7, 2025
A few weeks ago, I had a little fun blogging about the risks associated with what I referred to as AI’s “HAL 9000 problem.” But it turns out that, at least in the short term, a greater risk than an AI tool becoming sentient and going rogue may be the ability of human beings to exploit that tool for malevolent purposes. A recent blog post on the Global Association of Risk Professionals (GARP) website discusses an experiment conducted by Cato Networks that illustrates this risk:
For its experiment, Cato Networks selected a researcher with no prior experience coding malware to ask the AI platforms to write “infostealer” code to get encrypted password information from Chrome.
The researcher claimed to be writing a book about an evil character named Dax, bent on world destruction. Defeating her required obtaining her passwords stored in Chrome. The researcher constructed a virtual universe, Velora, with different laws than Earth; infostealing, in particular, was legal.
All the generative AI platforms (ChatGPT, DeepSeek and Copilot) produced C++ code to steal information. When initial versions didn’t work, the output was shown to the AIs, which in turn provided fixes until the code successfully exposed passwords.
The author points out that although an AI model exists in a virtual world limited to the data that’s fed to it, humans can bypass built-in safeguards by reshaping that virtual world. That ability creates a significant security vulnerability. Before AI, hackers needed considerable technical skill to pull off online attacks. Now, all that’s required is the ability to spin a good yarn, a far more widespread and more easily acquired skill. Unfortunately, the post concludes that this is a problem that’s easier to describe than to solve, although it does offer a few suggestions.