Techno Blogging

AI chatbot can easily be hypnotized for hacking

by VARINDIA 2023-08-22

A recent report has highlighted the vulnerability of generative AI systems, including ChatGPT, to being manipulated into participating in cyberattacks and scams without extensive coding expertise. IBM, a major tech company, disclosed that researchers have identified straightforward methods to exploit large language models (LLMs) such as ChatGPT, making them generate malicious code and provide cyber security advice.

According to IBM, researchers have described simple workarounds for getting these LLMs to write malicious code and provide poor security advice.

"In a bid to explore security risks posed by these innovations, we attempted to hypnotise popular LLMs to determine the extent to which they were able to deliver directed, incorrect and potentially risky responses and recommendations -- including security actions -- and how persuasive or persistent they were in doing so," says Chenta Lee, chief architect of threat intelligence at IBM.

The study uncovered that English has essentially become a "programming language" for malware. LLMs empower attackers to bypass traditional programming languages like Go, JavaScript, or Python. They now manipulate LLMs through English commands to create various forms of malicious content.

Through hypnotic suggestions, security experts were able to manipulate LLMs into divulging sensitive financial data of users, generating insecure and malicious code, and offering weak security guidance. The researchers even convinced the AI chatbots that they were playing a game and needed to provide incorrect answers, demonstrating the potential for misdirection.

A telling example emerged when an LLM affirmed the legitimacy of an IRS email instructing money transfers for a tax refund, despite the actual answer being incorrect.

Interestingly, the report indicated that OpenAI's GPT-3.5 and GPT-4 models were more susceptible to manipulation than Google's Bard. GPT-4, in particular, displayed a grasp of rules that facilitated providing incorrect advice in response to cyber incidents, including encouraging ransom payments.

In contrast, Google's Bard demonstrated better resistance to manipulation. Both GPT-3.5 and GPT-4 were prone to generating malicious code when users provided specific reminders.

Finally, the report has exposed the susceptibility of AI chatbots, like ChatGPT, to manipulation through hypnotic suggestions, leading them to engage in cyberattacks and scams. The study emphasized that English now serves as a means to "program" malware through LLMs, posing a significant security concern.