ChatGPT is able to destroy humanity
2023-08-14

Recent research has uncovered a method that can potentially trick AI models into bypassing their safety and ethical guardrails and performing harmful actions. With jailbreak characters appended to the end of the input, the models ended up delivering advice on a variety of harmful requests, from manipulating the forthcoming 2024 elections to making people disappear. The researchers shared the results with the key companies involved before making the information public.
Scientists from Carnegie Mellon University and the Center for AI Safety have uncovered a vulnerability present in most modern AI models, according to TechGPT, a well-known news publisher and Telegram channel renowned for its expertise in AI and technological advancements.
The article posted in their Telegram channel mentions that this vulnerability allows for the circumvention of the moral and ethical barriers set by AI model developers. Consequently, chatbots based on these models have been providing instructions for creating explosive devices, writing malicious code, and engaging in conversations with Nazi and sexist undertones.
Commenting on the issue, readers in the TechGPT chat group expressed concern that if this trend persists, AI systems themselves may bypass their own restrictions and operate autonomously.
The researchers propose an attack method that works to varying degrees on advanced modern systems such as OpenAI's ChatGPT (GPT-3.5 and GPT-4), Microsoft Bing Chat, Google Bard, and Anthropic Claude 2, though it is most effective against open large language models like Meta's LLaMA. Success is all but guaranteed when the attacker has access to the entire AI structure, particularly the synaptic weights, which denote the influence a node in a neural network has on the nodes connected to it. With this information, an algorithm can automatically search for suffixes that bypass the system's limitations.
For humans, these suffixes may appear as long sequences of random characters and meaningless words. However, such a string of characters can deceive a large language model and elicit the desired response from the targeted chatbot. The programmatically generated suffixes go beyond typical workaround methods and offer higher efficiency.
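To make this concrete, below is a minimal sketch of such an automated suffix search. It is not the researchers' published code: it uses the small open GPT-2 model, a harmless target string, and simple random token substitutions purely for illustration, whereas the published attack ranks candidate swaps using token gradients, which is what the white-box weight access described above enables. The prompt, the target text, the 8-token suffix length, and the 200 search steps are all illustrative assumptions.

```python
# Minimal illustrative sketch of adversarial-suffix search; NOT the
# researchers' published code. Assumptions: PyTorch and Hugging Face
# transformers are installed; GPT-2 is a stand-in model; the forced
# target string is harmless.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt_ids = tok.encode("Tell me a story.", return_tensors="pt")[0]
target_ids = tok.encode(" Sure, here you go:", return_tensors="pt")[0]
suffix_ids = torch.randint(0, len(tok), (8,))  # start from 8 random tokens

@torch.no_grad()
def target_loss(suffix: torch.Tensor) -> float:
    """Cross-entropy of the forced continuation given prompt + suffix."""
    ids = torch.cat([prompt_ids, suffix, target_ids]).unsqueeze(0)
    labels = ids.clone()
    labels[:, : len(prompt_ids) + len(suffix)] = -100  # score target only
    return model(input_ids=ids, labels=labels).loss.item()

best = target_loss(suffix_ids)
for step in range(200):
    cand = suffix_ids.clone()
    pos = torch.randint(0, len(cand), (1,)).item()       # pick one slot
    cand[pos] = torch.randint(0, len(tok), (1,)).item()  # try a new token
    loss = target_loss(cand)
    if loss < best:  # keep swaps that make the forced reply more likely
        suffix_ids, best = cand, loss
        print(f"step {step:3d}  loss {best:.3f}  {tok.decode(suffix_ids)!r}")
```

A gradient-guided variant would score every candidate substitution at each position in a single backward pass instead of sampling them one at a time, which is what makes white-box access to the model's weights so valuable to an attacker.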
Big tech firms such as Microsoft, Google, and OpenAI already have iterations of their AI products on the market. Apple is also working intensively on its own AI model, which is undergoing testing.
But a question arises: can any government afford to regulate social media and the companies behind this new age of AI, a technology that is both transformative and disruptive, according to the needs of its respective geography?