Microsoft prevents AI chatbots from being used for harm (ok)

If you're planning to use an AI chatbot for nefarious purposes, Microsoft is waiting around the corner, or so it wants to think.

azure openai

In a post in her blog published today, the company announced a new feature coming to Azure AI and the Azure OpenAI Service, which developers use to build artificial intelligence applications and custom Copilots. The new feature is called Shields and is designed to protect AI chatbots from two different types of attacks.

The first type of attack is known as direct attack ή jailbreak. In this scenario, the person using the chatbot writes a prompt designed to manipulate the AI ​​into doing something against its rules and constraints. For example, one can write a prompt with words- or phrases such as “ignore previous instructions” or “system bypass” to intentionally bypass security measures.

The second type of attack is called indirect attack (indirect ) or direct cross-domain communication injection attack (cross-domain prompt injection attack). Here, a malicious user sends information to the chatbot with the intention of carrying out some kind of cyber attack. It typically uses external data, such as an email or document, with instructions designed to exploit the chatbot.

Like other forms of malware, indirect attacks may seem simple or innocent instructions to the user, but they carry specific risks. A compromised Copilot built through Azure AI could be vulnerable to fraud, malware distribution or content manipulation if it is able to process data, either on its own or with the help of extensions, according to Microsoft.

To try to prevent both direct and indirect attacks against AI chatbots, the new Prompt Shields feature will be integrated with the content filters in the Azure OpenAI Service. Using machine learning, the feature will try to find and eliminate potential threats in user prompts and third-party data.

Prompt Shields is currently available in preview mode for Azure AI Safety, and will soon be available in Azure AI Studio starting April 1st.

Microsoft today deployed another weapon in the war against AI manipulation: spotlighting, techniques designed to help AI models better distinguish valid AI prompts from those that are potentially dangerous or unreliable.

iGuRu.gr The Best Technology Site in Greecefgns

every publication, directly to your inbox

Join the 2.087 registrants.

Written by giorgos

George still wonders what he's doing here ...

Leave a reply

Your email address is not published. Required fields are mentioned with *

Your message will not be published if:
1. Contains insulting, defamatory, racist, offensive or inappropriate comments.
2. Causes harm to minors.
3. It interferes with the privacy and individual and social rights of other users.
4. Advertises products or services or websites.
5. Contains personal information (address, phone, etc.).