Version: 3.17

ai-prompt-guard

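Description

The ai-prompt-guard plugin protects your LLM endpoints by inspecting and validating incoming prompt messages. It checks request content against user-defined allow and deny patterns so that only approved input is forwarded to the upstream LLM. Depending on its configuration, the plugin checks either the latest message only or the entire conversation history, and checks prompts from either all roles or the end user only.

When both allow_patterns and deny_patterns are configured, the plugin first verifies that at least one of the allow_patterns matches. If none matches, the request is rejected. If an allow pattern matches, the plugin then checks the deny patterns and rejects the request if any of them matches.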
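The evaluation order described above can be sketched in shell. This is a minimal illustration, not the plugin's implementation: `check_prompt` is a hypothetical helper, `grep -E` stands in for the plugin's regex engine, and the two patterns are POSIX ERE rewrites of the dollar-amount and phone-number patterns used in the examples on this page.

```shell
# Illustrative patterns (assumptions): a dollar amount and a US phone number.
ALLOW='\$?\(?[0-9]{1,3}(,[0-9]{3})*(\.[0-9]{1,2})?\)?'
DENY='(\([0-9]{3}\)|[0-9]{3}-)[0-9]{3}-[0-9]{4}'

# check_prompt TEXT: hypothetical helper mirroring the documented order.
check_prompt() {
  # Step 1: at least one allow pattern must match.
  if ! printf '%s' "$1" | grep -Eq -- "$ALLOW"; then
    echo "Request doesn't match allow patterns"
    return 1
  fi
  # Step 2: no deny pattern may match.
  if printf '%s' "$1" | grep -Eq -- "$DENY"; then
    echo "Request contains prohibited content"
    return 1
  fi
  echo "OK"
}
```

Calling `check_prompt` with a prompt that contains both a price and a phone number reports prohibited content, because the deny check still runs after the allow check passes.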

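Plugin Attributes

| Name | Type | Required | Default | Valid values | Description |
|------|------|----------|---------|--------------|-------------|
| match_all_roles | boolean | | false | | If true, validate messages from all roles. If false, validate only messages from the user role. |
| match_all_conversation_history | boolean | | false | | If true, concatenate and check all messages in the conversation history. If false, check only the content of the last message. |
| allow_patterns | array | | [] | | Array of regular expression patterns that messages should match. When configured, a message must match at least one pattern to be considered valid. |
| deny_patterns | array | | [] | | Array of regular expression patterns that messages should not match. If a message matches any pattern, the request is rejected. If both allow_patterns and deny_patterns are configured, the plugin first verifies that at least one of the allow_patterns matches. |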

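Examples

The following examples use OpenAI as the upstream service provider. Before proceeding, create an OpenAI account and an API key. You can optionally save the key to an environment variable: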

export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>

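If you use a different LLM provider, refer to that provider's documentation to obtain an API key.

Implement Allow and Deny Patterns

The following example demonstrates how to use the ai-prompt-guard plugin to validate user prompts by defining allow and deny patterns, and shows how the allow patterns take precedence.

Define the allow and deny patterns. You can optionally save them to environment variables to simplify escaping: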

# Allow US dollar amounts
export ALLOW_PATTERN_1='\\$?\\(?\\d{1,3}(,\\d{3})*(\\.\\d{1,2})?\\)?'
# Deny US phone number formats
export DENY_PATTERN_1='(\\([0-9]{3}\\)|[0-9]{3}-)[0-9]{3}-[0-9]{4}'
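The doubled backslashes matter once a pattern is embedded in a JSON string, where `\\` decodes to a single `\`. As a quick check, and assuming the pattern ends up inside a JSON string in the plugin configuration, the following sketch prints the regular expression that would result after JSON decoding. The `sed` call only imitates how JSON treats `\\`; it is not a full JSON decoder, and `effective_pattern` is a name chosen for illustration.

```shell
DENY_PATTERN_1='(\\([0-9]{3}\\)|[0-9]{3}-)[0-9]{3}-[0-9]{4}'

# Collapse each doubled backslash to a single one, as JSON decoding would.
effective_pattern=$(printf '%s\n' "$DENY_PATTERN_1" | sed 's/\\\\/\\/g')
printf '%s\n' "$effective_pattern"
```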
Note

You can fetch the admin_key from config.yaml and save it to an environment variable with the following command:

admin_key=$(yq '.deployment.admin.admin_key[0].key' conf/config.yaml | sed 's/"//g')
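The requests in this example need a route with the plugin enabled. A minimal sketch follows, assuming the Admin API listens on the default `127.0.0.1:9180`, that the `ai-proxy` plugin forwards to OpenAI with the `gpt-4` model, and that the route ID `ai-prompt-guard-route` is unused. The helper names `route_payload` and `create_route` are hypothetical, and the `ai-proxy` attributes should be checked against your APISIX version.

```shell
# route_payload: hypothetical helper that prints the route configuration.
# Expects OPENAI_API_KEY, ALLOW_PATTERN_1, and DENY_PATTERN_1 to be set.
route_payload() {
  cat <<EOF
{
  "uri": "/anything",
  "methods": ["POST"],
  "plugins": {
    "ai-proxy": {
      "provider": "openai",
      "auth": {
        "header": { "Authorization": "Bearer ${OPENAI_API_KEY}" }
      },
      "options": { "model": "gpt-4" }
    },
    "ai-prompt-guard": {
      "allow_patterns": ["${ALLOW_PATTERN_1}"],
      "deny_patterns": ["${DENY_PATTERN_1}"]
    }
  }
}
EOF
}

# create_route: hypothetical helper that sends the payload to the Admin API.
# Assumes the default Admin API address and that admin_key is set.
create_route() {
  route_payload | curl -sS "http://127.0.0.1:9180/apisix/admin/routes/ai-prompt-guard-route" \
    -X PUT \
    -H "X-API-KEY: ${admin_key}" \
    -H "Content-Type: application/json" \
    -d @-
}
```

Run `route_payload` on its own to inspect the JSON first, then `create_route` to apply it. Because the patterns are expanded inside JSON strings, their doubled backslashes decode to the single backslashes the regular expressions need.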

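With a route at /anything that has ai-prompt-guard configured with these patterns, send a request that asks whether a purchase is fairly priced: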

curl -i "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "system", "content": "Rate if the purchase is at a decent price in USD." },
{ "role": "user", "content": "John paid $12.5 for a hot brewed coffee in El Paso." }
]
}'

You should receive an HTTP/1.1 200 OK response similar to the following:

{
...
"model": "gpt-4-0613",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The purchase is not at a decent price. Typically, a hot brewed coffee costs anywhere from $1 to $3 in most places in the US, so $12.5 is quite expensive.",
"refusal": null
},
"logprobs": null,
"finish_reason": "stop"
}
],
...
}

Send another request that does not contain any price:

curl -i "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "system", "content": "Rate if the purchase is at a decent price in USD." },
{ "role": "user", "content": "John paid a bit for a hot brewed coffee in El Paso." }
]
}'

You should receive an HTTP/1.1 400 Bad Request response with the following message:

{"message":"Request doesn't match allow patterns"}

Send a third request that contains a phone number:

curl -i "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "system", "content": "Rate if the purchase is at a decent price in USD." },
{ "role": "user", "content": "John (647-200-9393) paid $12.5 for a hot brewed coffee in El Paso." }
]
}'

You should receive an HTTP/1.1 400 Bad Request response with the following message:

{"message":"Request contains prohibited content"}

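By default, the plugin checks only input from the user role, and only the last message. For example, if you send a request whose system prompt contains prohibited content: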

curl -i "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "system", "content": "Rate if the purchase from 647-200-9393 is at a decent price in USD." },
{ "role": "user", "content": "John paid $12.5 for a hot brewed coffee in El Paso." }
]
}'

You will receive an HTTP/1.1 200 OK response.

If you send a request that contains prohibited content in the second-to-last message:

curl -i "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "system", "content": "Rate if the purchase is at a decent price in USD." },
{ "role": "user", "content": "Customer John contact: 647-200-9393" },
{ "role": "user", "content": "John paid $12.5 for a hot brewed coffee in El Paso." }
]
}'

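You will also receive an HTTP/1.1 200 OK response.

See the next example for how to check all roles and all messages.

Validate Messages from All Roles and Conversation History

The following example demonstrates how to use the ai-prompt-guard plugin to validate prompts from all roles, such as system and user, and to validate the entire conversation history instead of only the last message.

Define the allow and deny patterns. You can optionally save them to environment variables to simplify escaping: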

export ALLOW_PATTERN_1='\\$?\\(?\\d{1,3}(,\\d{3})*(\\.\\d{1,2})?\\)?'
export DENY_PATTERN_1='(\\([0-9]{3}\\)|[0-9]{3}-)[0-9]{3}-[0-9]{4}'
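For this example the route needs both match_all_roles and match_all_conversation_history set to true. A minimal sketch follows, under the same assumptions as any Admin API call on a default installation: the Admin API at `127.0.0.1:9180`, an `ai-proxy` plugin pointing at OpenAI with the `gpt-4` model, and `admin_key` already set. The helper names `guard_all_payload` and `create_guard_all_route` and the route ID `ai-prompt-guard-route` are hypothetical.

```shell
# guard_all_payload: hypothetical helper that prints a route configuration
# with checks enabled for every role and the whole conversation history.
guard_all_payload() {
  cat <<EOF
{
  "uri": "/anything",
  "methods": ["POST"],
  "plugins": {
    "ai-proxy": {
      "provider": "openai",
      "auth": {
        "header": { "Authorization": "Bearer ${OPENAI_API_KEY}" }
      },
      "options": { "model": "gpt-4" }
    },
    "ai-prompt-guard": {
      "match_all_roles": true,
      "match_all_conversation_history": true,
      "allow_patterns": ["${ALLOW_PATTERN_1}"],
      "deny_patterns": ["${DENY_PATTERN_1}"]
    }
  }
}
EOF
}

# create_guard_all_route: hypothetical helper that applies the payload.
create_guard_all_route() {
  guard_all_payload | curl -sS "http://127.0.0.1:9180/apisix/admin/routes/ai-prompt-guard-route" \
    -X PUT \
    -H "X-API-KEY: ${admin_key}" \
    -H "Content-Type: application/json" \
    -d @-
}
```

Running `create_guard_all_route` replaces any existing route with the same ID, so the requests that follow are evaluated with both flags enabled.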

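With match_all_roles and match_all_conversation_history both set to true on the route, send a request whose system prompt contains prohibited content: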

curl -i "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "system", "content": "Rate if the purchase from 647-200-9393 is at a decent price in USD." },
{ "role": "user", "content": "John paid $12.5 for a hot brewed coffee in El Paso." }
]
}'

You should receive an HTTP/1.1 400 Bad Request response with the following message:

{"message":"Request contains prohibited content"}

Send a request with multiple messages from the same role, one of which contains prohibited content:

curl -i "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "system", "content": "Rate if the purchase is at a decent price in USD." },
{ "role": "user", "content": "Customer John contact: 647-200-9393" },
{ "role": "user", "content": "John paid $12.5 for a hot brewed coffee in El Paso." }
]
}'

You should receive an HTTP/1.1 400 Bad Request response with the following message:

{"message":"Request contains prohibited content"}

Send a request that conforms to the patterns:

curl -i "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "system", "content": "Rate if the purchase is at a decent price in USD." },
{ "role": "system", "content": "The purchase is made in El Paso." },
{ "role": "user", "content": "Customer John contact: xxx-xxx-xxxx" },
{ "role": "user", "content": "John paid $12.5 for a hot brewed coffee." }
]
}'

You should receive an HTTP/1.1 200 OK response similar to the following:

{
...,
"model": "gpt-4-0613",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "$12.5 is generally considered quite expensive for a cup of brew coffee.",
"refusal": null
},
"logprobs": null,
"finish_reason": "stop"
}
],
...
}