Version: 3.17

Concurrency Limiting (limit-conn)

Description

The limit-conn Plugin limits the rate of requests by the number of concurrent connections. Requests exceeding the threshold will be delayed or rejected based on the configuration, ensuring controlled resource usage and preventing overload.

Attributes

| Name | Type | Required | Default | Valid values | Description |
|------|------|----------|---------|--------------|-------------|
| conn | integer, string | False | | integer > 0, or lua-resty-expr | The maximum number of concurrent requests allowed. Requests exceeding the configured limit and below conn + burst will be delayed. Required if rules is not configured. |
| burst | integer, string | False | | integer >= 0, or lua-resty-expr | The number of excessive concurrent requests allowed to be delayed. Requests exceeding conn + burst will be rejected immediately. Required if rules is not configured. |
| default_conn_delay | number | True | | > 0 | Processing latency allowed in seconds for concurrent requests exceeding conn and up to conn + burst, which can be dynamically adjusted based on the only_use_default_delay setting. |
| only_use_default_delay | boolean | False | false | | If false, delay requests proportionally based on how much they exceed the conn limit. The delay grows larger as congestion increases. For instance, with conn being 5, burst being 3, and default_conn_delay being 1, 6 concurrent requests would result in a 1-second delay, 7 requests a 2-second delay, 8 requests a 3-second delay, and so on, until the total limit of conn + burst is reached, beyond which requests are rejected. If true, use default_conn_delay to delay all excessive requests within the burst range. Requests beyond conn + burst are rejected immediately. For instance, with conn being 5, burst being 3, and default_conn_delay being 1, 6, 7, or 8 concurrent requests are all delayed by exactly 1 second each. |
| key_type | string | False | var | [var, var_combination] | The type of key. If the key_type is var, the key is interpreted as a variable. If the key_type is var_combination, the key is interpreted as a combination of variables. |
| key | string | False | remote_addr | | The key to count requests by. If the key_type is var, the key is interpreted as a variable. The variable does not need to be prefixed by a dollar sign ($). If the key_type is var_combination, the key is interpreted as a combination of variables. All variables should be prefixed by dollar signs ($). For example, to configure the key to use a combination of two request headers custom-a and custom-b, the key should be configured as $http_custom_a $http_custom_b. Required if rules is not configured. |
| rejected_code | integer | False | 503 | [200, ..., 599] | The HTTP status code returned when a request is rejected for exceeding the threshold. |
| rejected_msg | string | False | | non-empty | The response body returned when a request is rejected for exceeding the threshold. |
| allow_degradation | boolean | False | false | | If true, allow APISIX to continue handling requests without the Plugin when the Plugin or its dependencies become unavailable. |
| policy | string | False | local | [local, redis, redis-cluster] | The policy for the rate limiting counter. If it is local, the counter is stored in memory locally. If it is redis, the counter is stored on a Redis instance. If it is redis-cluster, the counter is stored in a Redis cluster. |
| redis_host | string | False | | | The address of the Redis node. Required when policy is redis. |
| redis_port | integer | False | 6379 | >= 1 | The port of the Redis node when policy is redis. |
| redis_username | string | False | | | The username for Redis if Redis ACL is used. If you use the legacy authentication method requirepass, configure only the redis_password. Used when policy is redis. |
| redis_password | string | False | | | The password of the Redis node when policy is redis or redis-cluster. |
| redis_ssl | boolean | False | false | | If true, use SSL to connect to Redis when policy is redis. |
| redis_ssl_verify | boolean | False | false | | If true, verify the server SSL certificate when policy is redis. |
| redis_database | integer | False | 0 | >= 0 | The database number in Redis when policy is redis. |
| redis_timeout | integer | False | 1000 | >= 1 | The Redis timeout value in milliseconds when policy is redis or redis-cluster. |
| redis_keepalive_timeout | integer | False | 10000 | >= 1000 | Keepalive timeout in milliseconds for Redis when policy is redis or redis-cluster. |
| redis_keepalive_pool | integer | False | 100 | >= 1 | Keepalive pool size for Redis when policy is redis or redis-cluster. |
| key_ttl | integer | False | 3600 | | The TTL of the Redis key in seconds. Used when policy is redis or redis-cluster. |
| redis_cluster_nodes | array[string] | False | | | The list of Redis cluster nodes with at least one address. Required when policy is redis-cluster. |
| redis_cluster_name | string | False | | | The name of the Redis cluster. Required when policy is redis-cluster. |
| redis_cluster_ssl | boolean | False | false | | If true, use SSL to connect to Redis cluster when policy is redis-cluster. |
| redis_cluster_ssl_verify | boolean | False | false | | If true, verify the server SSL certificate when policy is redis-cluster. |
| rules | array[object] | False | | | An array of rate-limiting rules that are applied sequentially. Available in APISIX from 3.16.0. You should configure one of the following parameter sets, but not both: either (conn, burst, default_conn_delay, key) or (rules, default_conn_delay). |
| rules.conn | integer or string | True | | > 0 or lua-resty-expr | The maximum number of concurrent requests allowed. Requests exceeding the configured limit and below conn + burst will be delayed. This parameter also supports the string data type and allows the use of built-in variables prefixed with a dollar sign ($). |
| rules.burst | integer or string | True | | >= 0 or lua-resty-expr | The number of excessive concurrent requests allowed to be delayed. Requests exceeding conn + burst will be rejected immediately. This parameter also supports the string data type and allows the use of built-in variables prefixed with a dollar sign ($). |
| rules.key | string | True | | | The key to count requests by. If the configured key does not exist, the rule will not be executed. The key is interpreted as a combination of variables. All variables should be prefixed by dollar signs ($). |
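
The delay behavior described for only_use_default_delay can be illustrated with a small shell sketch. It only reproduces the arithmetic of the worked example in the table (conn 5, burst 3, default_conn_delay 1) and is not the Plugin's implementation. The function name conn_delay is hypothetical, the linear growth between the listed data points is an assumption, and default_conn_delay is treated as an integer because shell arithmetic has no fractions.

```shell
#!/usr/bin/env bash
# Sketch of the delay rule as described in the attribute table.
# Arguments: concurrent conn burst default_conn_delay only_use_default_delay
# Prints the delay in seconds, or "rejected" beyond conn + burst.
conn_delay() {
  local concurrent=$1 conn=$2 burst=$3 default_delay=$4 only_default=$5
  if (( concurrent <= conn )); then
    echo 0                                   # within the limit: no delay
  elif (( concurrent > conn + burst )); then
    echo rejected                            # beyond conn + burst
  elif [[ $only_default == true ]]; then
    echo "$default_delay"                    # fixed delay in the burst range
  else
    echo $(( (concurrent - conn) * default_delay ))  # grows with the excess
  fi
}

for n in 5 6 7 8 9; do
  echo "$n concurrent: proportional=$(conn_delay "$n" 5 3 1 false) fixed=$(conn_delay "$n" 5 3 1 true)"
done
```

With these inputs, the proportional column steps through 1, 2, and 3 seconds for 6, 7, and 8 concurrent requests, while the fixed column stays at 1 second. Both report a rejection at 9.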

Examples

The examples below demonstrate how you can configure limit-conn in different scenarios.

Note: You can fetch the admin_key from config.yaml and save it to an environment variable with the following command:

admin_key=$(yq '.deployment.admin.admin_key[0].key' conf/config.yaml | sed 's/"//g')

Apply Rate Limiting by Remote Address

The following example demonstrates how to use limit-conn to rate limit requests by remote_addr, with example connection and burst thresholds.

Create a Route with the limit-conn Plugin using the following settings:

conn: allow 2 concurrent requests.

burst: allow 1 excessive concurrent request.

default_conn_delay: allow 0.1 seconds of processing latency for concurrent requests between conn and conn + burst.

key_type: set to var to interpret key as a variable.

key: calculate rate limiting count by request's remote_addr.

policy: use the local counter in memory.

rejected_code: set the rejection status code to 429.
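
The settings listed above could be applied with an Admin API request like the following sketch. The Plugin values come from the list. The Admin API address 127.0.0.1:9180, the Route ID limit-conn-route, and the httpbin.org upstream are assumptions for illustration, so adjust them to your deployment.

```shell
# Assumes ${admin_key} is set as described in the note above.
# The Route ID and the httpbin.org upstream are hypothetical.
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
  -H "X-API-KEY: ${admin_key}" \
  -d '{
    "id": "limit-conn-route",
    "uri": "/get",
    "plugins": {
      "limit-conn": {
        "conn": 2,
        "burst": 1,
        "default_conn_delay": 0.1,
        "key_type": "var",
        "key": "remote_addr",
        "policy": "local",
        "rejected_code": 429
      }
    },
    "upstream": {
      "type": "roundrobin",
      "nodes": {
        "httpbin.org:80": 1
      }
    }
  }'
```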

Send five concurrent requests to the Route:

seq 1 5 | xargs -n1 -P5 bash -c 'curl -s -o /dev/null -w "Response: %{http_code}\n" "http://127.0.0.1:9080/get"'

You should see responses similar to the following, where excessive requests are rejected:

Response: 200
Response: 200
Response: 200
Response: 429
Response: 429

Apply Rate Limiting by Remote Address and Consumer Name

The following example demonstrates how to use limit-conn to rate limit requests by a combination of variables, remote_addr and consumer_name.

key-auth: enable key authentication on the Route.

key_type: set to var_combination to interpret the key as a combination of variables.

key: set to $remote_addr $consumer_name to apply rate limiting quota by remote address and Consumer.
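
A setup matching the settings above might look like the following sketch. The keys john-key and jane-key match the requests sent later in this example. The Admin API address, the Route ID, the httpbin.org upstream, and the conn, burst, and rejected_code values are assumptions chosen to be consistent with the responses shown below. Depending on your APISIX version, key-auth credentials may instead be managed through a dedicated credentials resource.

```shell
# Create two Consumers, each with its own key-auth key (assumed setup).
curl "http://127.0.0.1:9180/apisix/admin/consumers" -X PUT \
  -H "X-API-KEY: ${admin_key}" \
  -d '{"username": "john", "plugins": {"key-auth": {"key": "john-key"}}}'

curl "http://127.0.0.1:9180/apisix/admin/consumers" -X PUT \
  -H "X-API-KEY: ${admin_key}" \
  -d '{"username": "jane", "plugins": {"key-auth": {"key": "jane-key"}}}'

# Create a Route that authenticates with key-auth and limits concurrency
# per remote address and Consumer. Single quotes keep the shell from
# expanding $remote_addr and $consumer_name.
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
  -H "X-API-KEY: ${admin_key}" \
  -d '{
    "id": "limit-conn-route",
    "uri": "/get",
    "plugins": {
      "key-auth": {},
      "limit-conn": {
        "conn": 2,
        "burst": 1,
        "default_conn_delay": 0.1,
        "rejected_code": 429,
        "key_type": "var_combination",
        "key": "$remote_addr $consumer_name"
      }
    },
    "upstream": {
      "type": "roundrobin",
      "nodes": {
        "httpbin.org:80": 1
      }
    }
  }'
```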

Send five concurrent requests as the Consumer john:

seq 1 5 | xargs -n1 -P5 bash -c 'curl -s -o /dev/null -w "Response: %{http_code}\n" "http://127.0.0.1:9080/get" -H "apikey: john-key"'

You should see responses similar to the following, where excessive requests are rejected:

Response: 200
Response: 200
Response: 200
Response: 429
Response: 429

Immediately send five concurrent requests as the Consumer jane:

seq 1 5 | xargs -n1 -P5 bash -c 'curl -s -o /dev/null -w "Response: %{http_code}\n" "http://127.0.0.1:9080/get" -H "apikey: jane-key"'

You should also see responses similar to the following, where excessive requests are rejected:

Response: 200
Response: 200
Response: 200
Response: 429
Response: 429

In this case, the Plugin rate limits by the combination of variables remote_addr and consumer_name, which means each Consumer's quota is independent.

Rate Limit WebSocket Connections

The following example demonstrates how you can use the limit-conn Plugin to limit the number of concurrent WebSocket connections.

Start a sample upstream WebSocket server:

docker run -d \
-p 8080:8080 \
--name websocket-server \
--network=apisix-quickstart-net \
jmalloc/echo-server

The server has a WebSocket endpoint at /.ws that echoes back any message received.

Create a Route to the server WebSocket endpoint and enable WebSocket for the Route:
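
A minimal sketch of such a Route is shown below. It assumes APISIX runs in the same Docker network as the websocket-server container, so the upstream can be addressed by container name. The Route ID is hypothetical, and the conn, burst, and rejected_code values are assumptions chosen so that a fourth concurrent connection is rejected with 429, as described later in this example.

```shell
# enable_websocket turns on WebSocket proxying for the Route.
# The Route ID and the limit values are assumptions for illustration.
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
  -H "X-API-KEY: ${admin_key}" \
  -d '{
    "id": "ws-route",
    "uri": "/.ws",
    "enable_websocket": true,
    "plugins": {
      "limit-conn": {
        "conn": 2,
        "burst": 1,
        "default_conn_delay": 0.1,
        "key_type": "var",
        "key": "remote_addr",
        "rejected_code": 429
      }
    },
    "upstream": {
      "type": "roundrobin",
      "nodes": {
        "websocket-server:8080": 1
      }
    }
  }'
```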

Install a WebSocket client, such as websocat, if you have not already. Establish a connection with the WebSocket server through the Route:

websocat "ws://127.0.0.1:9080/.ws"

Send a "hello" message in the terminal. You should see the WebSocket server echo back the same message:

Request served by 1cd244052136
hello
hello

Open three more terminal sessions and run:

websocat "ws://127.0.0.1:9080/.ws"

You should see the last terminal session print 429 Too Many Requests when it tries to establish a WebSocket connection, because the concurrent connection limit has been reached.

Share Quota Among APISIX Nodes with a Redis Server

The following example demonstrates the rate limiting of requests across multiple APISIX nodes with a Redis server, such that different APISIX nodes share the same rate limiting quota.

On each APISIX instance, create a Route with the following configurations. Adjust the configuration details accordingly.

policy: set to redis to use a Redis instance for rate limiting.

redis_host: set to Redis instance IP address.

redis_port: set to Redis instance listening port.

redis_password: set to the password of the Redis instance, if any.

redis_database: set to the database number in the Redis instance.
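
The following sketch shows how these settings might be combined. The Redis address, password, and database number are placeholders, and the Route ID and httpbin.org upstream are hypothetical. The conn and burst values of 1 are assumptions chosen to be consistent with the responses shown below.

```shell
# Run against the Admin API of each APISIX instance.
# Replace the Redis placeholders with your own values.
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
  -H "X-API-KEY: ${admin_key}" \
  -d '{
    "id": "limit-conn-route",
    "uri": "/get",
    "plugins": {
      "limit-conn": {
        "conn": 1,
        "burst": 1,
        "default_conn_delay": 0.1,
        "rejected_code": 429,
        "key_type": "var",
        "key": "remote_addr",
        "policy": "redis",
        "redis_host": "192.168.xxx.xxx",
        "redis_port": 6379,
        "redis_password": "your-redis-password",
        "redis_database": 1
      }
    },
    "upstream": {
      "type": "roundrobin",
      "nodes": {
        "httpbin.org:80": 1
      }
    }
  }'
```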

Send five concurrent requests to the Route:

seq 1 5 | xargs -n1 -P5 bash -c 'curl -s -o /dev/null -w "Response: %{http_code}\n" "http://127.0.0.1:9080/get"'

You should see responses similar to the following, where excessive requests are rejected:

Response: 200
Response: 200
Response: 429
Response: 429
Response: 429

This shows that the two Routes configured in different APISIX instances share the same quota.

Share Quota Among APISIX Nodes with a Redis Cluster

You can also use a Redis cluster to apply the same quota across multiple APISIX nodes, such that different APISIX nodes share the same rate limiting quota.

Ensure that your Redis instances are running in cluster mode. Configure redis_cluster_name and one or more node addresses in redis_cluster_nodes for the limit-conn Plugin.

On each APISIX instance, create a Route with the following configurations. Adjust the configuration details accordingly.

policy: set to redis-cluster to use a Redis cluster for rate limiting.

redis_cluster_nodes: set to Redis node addresses in the Redis cluster.

redis_password: set to the password of the Redis cluster, if any.

redis_cluster_name: set to the Redis cluster name.

redis_cluster_ssl: enable SSL/TLS communication with Redis cluster.
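
A Route using these settings might look like the sketch below. The node addresses, password, and cluster name are placeholders, and the Route ID and httpbin.org upstream are hypothetical. The conn and burst values of 1 are assumptions chosen to be consistent with the responses shown below.

```shell
# Run against the Admin API of each APISIX instance.
# Replace the Redis cluster placeholders with your own values.
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
  -H "X-API-KEY: ${admin_key}" \
  -d '{
    "id": "limit-conn-route",
    "uri": "/get",
    "plugins": {
      "limit-conn": {
        "conn": 1,
        "burst": 1,
        "default_conn_delay": 0.1,
        "rejected_code": 429,
        "key_type": "var",
        "key": "remote_addr",
        "policy": "redis-cluster",
        "redis_cluster_nodes": [
          "192.168.xxx.xxx:6379",
          "192.168.xxx.xxx:6380"
        ],
        "redis_password": "your-redis-password",
        "redis_cluster_name": "redis-cluster-1",
        "redis_cluster_ssl": true
      }
    },
    "upstream": {
      "type": "roundrobin",
      "nodes": {
        "httpbin.org:80": 1
      }
    }
  }'
```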

Send five concurrent requests to the Route:

seq 1 5 | xargs -n1 -P5 bash -c 'curl -s -o /dev/null -w "Response: %{http_code}\n" "http://127.0.0.1:9080/get"'

You should see responses similar to the following, where excessive requests are rejected:

Response: 200
Response: 200
Response: 429
Response: 429
Response: 429

This shows that the two Routes configured in different APISIX instances share the same quota.

Rate Limit by Rules

The following example demonstrates how you can configure limit-conn to apply different rate-limiting rules based on request attributes. This feature is available from APISIX 3.16.0. In this example, rate limits are applied based on HTTP header values that represent the caller's access tier.

Note that all rules are applied sequentially. If a configured key does not exist, the corresponding rule will be skipped.

In addition to HTTP headers, you can also base rules on other built-in variables or NGINX variables to implement more flexible and fine-grained rate-limiting strategies.

Create a Route with the limit-conn Plugin that applies different rate limits based on request headers, allowing requests to be rate limited per subscription (X-Subscription-ID) and enforcing a stricter limit for trial users (X-Trial-ID):

❶ Use the value of the X-Subscription-ID request header as the rate-limiting key.

❷ Set the concurrent connection limit dynamically based on the X-Custom-Conn header. If the header is not provided, a default limit of 5 concurrent connections is applied.

❸ Use the value of the X-Trial-ID request header as the rate-limiting key.
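
The rules described above might be expressed as in the following sketch. Treat the variable syntax as an assumption: the ${...} form and the ?? fallback used for the default of 5 should be verified against the rules documentation for your APISIX version. The Route ID and the httpbin.org upstream are hypothetical. The burst of 1 and the trial limit of 1 follow the responses described below, and the rejected_code of 429 matches them.

```shell
# Single quotes keep the shell from expanding the ${...} expressions,
# so they reach APISIX unchanged.
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
  -H "X-API-KEY: ${admin_key}" \
  -d '{
    "id": "limit-conn-rules-route",
    "uri": "/get",
    "plugins": {
      "limit-conn": {
        "default_conn_delay": 0.1,
        "rejected_code": 429,
        "policy": "local",
        "rules": [
          {
            "key": "${http_x_subscription_id}",
            "conn": "${http_x_custom_conn ?? 5}",
            "burst": 1
          },
          {
            "key": "${http_x_trial_id}",
            "conn": 1,
            "burst": 1
          }
        ]
      }
    },
    "upstream": {
      "type": "roundrobin",
      "nodes": {
        "httpbin.org:80": 1
      }
    }
  }'
```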

To verify rate limiting, send 7 concurrent requests to the Route with the same subscription ID:

seq 1 7 | xargs -n1 -P7 bash -c 'curl -s -o /dev/null -w "Response: %{http_code}\n" "http://127.0.0.1:9080/get" -H "X-Subscription-ID: sub-123456789"'

You should see the following response, which shows that the default concurrent connection limit of 5 with a burst of 1 is applied when the X-Custom-Conn header is not provided:

Response: 429
Response: 200
Response: 200
Response: 200
Response: 200
Response: 200
Response: 200

Send 5 concurrent requests to the Route with the same subscription ID and set the X-Custom-Conn header to 1:

seq 1 5 | xargs -n1 -P5 bash -c 'curl -s -o /dev/null -w "Response: %{http_code}\n" "http://127.0.0.1:9080/get" -H "X-Subscription-ID: sub-123456789" -H "X-Custom-Conn: 1"'

You should see the following response, which shows that the concurrent connection limit of 1 with a burst of 1 is applied:

Response: 429
Response: 429
Response: 429
Response: 200
Response: 200

Finally, send 5 concurrent requests to the Route with the X-Trial-ID header:

seq 1 5 | xargs -n1 -P5 bash -c 'curl -s -o /dev/null -w "Response: %{http_code}\n" "http://127.0.0.1:9080/get" -H "X-Trial-ID: trial-123456789"'

You should see the following response, which shows that the concurrent connection limit of 1 with a burst of 1 is applied:

Response: 429
Response: 429
Response: 429
Response: 200
Response: 200