

An API gateway is a server that sits between clients and backend services, acting as the single entry point for all API traffic. It accepts incoming requests, applies policies such as authentication, rate limiting, and transformation, then routes each request to the appropriate upstream service and returns the response to the caller.

In practice, an API gateway consolidates cross-cutting concerns that would otherwise be duplicated across every microservice: access control, traffic shaping, observability, and protocol translation. Instead of embedding this logic in each service, teams centralize it at the gateway layer, reducing code duplication, simplifying deployments, and giving platform teams a single control plane for governing API behavior at scale.

How Does an API Gateway Work?#

The request lifecycle through an API gateway follows a well-defined pipeline:

  1. Client sends a request. A mobile app, browser, or upstream service issues an HTTP/HTTPS request to the gateway's public endpoint. The client never communicates directly with individual backend services.

  2. Gateway evaluates policies. The gateway inspects the incoming request and runs it through a chain of plugins or middleware. This typically includes validating authentication tokens (JWT, OAuth 2.0, API keys), enforcing rate limits, checking IP allowlists, and applying request transformations such as header injection or body rewriting.

  3. Gateway routes to the upstream. Based on the request path, host header, or other matching criteria, the gateway selects a target upstream service. If multiple instances are registered, the gateway applies a load-balancing algorithm (round-robin, least connections, consistent hashing) to pick a specific node.

  4. Backend processes the request. The upstream service handles the business logic and returns a response to the gateway.

  5. Gateway processes the response. Before forwarding the response to the client, the gateway can apply response transformations, inject CORS headers, compress the payload, or cache the result for subsequent identical requests.

  6. Gateway returns the response. The final response reaches the client with appropriate status codes, headers, and payload. Throughout this entire cycle, the gateway emits metrics, access logs, and traces that feed into observability systems.

This pipeline executes in milliseconds. High-performance gateways like Apache APISIX complete it in under 1ms of added latency, making the overhead negligible even for latency-sensitive workloads.
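The six steps above can be sketched as a small middleware pipeline. The plugin, route, and upstream names here are illustrative, not any particular gateway's API:

```python
def run_pipeline(request, plugins, route):
    """Toy version of the gateway request lifecycle.

    Each plugin may rewrite the request or short-circuit with an early
    response (step 2); `route` picks an upstream callable (step 3).
    """
    for plugin in plugins:                       # step 2: policy chain
        request, early = plugin(request)
        if early is not None:                    # e.g. a 401 or 429
            return early
    upstream = route(request)                    # step 3: routing
    response = upstream(request)                 # step 4: backend call
    response.setdefault("headers", {})["via"] = "gateway"  # step 5
    return response                              # step 6: back to client

def key_auth(request):
    """Illustrative auth plugin: reject requests without an API key."""
    if "api_key" not in request:
        return request, {"status": 401, "headers": {}}
    return request, None
```

A request missing its API key short-circuits with a 401 at step 2, before any upstream is ever contacted, which is exactly why centralizing these checks at the gateway pays off.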

Key Features of an API Gateway#

A production-grade API gateway provides a broad surface of capabilities. The following features represent the core functionality that distinguishes a gateway from a simple reverse proxy.

Request Routing#

The gateway matches incoming requests to upstream services based on URI paths, HTTP methods, headers, query parameters, or custom expressions. Advanced gateways support regex-based matching, wildcard routes, and priority-weighted rules. Apache APISIX supports radixtree-based routing that scales efficiently even with thousands of route entries.

Load Balancing#

Distributing traffic across service instances prevents hotspots and improves availability. Gateways typically support round-robin, weighted round-robin, least connections, consistent hashing, and EWMA (exponentially weighted moving average) algorithms. Health checks --- both active probes and passive failure detection --- automatically remove unhealthy nodes from the upstream pool.
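A minimal sketch of round-robin selection with passive health exclusion (the node addresses and bookkeeping are illustrative; real gateways combine this with active probes and automatic recovery):

```python
class RoundRobinPool:
    """Round-robin over healthy upstream nodes; nodes marked unhealthy
    are skipped until a health check restores them."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.unhealthy = set()
        self._i = 0

    def pick(self):
        healthy = [n for n in self.nodes if n not in self.unhealthy]
        if not healthy:
            raise RuntimeError("no healthy upstream nodes")
        node = healthy[self._i % len(healthy)]
        self._i += 1
        return node
```

Marking a node unhealthy (`pool.unhealthy.add(...)`) removes it from rotation on the very next pick, mirroring passive failure detection.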

Authentication and Authorization#

Centralizing identity verification at the gateway eliminates the need for each service to implement its own auth stack. Common mechanisms include JWT validation, OAuth 2.0 token introspection, HMAC signatures, LDAP, and API key authentication. Some gateways also integrate with external identity providers through OpenID Connect.

Rate Limiting#

Rate limiting protects backend services from traffic spikes, abusive clients, and cascading failures. Gateways enforce limits at multiple granularities: per consumer, per route, per IP, or globally. Apache APISIX provides configurable rate limiting plugins that support both fixed-window and leaky-bucket algorithms, with shared counters across gateway nodes via Redis.
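The fixed-window variant can be sketched in a few lines. In a clustered gateway the counters would live in a shared store such as Redis so every node sees the same totals; here they are in-process for illustration:

```python
from collections import defaultdict

class FixedWindowLimiter:
    """Fixed-window rate limiting, keyed per consumer."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.counters = defaultdict(int)

    def allow(self, consumer, now):
        window = int(now // self.window_seconds)   # which window are we in?
        key = (consumer, window)
        if self.counters[key] >= self.limit:
            return False                           # over quota: reply 429
        self.counters[key] += 1
        return True
```

Each consumer gets an independent quota, and the count resets when the clock crosses into the next window.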

Caching#

Response caching at the gateway layer reduces backend load and improves latency for read-heavy endpoints. Gateways cache responses based on configurable TTLs, cache keys (URI, headers, query strings), and invalidation rules. For APIs serving relatively static data --- product catalogs, configuration endpoints, reference data --- caching can reduce upstream requests by 80% or more.
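A bare-bones TTL cache keyed by method and URI shows the mechanism; a production gateway would also fold selected headers or query strings into the key and honor invalidation rules:

```python
class TTLCache:
    """Response cache keyed by (method, URI); entries expire after ttl."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._store = {}  # key -> (expires_at, response)

    def get(self, method, uri, now):
        entry = self._store.get((method, uri))
        if entry and entry[0] > now:
            return entry[1]            # cache hit: skip the upstream
        return None                    # miss or expired

    def put(self, method, uri, response, now):
        self._store[(method, uri)] = (now + self.ttl_seconds, response)
```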

Request and Response Transformation#

Gateways can rewrite requests before they reach the backend and transform responses before they reach the client. This includes header manipulation, body rewriting, protocol translation (HTTP to gRPC, REST to GraphQL), and payload format conversion. Transformation eliminates the need for adapter services and simplifies API versioning.
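Two of the most common request rewrites, header injection and path prefix stripping, look roughly like this (the field names are invented for the sketch):

```python
def transform_request(request, inject_headers=None, strip_prefix=None):
    """Return a copy of the request with headers injected and an
    optional path prefix removed before it reaches the backend."""
    headers = dict(request.get("headers", {}))
    headers.update(inject_headers or {})
    path = request["path"]
    if strip_prefix and path.startswith(strip_prefix):
        path = path[len(strip_prefix):] or "/"
    return {**request, "headers": headers, "path": path}
```

Stripping a version prefix like `/api/v1` at the gateway lets backends stay version-agnostic while clients keep a stable public URL.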

Monitoring and Observability#

A gateway sees every request, making it the natural instrumentation point for API metrics. Production gateways export access logs, request/response latencies (P50, P95, P99), error rates, and throughput to systems like Prometheus, Datadog, and OpenTelemetry collectors. Apache APISIX ships with built-in integrations for Prometheus, Grafana, SkyWalking, and Zipkin.

SSL/TLS Termination#

The gateway handles TLS handshakes, certificate management, and encryption offloading so that backend services can communicate over plain HTTP internally. This simplifies certificate rotation, centralizes security policy, and reduces CPU overhead on upstream services. Modern gateways also support mTLS for service-to-service authentication.

Circuit Breaking#

When a backend service becomes degraded or unresponsive, a circuit breaker at the gateway stops forwarding requests to it, preventing cascading failures across the system. After a configurable cooldown, the gateway sends probe requests to test recovery. This pattern is critical in microservices architectures where a single failing service can take down an entire request chain.
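The open/half-open cycle described above can be sketched as a small state machine (the thresholds are illustrative, not any gateway's defaults):

```python
class CircuitBreaker:
    """Opens after consecutive failures; after a cooldown, lets a
    probe request through to test recovery (half-open)."""

    def __init__(self, failure_threshold=3, cooldown_seconds=30.0):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.failures = 0
        self.opened_at = None  # None => circuit closed

    def allow_request(self, now):
        if self.opened_at is None:
            return True                       # closed: traffic flows
        # open: only allow a probe once the cooldown has elapsed
        return now - self.opened_at >= self.cooldown_seconds

    def record_success(self):
        self.failures = 0
        self.opened_at = None                 # close the circuit

    def record_failure(self, now):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now              # open the circuit
```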

API Versioning and Canary Releases#

Gateways can route a percentage of traffic to new service versions, enabling canary deployments and blue-green releases without infrastructure changes. Traffic-splitting rules let teams gradually shift load from v1 to v2, monitor error rates, and roll back instantly if metrics degrade.
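One common way to implement the split is deterministic hash bucketing, so the same consumer always lands on the same version and raising the canary percentage only adds consumers to v2, never flaps them back (this is one illustrative approach, not a specific gateway's algorithm):

```python
import hashlib

def choose_version(consumer_id, canary_percent):
    """Map a consumer into one of 100 buckets; buckets below the
    canary percentage go to v2, the rest stay on v1."""
    digest = hashlib.sha256(consumer_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return "v2" if bucket < canary_percent else "v1"
```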

API Gateway vs Load Balancer vs Reverse Proxy#

These three components overlap in functionality but serve different primary purposes:

| Capability | Reverse Proxy | Load Balancer | API Gateway |
| --- | --- | --- | --- |
| Request forwarding | Yes | Yes | Yes |
| SSL termination | Yes | Sometimes | Yes |
| Load balancing | Basic | Advanced | Advanced |
| Health checks | Limited | Yes | Yes |
| Authentication | No | No | Yes |
| Rate limiting | No | No | Yes |
| Request transformation | No | No | Yes |
| API-aware routing | No | No | Yes |
| Response caching | Yes | No | Yes |
| Observability/metrics | Basic | Basic | Comprehensive |
| Protocol translation | No | No | Yes |
| Plugin/middleware ecosystem | Limited | No | Extensive |

A reverse proxy (e.g., Nginx, HAProxy in proxy mode) forwards client requests to backend servers, provides SSL termination, and can cache static content. It operates at the HTTP level but lacks API-specific intelligence.

A load balancer (e.g., AWS ALB, HAProxy, Envoy in LB mode) distributes traffic across server instances using health checks and balancing algorithms. Layer 4 load balancers work at the TCP level; Layer 7 load balancers can inspect HTTP headers but still lack API-layer logic like authentication or transformation.

An API gateway builds on reverse proxy and load balancing capabilities but adds an API-aware policy layer: authentication, rate limiting, request/response transformation, observability, and developer portal integration. It is purpose-built for managing API traffic.

In practice, many organizations start with a reverse proxy or load balancer and later adopt an API gateway as their API surface grows. Some gateways, including Apache APISIX, are built on top of proven proxies (APISIX uses Nginx and OpenResty) and inherit their performance characteristics while adding the API management layer.

API Gateway Use Cases#

Microservices Architecture#

In a microservices system with dozens or hundreds of services, an API gateway provides the single entry point that abstracts internal service topology from external consumers. Clients interact with one stable endpoint; the gateway handles service discovery, routing, and cross-cutting concerns. Without a gateway, each client must know the location and protocol of every service, creating tight coupling and operational fragility.

Mobile and IoT Backends#

Mobile clients operate under bandwidth, latency, and battery constraints that differ significantly from desktop browsers. An API gateway can aggregate multiple backend calls into a single response (the Backend-for-Frontend pattern), compress payloads, and adapt protocols. For IoT devices that may use MQTT or CoAP, the gateway translates between device protocols and internal HTTP/gRPC services.

Multi-Cloud and Hybrid Deployments#

Organizations running services across AWS, GCP, Azure, and on-premises data centers use an API gateway as the unified traffic layer. The gateway abstracts the underlying infrastructure, enabling consistent routing, security policies, and observability regardless of where a service is deployed. This is especially valuable during cloud migration, where services move between environments incrementally.

API Monetization#

Companies that expose APIs as products --- payment processors, data providers, communication platforms --- use gateways to enforce usage tiers, track consumption per API key, and generate billing data. Rate limiting by tier, quota enforcement, and detailed usage analytics are all gateway responsibilities in this model.

Zero-Trust Security#

A gateway enforces authentication and authorization at the network edge, ensuring that no unauthenticated request reaches backend services. Combined with mTLS for internal traffic, IP allowlisting, and WAF integration, the gateway becomes a core component of a zero-trust architecture. It can also mask or redact sensitive fields in responses to prevent data leakage.
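Field masking at the response stage can be as simple as a recursive rewrite of the JSON body; the sensitive field names here are illustrative:

```python
def redact(payload, sensitive=("ssn", "card_number", "password")):
    """Recursively mask sensitive fields in a JSON-like response body
    before it leaves the gateway."""
    if isinstance(payload, dict):
        return {k: "***" if k in sensitive else redact(v, sensitive)
                for k, v in payload.items()}
    if isinstance(payload, list):
        return [redact(v, sensitive) for v in payload]
    return payload
```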

Legacy System Modernization#

When migrating from monolithic to microservices architectures, an API gateway acts as the facade in the strangler fig pattern. New services are deployed behind the gateway alongside the legacy monolith. The gateway gradually shifts traffic from old endpoints to new ones, allowing incremental migration without disrupting existing clients.

Benefits of Using an API Gateway#

Simplified Client Integration#

Clients interact with a single, well-documented endpoint instead of tracking the addresses and protocols of individual services. This reduces client-side complexity, eliminates service discovery logic in front-end code, and makes API consumption predictable.

Centralized Security#

Authentication, authorization, encryption, and threat detection are enforced at one layer rather than reimplemented in every service. A single policy change at the gateway propagates instantly across all APIs. This consistency eliminates the security gaps that emerge when individual teams implement auth differently.

Operational Visibility#

Because every request passes through the gateway, teams gain comprehensive metrics, access logs, and distributed traces without instrumenting each service individually. Dashboards built on gateway telemetry provide real-time visibility into traffic patterns, error rates, and latency distributions across the entire API surface.

Reduced Backend Load#

Caching, request deduplication, and rate limiting at the gateway layer prevent unnecessary calls from reaching backend services. This directly reduces infrastructure costs and improves system stability during traffic spikes. For read-heavy APIs, gateway caching alone can cut upstream load by an order of magnitude.

Faster Time to Market#

Developers focus on business logic rather than reimplementing cross-cutting concerns. Adding authentication to a new service takes a single plugin configuration at the gateway instead of weeks of development. Teams ship faster because infrastructure concerns are already solved.

Independent Scalability#

The gateway and backend services scale independently. During a traffic surge, teams can horizontally scale the gateway layer without modifying any backend service. Conversely, backend services can be scaled, redeployed, or replaced without any client-facing changes.

How Apache APISIX Works as an API Gateway#

Apache APISIX is a high-performance, cloud-native API gateway built on Nginx and LuaJIT. It is designed for environments where throughput, latency, and extensibility are critical requirements.

Performance at scale. APISIX handles over 18,000 requests per second per CPU core with a median latency of 0.2ms. This performance comes from its non-blocking, event-driven architecture and the efficiency of LuaJIT-compiled plugin execution. For comparison, this throughput exceeds most Java- and Go-based gateways by a significant margin.

Extensive plugin ecosystem. APISIX ships with over 100 built-in plugins covering authentication (JWT, OAuth, LDAP, OpenID Connect), traffic control (rate limiting, circuit breaking, traffic mirroring), observability (Prometheus, SkyWalking, OpenTelemetry), and transformation (gRPC transcoding, request rewriting, response rewriting). Plugins can also be written in Lua, Java, Go, Python, or WebAssembly.

Dynamic configuration. Unlike traditional gateways that require restarts for configuration changes, APISIX reloads routes, upstreams, and plugin configurations in real time through its Admin API. This enables zero-downtime deployments and makes APISIX well-suited for CI/CD pipelines and GitOps workflows.

Proven adoption. APISIX powers over 147,000 deployments across more than 5,200 companies globally, spanning industries from fintech to telecommunications. Its Apache Software Foundation governance ensures vendor-neutral, community-driven development.

To get started with APISIX, see the getting started guide.

Frequently Asked Questions#

What is the difference between an API gateway and a load balancer?#

A load balancer distributes incoming traffic across multiple server instances using algorithms like round-robin or least connections. It operates at the network or transport layer (L4) or HTTP layer (L7) but does not understand API semantics. An API gateway performs load balancing as one of many functions, and adds API-specific capabilities: authentication, rate limiting, request transformation, caching, and observability. If you only need to distribute traffic, a load balancer suffices. If you need to manage, secure, and observe API traffic, you need a gateway.

Do I need an API gateway for a monolithic application?#

An API gateway is not strictly required for a monolith, but it can still add value. If your monolith exposes APIs consumed by external clients, mobile apps, or third-party integrators, a gateway provides centralized authentication, rate limiting, and monitoring without modifying the application. It also positions your architecture for incremental migration to microservices using the strangler fig pattern.

How does an API gateway affect latency?#

A well-implemented gateway adds minimal latency --- typically 0.2ms to 2ms per request depending on the number of active plugins. High-performance gateways like Apache APISIX are optimized for sub-millisecond overhead. The latency tradeoff is almost always worthwhile: the gateway eliminates redundant auth checks, reduces backend calls through caching, and prevents cascading failures through circuit breaking, all of which improve overall system response times.

Can an API gateway replace a service mesh?#

An API gateway and a service mesh serve different layers. The gateway handles north-south traffic (external clients to internal services), while a service mesh manages east-west traffic (service-to-service communication within the cluster). They are complementary, not competing, technologies. Some organizations use APISIX as both a gateway and an ingress controller, bridging the two layers, but a full service mesh (Istio, Linkerd) addresses concerns like mutual TLS between services and fine-grained internal traffic policies that fall outside a gateway's scope.

Is an API gateway the same as an API management platform?#

No. An API gateway is the runtime component that processes API traffic. An API management platform is a broader category that typically includes a gateway, a developer portal, API documentation tools, lifecycle management, and analytics dashboards. The gateway is the engine; the management platform is the full vehicle. Apache APISIX provides the high-performance gateway layer, and organizations often pair it with additional tooling for the complete API management lifecycle.


gRPC is a high-performance, open-source remote procedure call (RPC) framework originally developed by Google. It uses Protocol Buffers for binary serialization and HTTP/2 for transport, enabling strongly typed service contracts, bidirectional streaming, and significantly smaller payload sizes compared to equivalent JSON over REST.

Why gRPC Exists#

REST has dominated API design for over fifteen years, and it remains an excellent choice for public-facing, resource-oriented APIs. However, as microservices architectures scaled into hundreds or thousands of inter-service calls per request, the limitations of REST became measurable: text-based JSON serialization consumes CPU cycles, HTTP/1.1 head-of-line blocking limits concurrency, and the lack of a formal contract language leads to integration drift.

Google developed gRPC internally (as Stubby) and open-sourced it in 2015. Adoption has grown steadily, and gRPC has become a common choice for latency-sensitive internal APIs in performance-critical systems.

How gRPC Works#

Protocol Buffers (Protobuf)#

Protocol Buffers are gRPC's interface definition language (IDL) and serialization format. A .proto file defines the service contract, including methods, request types, and response types:

```protobuf
syntax = "proto3";

service OrderService {
  rpc GetOrder (OrderRequest) returns (OrderResponse);
  rpc StreamOrders (OrderFilter) returns (stream OrderResponse);
}

message OrderRequest {
  string order_id = 1;
}

message OrderResponse {
  string order_id = 1;
  string status = 2;
  double total = 3;
}
```

The protoc compiler generates client and server code in many languages from this single definition. Binary serialization produces payloads that are substantially smaller than equivalent JSON representations. This size reduction directly translates to lower network bandwidth consumption and faster serialization/deserialization.
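To make the size difference concrete, the wire format for a single length-delimited field can be hand-rolled in a few lines. Real code uses the classes `protoc` generates; this sketch only shows the framing, and the function name is invented:

```python
def encode_len_delimited(field_number, data):
    """Encode one Protobuf length-delimited field: a varint tag,
    a varint length, then the raw bytes."""
    def varint(n):
        out = bytearray()
        while True:
            byte = n & 0x7F
            n >>= 7
            out.append(byte | (0x80 if n else 0))
            if n == 0:
                return bytes(out)
    tag = (field_number << 3) | 2   # wire type 2 = length-delimited
    return varint(tag) + varint(len(data)) + data

# OrderRequest{order_id: "A-1001"}: 2 bytes of framing + 6 payload bytes,
# versus 21 bytes for the JSON {"order_id":"A-1001"}
wire = encode_len_delimited(1, b"A-1001")
```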

HTTP/2 Transport#

gRPC runs exclusively on HTTP/2, which provides several performance advantages over HTTP/1.1:

  • Multiplexing. Multiple RPC calls share a single TCP connection without head-of-line blocking. A service making 50 concurrent calls to another service needs only one connection, not 50.
  • Header compression. HPACK compression significantly reduces header overhead for repeated headers.
  • Binary framing. HTTP/2 frames are binary, eliminating the text parsing overhead of HTTP/1.1.

These transport-level improvements compound with Protobuf serialization to deliver measurably lower latency in service-to-service communication.

Streaming Modes#

gRPC supports four communication patterns:

  1. Unary RPC. Single request, single response. Equivalent to a REST call.
  2. Server streaming. Client sends one request, server returns a stream of responses. Useful for real-time feeds or large result sets.
  3. Client streaming. Client sends a stream of messages, server returns one response. Useful for batched uploads or telemetry ingestion.
  4. Bidirectional streaming. Both sides send streams of messages concurrently. Useful for chat, collaborative editing, or real-time synchronization.

In practice, unary calls represent the majority of gRPC usage, with server streaming being the next most common pattern. The streaming capabilities differentiate gRPC from REST most sharply in real-time and high-throughput scenarios.

gRPC vs REST Comparison#

| Aspect | gRPC | REST |
| --- | --- | --- |
| Serialization | Protocol Buffers (binary) | JSON (text) |
| Transport | HTTP/2 only | HTTP/1.1 or HTTP/2 |
| Contract | .proto file (strict) | OpenAPI/Swagger (optional) |
| Streaming | Native (4 modes) | Limited (SSE, WebSocket) |
| Code generation | Built-in (protoc) | Third-party tools |
| Browser support | Requires proxy (gRPC-Web) | Native |
| Payload size | Significantly smaller | Baseline |
| Latency (typical) | Lower inter-service | Higher inter-service |
| Human readability | Binary (needs tooling) | JSON is human-readable |
| Caching | Not HTTP-cacheable by default | HTTP caching built-in |
| Tooling maturity | Growing | Extensive |

REST remains the dominant choice for public-facing APIs, while gRPC is increasingly preferred for internal microservices communication at larger organizations. The two protocols serve complementary roles rather than competing directly.

When to Use gRPC#

Use gRPC when:

  • Services communicate with high frequency and low latency requirements (trading systems, real-time analytics, game backends).
  • Payload efficiency matters because of bandwidth constraints or high message volumes.
  • Strong typing and contract-first development are priorities. The .proto file becomes the single source of truth.
  • Streaming is a core requirement (live data feeds, event-driven architectures, IoT telemetry).
  • Polyglot environments need consistent client/server code generation across multiple languages.

Stick with REST when:

  • The API is public-facing and must be browser-accessible without additional proxying.
  • Human readability and debuggability with standard HTTP tools (curl, Postman) are important for developer experience.
  • HTTP caching semantics are essential for performance.
  • The team's existing tooling and expertise are REST-centric, and migration cost outweighs the performance gain.

Many organizations adopting gRPC maintain REST for external APIs and use gRPC exclusively for internal communication, creating a dual-protocol architecture that leverages each protocol's strengths.

gRPC and API Gateways#

API gateways play a critical role in gRPC architectures by solving three problems: protocol translation, traffic management, and observability.

gRPC Proxying#

A gateway that natively supports HTTP/2 can proxy gRPC traffic directly, applying authentication, rate limiting, and logging without protocol translation. The gateway terminates the client's gRPC connection, applies policies, and forwards the call to the upstream gRPC service. This is the simplest integration model and preserves full gRPC semantics including streaming.

gRPC-Web Translation#

Browsers cannot make native gRPC calls because browser-based JavaScript does not expose the HTTP/2 framing layer required by gRPC. The gRPC-Web protocol bridges this gap: the browser sends gRPC-Web requests (HTTP/1.1 or HTTP/2 with modified framing), and the gateway translates them into native gRPC for the upstream service. This eliminates the need for a separate REST API layer for browser clients.

HTTP/JSON to gRPC Transcoding#

Many organizations need to expose gRPC services to clients that can only consume REST/JSON. An API gateway with transcoding capabilities automatically maps HTTP verbs and JSON payloads to gRPC methods and Protobuf messages based on annotations in the .proto file. This enables a single gRPC backend to serve both gRPC and REST clients without maintaining two codebases.

In practice, gRPC deployments behind an API gateway typically use a mix of pure gRPC proxying, gRPC-Web for browser access, and transcoding to serve REST clients.

How Apache APISIX Handles gRPC#

Apache APISIX provides native gRPC support across all three integration patterns described above.

Native gRPC Proxying#

APISIX proxies gRPC traffic natively over HTTP/2, supporting unary and streaming calls. Routes can be configured with gRPC-specific upstream settings, and the full plugin ecosystem applies to gRPC routes: authentication (JWT, key-auth), rate limiting, circuit breaking, and observability all work transparently on gRPC traffic.

gRPC-Web Support#

The grpc-web plugin enables browser clients to communicate with gRPC backends through APISIX. The plugin handles the protocol translation between gRPC-Web and native gRPC, allowing frontend teams to consume gRPC services directly without building a REST translation layer. This reduces the API surface area and eliminates a class of contract synchronization bugs.

HTTP/JSON to gRPC Transcoding#

The grpc-transcode plugin maps REST endpoints to gRPC methods using the Protobuf descriptor. After uploading the .proto file to APISIX, the plugin automatically exposes each gRPC method as an HTTP endpoint, translating JSON request bodies to Protobuf messages and Protobuf responses back to JSON. This is particularly valuable for organizations migrating from REST to gRPC incrementally, as existing REST clients continue working while backends are rewritten.

APISIX's gRPC performance is notable: internal benchmarks show gRPC proxying at approximately 15,000 RPS per CPU core with 0.3 milliseconds of added latency, comparable to its HTTP/1.1 proxying performance. The getting started guide includes gRPC configuration examples.

gRPC Best Practices#

  1. Version your .proto files carefully. Protobuf supports backward-compatible field additions, but removing or renaming fields breaks clients. Use reserved field numbers for deleted fields.

  2. Set deadlines on every RPC. Without a deadline, a hung upstream can hold client resources indefinitely. Missing or overly generous RPC deadlines are a common cause of cascading failures in distributed systems.

  3. Use load balancing at the connection level. Because HTTP/2 multiplexes many RPCs over one connection, TCP-level load balancing (L4) is insufficient. Use L7 load balancing or client-side balancing to distribute RPCs across backend instances.

  4. Implement health checking. gRPC defines a standard health checking protocol (grpc.health.v1.Health). Use it for readiness probes and load balancer health checks.

  5. Monitor per-method metrics. Track latency, error rate, and throughput per gRPC method, not just per service. A slow GetOrder method is invisible if aggregated with a fast ListOrders method.
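Practice 2 above, deadline propagation, follows a simple pattern: carry a shrinking time budget down the call chain so each hop gets only the time that remains. This is a sketch of the pattern, not the gRPC library's API:

```python
import time

class Deadline:
    """Per-request deadline budget, passed down the call chain."""

    def __init__(self, timeout_seconds, now=None):
        now = time.monotonic() if now is None else now
        self._expires_at = now + timeout_seconds

    def remaining(self, now=None):
        """Time left in the budget; an outbound call uses this as its
        timeout instead of a fresh, full timeout."""
        now = time.monotonic() if now is None else now
        return max(0.0, self._expires_at - now)

    def expired(self, now=None):
        return self.remaining(now) == 0.0
```

When the budget reaches zero, the request fails fast at the current hop instead of hanging while upstream callers have already given up.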

FAQ#

Can gRPC completely replace REST?#

Not in most architectures. gRPC excels at internal service-to-service communication where performance, type safety, and streaming matter. REST remains superior for public APIs due to native browser support, human-readable payloads, HTTP caching, and broader tooling familiarity. The most common pattern is gRPC internally with REST or GraphQL at the edge, using an API gateway for protocol translation.

How do I debug gRPC calls if the payloads are binary?#

Tools like grpcurl (a curl equivalent for gRPC), Postman (which now supports gRPC), and BloomRPC provide human-readable interaction with gRPC services. For production debugging, structured logging at the gateway layer that decodes Protobuf messages into JSON is the most effective approach. APISIX's logging plugins can capture gRPC request and response metadata for observability.

What is the performance difference between gRPC and REST in practice?#

In controlled benchmarks, gRPC typically delivers significantly higher throughput and lower latency than REST/JSON for equivalent workloads. The gains come from binary serialization (smaller payloads, faster encoding), HTTP/2 multiplexing (fewer connections, no head-of-line blocking), and code-generated clients (no reflection or manual parsing). The exact improvement depends on payload size, call frequency, and network conditions. Organizations migrating from REST to gRPC commonly report meaningful reductions in inter-service latency in production.

Does gRPC work with WebAssembly or edge computing?#

Yes. Protobuf serialization libraries exist for languages targeting WebAssembly, and gRPC-Web enables browser-based Wasm applications to call gRPC backends. For edge computing, gRPC's compact payloads and efficient serialization are advantageous on bandwidth-constrained links. Several CDN providers, including Cloudflare and Fastly, now support gRPC proxying at the edge as of 2025.


Mutual TLS (mTLS) is a security protocol where both the client and server authenticate each other using X.509 certificates during the TLS handshake. Unlike standard TLS, which only verifies the server's identity, mTLS ensures that both parties prove they are who they claim to be before any application data is exchanged.

Why Mutual TLS Matters#

Standard TLS protects the overwhelming majority of internet traffic today, since most web traffic now travels over HTTPS. However, standard TLS solves only half the authentication problem: clients verify that the server holds a valid certificate, but servers have no cryptographic assurance about the client's identity. They rely on application-layer mechanisms like API keys, tokens, or passwords instead.

This gap becomes critical in zero-trust architectures, service-to-service communication, and regulated environments where network-level identity verification is required. mTLS closes this gap by making identity verification bilateral and cryptographic.

mTLS vs Standard TLS#

| Aspect | Standard TLS | Mutual TLS (mTLS) |
| --- | --- | --- |
| Server authenticated | Yes | Yes |
| Client authenticated | No (application layer) | Yes (certificate) |
| Client certificate required | No | Yes |
| Certificate management complexity | Low | High |
| Typical use case | Public websites, APIs | Internal services, zero-trust, IoT |
| Identity assurance level | Server only | Both endpoints |
| Performance overhead | Baseline | ~5-10% additional handshake time |
| Common in browsers | Yes | Rare (except enterprise) |

mTLS has become the predominant service-to-service authentication mechanism in zero-trust network access (ZTNA) implementations, reflecting growing recognition that network perimeter-based security is insufficient for distributed architectures.

How the mTLS Handshake Works#

The mTLS handshake extends the standard TLS 1.3 handshake with additional steps for client certificate exchange. Here is the full sequence:

Step 1: Client Hello. The client initiates the connection by sending supported cipher suites, TLS version, and a random value to the server. This step is identical to standard TLS.

Step 2: Server Hello and Server Certificate. The server responds with its chosen cipher suite, its own random value, and its X.509 certificate. The server also sends a CertificateRequest message, signaling that the client must present a certificate. In standard TLS, this CertificateRequest is absent.

Step 3: Client Verifies Server Certificate. The client validates the server's certificate against its trust store, checking the certificate chain, expiration, revocation status (via CRL or OCSP), and that the subject matches the expected server identity.

Step 4: Client Certificate Submission. The client sends its own X.509 certificate to the server along with a CertificateVerify message containing a digital signature over the handshake transcript, proving possession of the private key corresponding to the certificate.

Step 5: Server Verifies Client Certificate. The server validates the client certificate against its configured Certificate Authority (CA) trust store, checks the certificate chain, verifies the CertificateVerify signature, and optionally checks revocation status. If verification fails, the server terminates the connection immediately.

Step 6: Secure Channel Established. Both parties derive session keys from the shared secret. All subsequent communication is encrypted and authenticated in both directions.

The entire handshake adds approximately 1-2 milliseconds of latency compared to standard TLS, depending on certificate chain depth and revocation checking methods.
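
The six steps above are driven entirely by how each endpoint's TLS context is configured. A minimal sketch using Python's standard `ssl` module (the certificate file paths are placeholders for credentials issued by your CA):

```python
import ssl

def make_server_context(ca_file: str, cert_file: str, key_file: str) -> ssl.SSLContext:
    """Server side: present our certificate and demand one from the client."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(cert_file, key_file)    # sent as the Server Certificate (Step 2)
    ctx.load_verify_locations(cafile=ca_file)   # CA trust store used to verify the client (Step 5)
    ctx.verify_mode = ssl.CERT_REQUIRED         # causes the CertificateRequest message (Step 2)
    return ctx

def make_client_context(ca_file: str, cert_file: str, key_file: str) -> ssl.SSLContext:
    """Client side: verify the server (Step 3) and present our own certificate (Step 4)."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.load_cert_chain(cert_file, key_file)
    return ctx
```

Wrapping a socket with either context (`ctx.wrap_socket(...)`) then performs the full handshake automatically; the only mTLS-specific settings are the server's `verify_mode = CERT_REQUIRED` and the client's `load_cert_chain` call.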

Use Cases for Mutual TLS#

Zero-Trust Architecture#

Zero-trust security models operate on the principle of "never trust, always verify." Every service must authenticate cryptographically before communicating, regardless of network location. mTLS provides the transport-layer foundation for this model. The industry trend is strongly toward zero-trust for new network access deployments, with mTLS as the predominant service identity mechanism.

Microservices Communication#

In microservices architectures, dozens or hundreds of services communicate over internal networks. Without mTLS, a compromised service can impersonate any other service on the network. mTLS ensures that Service A can only communicate with Service B if both hold certificates signed by a trusted CA. Service meshes like Istio and Linkerd automate mTLS certificate issuance and rotation for every service pod, making deployment tractable at scale.

IoT Device Authentication#

IoT devices operate in physically untrusted environments where API keys or passwords can be extracted from device firmware. mTLS binds device identity to a hardware-backed certificate, making impersonation significantly harder. Certificate-based authentication is widely adopted across IoT devices, with mTLS adoption growing rapidly in industrial and healthcare IoT deployments.

API Security and Partner Integration#

APIs exposed to partners or regulated industries often require stronger authentication than API keys provide. mTLS ensures that only clients holding a certificate issued by the API provider's CA can establish a connection, providing defense-in-depth before any application-layer authentication occurs. Financial services APIs governed by Open Banking regulations in the EU, UK, and Australia mandate mTLS for third-party provider connections.

Challenges of Implementing mTLS#

Certificate Lifecycle Management#

Every client and server in an mTLS deployment needs a valid certificate. For an organization running 500 microservices with 3 replicas each, that means managing 1,500 certificates with their own issuance, renewal, and revocation cycles. Without automation, this becomes operationally unsustainable. Tools like cert-manager (for Kubernetes), HashiCorp Vault, and SPIFFE/SPIRE address this by automating certificate lifecycle operations.

Certificate-related outages are common in organizations managing large certificate inventories, and remediation can be costly. Automated rotation is not optional for production mTLS deployments.

Certificate Rotation#

Short-lived certificates (hours or days) reduce the blast radius of a compromised key but increase rotation frequency. Long-lived certificates (months or years) reduce operational churn but increase exposure time if compromised. The industry trend moves toward short-lived certificates: SPIFFE recommends certificate lifetimes of 1 hour for workload identities, with automated rotation handled by the SPIRE agent.
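
The rotation schedule itself is simple to reason about. A sketch of half-life rotation (SPIRE's default is to renew once half the certificate's lifetime has elapsed; the dates and the `fraction` parameter here are illustrative):

```python
from datetime import datetime, timedelta

def next_rotation(issued_at: datetime, lifetime: timedelta,
                  fraction: float = 0.5) -> datetime:
    # Renew well before expiry: rotating at half the lifetime leaves a full
    # half-lifetime of headroom to retry if the first renewal attempt fails.
    return issued_at + lifetime * fraction

issued = datetime(2025, 1, 1, 12, 0)
rotate_at = next_rotation(issued, timedelta(hours=1))
# With a 1-hour workload certificate, renewal is attempted 30 minutes in.
```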

Performance Considerations#

mTLS adds computational overhead from asymmetric cryptography during the handshake and certificate validation. For services handling thousands of new connections per second, this overhead can be measurable. Connection pooling and keep-alive headers amortize the handshake cost across many requests. TLS session resumption (via session tickets or pre-shared keys) eliminates the full handshake on reconnection, reducing the per-request cost to near zero for long-lived connections.
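
The amortization argument can be made concrete with back-of-envelope numbers (the figures below are illustrative, not benchmarks):

```python
def amortized_overhead_ms(handshake_ms: float, requests_per_connection: int) -> float:
    # The handshake cost is paid once per connection; every request carried
    # over a reused (keep-alive) connection shares that one-time cost.
    return handshake_ms / requests_per_connection

# A 2 ms mTLS handshake spread across 100 pooled requests costs 0.02 ms each;
# session resumption shrinks the reconnect cost further still.
per_request = amortized_overhead_ms(2.0, 100)
```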

Debugging and Observability#

When mTLS connections fail, diagnosing the cause is harder than debugging standard TLS failures. Common failure modes include expired certificates, CA trust store mismatches, certificate revocation, and clock skew between endpoints. Structured logging of TLS handshake events, certificate serial numbers, and validation errors is essential for operational mTLS deployments.

How to Configure mTLS in Apache APISIX#

Apache APISIX supports mTLS at both the edge (between clients and APISIX) and internally (between APISIX and upstream services). The configuration uses APISIX's SSL resource and route-level settings.

Client-to-Gateway mTLS#

To require client certificates for incoming connections, configure an SSL resource with the CA certificate that should be trusted for client authentication. APISIX will reject any client that does not present a certificate signed by the specified CA. See the mTLS documentation for the full SSL resource schema and configuration examples.
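
As a sketch of what that SSL resource looks like, the payload below is built as a Python dict and serialized for the Admin API. The field names follow the APISIX SSL object schema as I understand it (`client.ca` and `client.depth` enable client-certificate verification); the hostname, PEM placeholders, and Admin endpoint are assumptions to replace with your own values:

```python
import json

# Placeholder PEM contents; in practice these come from your CA.
ssl_resource = {
    "snis": ["api.example.com"],
    "cert": "<server certificate PEM>",
    "key": "<server private key PEM>",
    "client": {
        "ca": "<CA certificate PEM trusted for client certificates>",
        "depth": 10,  # maximum client certificate chain depth to verify
    },
}

# PUT this JSON to the Admin API, e.g.
#   PUT http://127.0.0.1:9180/apisix/admin/ssls/1   (X-API-KEY header required)
payload = json.dumps(ssl_resource)
```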

Gateway-to-Upstream mTLS#

When upstream services require mTLS, configure the upstream resource with the client certificate and key that APISIX should present. This ensures APISIX authenticates itself to backend services, maintaining the zero-trust chain from edge to origin. The upstream TLS configuration section covers the required fields.

Per-Route mTLS Policies#

APISIX allows different mTLS policies per route, enabling gradual rollout. Internal admin APIs can require mTLS immediately while public-facing routes continue using standard TLS with application-layer authentication. This granularity is configured through the route's ssl and upstream settings.

The certificate management guide covers integration with cert-manager and external CA providers for automated certificate rotation within APISIX deployments.

mTLS Best Practices#

  1. Automate certificate lifecycle. Never rely on manual certificate issuance or renewal for production mTLS. Use cert-manager, Vault, or SPIRE.

  2. Use short-lived certificates. Target lifetimes of 24 hours or less for workload certificates. Rotate automatically before expiration.

  3. Separate CAs by trust domain. Do not use the same CA for internal service certificates and external partner certificates. Maintain distinct trust hierarchies.

  4. Monitor certificate expiration. Set alerting thresholds at 7 days, 3 days, and 1 day before expiration. Track certificate inventory centrally.

  5. Enable OCSP stapling. Reduce certificate validation latency by stapling OCSP responses at the server rather than requiring clients to contact the CA's OCSP responder.
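
Practice 4 can be sketched with the standard library alone: pull the peer certificate off a live connection and alert on the remaining lifetime. The thresholds mirror the list above; `host` is a placeholder, and the date format is the one Python's `getpeercert()` reports:

```python
import socket
import ssl
from datetime import datetime, timezone

ALERT_THRESHOLDS_DAYS = (7, 3, 1)

def parse_not_after(not_after: str) -> datetime:
    # getpeercert() reports expiry as e.g. "Jun  1 12:00:00 2030 GMT".
    return datetime.strptime(not_after, "%b %d %H:%M:%S %Y %Z").replace(tzinfo=timezone.utc)

def days_until_expiry(host: str, port: int = 443) -> int:
    # Open a TLS connection purely to read the presented certificate.
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    remaining = parse_not_after(cert["notAfter"]) - datetime.now(timezone.utc)
    return remaining.days

def alerts_due(days_left: int) -> list[int]:
    # Which alert thresholds has this certificate already crossed?
    return [t for t in ALERT_THRESHOLDS_DAYS if days_left <= t]
```

Running `alerts_due(days_until_expiry(host))` on a schedule for every certificate in the inventory gives the centralized tracking the practice calls for.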

FAQ#

What happens if a client certificate expires during an active mTLS connection?#

Existing connections continue functioning until they are closed because TLS authentication occurs during the handshake, not continuously. However, any new connection attempt with the expired certificate will fail. This is why short-lived certificates combined with connection draining during rotation are important: they ensure that stale credentials are phased out promptly without disrupting in-flight requests.

Is mTLS the same as two-way SSL?#

Yes. "Two-way SSL," "mutual SSL," and "mutual TLS" all describe the same mechanism: both endpoints present and verify certificates. The terminology "mutual TLS" is preferred in modern usage because TLS superseded SSL over two decades ago, and all current implementations use TLS 1.2 or TLS 1.3 rather than any SSL version.

Does mTLS replace the need for API keys or OAuth tokens?#

No. mTLS authenticates the transport-layer identity (which machine or service is connecting), while API keys and OAuth tokens authenticate the application-layer identity (which user, application, or tenant is making the request). In a defense-in-depth strategy, mTLS and application-layer authentication serve complementary roles. mTLS ensures only authorized services can reach the endpoint; tokens and keys determine what those services are allowed to do.

How does mTLS perform at scale in Kubernetes?#

In Kubernetes environments with service meshes, mTLS scales well because certificate issuance and rotation are fully automated by the mesh control plane. Istio, for example, issues and rotates certificates for every pod automatically using its built-in CA. The performance impact is primarily on new connections (the handshake), which is amortized by connection pooling. Organizations running mTLS across 10,000+ pods report negligible steady-state performance impact, with the main operational cost being control plane resource consumption for certificate management.