Simplifying Microservices with Istio and Service Mesh Architecture

As apps shift from monoliths to microservices, managing service-to-service communication becomes complex. Developers must handle traffic routing, retries, timeouts, load balancing, TLS encryption, metrics, and logs for each service. This leads to duplicated code and operational complexity.

A service mesh is an infrastructure layer that manages service-to-service communication transparently. Istio, a popular open-source service mesh, addresses these challenges by deploying Envoy proxies as sidecars alongside the application containers in your pods. The proxies intercept traffic and apply consistent policies without requiring code changes.

I use a coffee shop microservices app as a running example, since I love coffee. The app includes:

  • order-service (handles customer orders)
  • payment-service (processes payments)
  • inventory-service (manages coffee inventory)

Core Capabilities of Istio

Istio provides powerful capabilities for traffic management, security, and observability within a microservices environment.

Capability | Description
Traffic Management | Define routing, fault injection, and load balancing using CRDs like VirtualService, DestinationRule, and ServiceEntry
Traffic Routing | Flexible traffic routing configurations using VirtualService and DestinationRule resources
Resiliency | Enforce timeouts, retries, circuit breaking, and failover without changing app logic
Mesh Extension | Integrate external services, Virtual Machines (VMs), and custom Envoy configurations
Security | Apply strong authentication (AuthN) and authorization (AuthZ) using mTLS and identity-based policies
Observability | Export telemetry (metrics, traces, logs) using integrations like SigNoz, Prometheus, Jaeger, and Kiali

Installing Istio

Istio provides flexible installation options to support different environments and use cases. You can install Istio using:

  • istioctl CLI
  • Helm charts
  • Istio Operator (for GitOps and declarative installs)

Istio includes predefined installation profiles optimized for different scenarios. Each profile configures control and data plane components through the IstioOperator resource.

istioctl profile list

Istio provides different profiles:

Profile | Description
default | Production-ready; installs control plane and ingress gateway.
demo | Best for demos and learning; enables tracing, logging, ingress, and egress.
minimal | Control plane only; no gateways.
external | Used in remote clusters of a multi-cluster mesh that are managed by an external control plane.
empty | Deploys nothing; a baseline config for custom setups.
preview | Includes experimental features.
ambient | Sets up sidecar-less ambient mesh (Alpha; not for production).

To install Istio using the Istio CLI, you can use the --set flag and specify the profile like this:

istioctl install --set profile=demo

Note: Use the demo profile during development to enable full telemetry. For production, switch to default to improve performance and security.
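The same profile selection can also be kept in a file, which is easier to track in version control. The following is a minimal sketch, assuming the IstioOperator API (install.istio.io/v1alpha1); the resource name and the file name demo-profile.yaml are placeholders, not something from the original setup.

```yaml
# demo-profile.yaml (hypothetical file name)
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: coffee-shop-istio   # placeholder name
  namespace: istio-system
spec:
  profile: demo             # same effect as --set profile=demo
```

Under these assumptions, you would pass the file to the CLI with istioctl install -f demo-profile.yaml.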

Get full configuration of a profile:

istioctl profile dump demo

Compare two profiles:

istioctl profile diff demo default

Combining Helm and Operator

You can use the IstioOperator resource alongside Helm:

  • Use Helm to install base components.
  • Use IstioOperator to apply profile-level and mesh-level configurations.

This modular setup is useful in environments where GitOps or CI/CD pipelines manage different aspects of configuration.

An image displaying Combining Helm and Operator

Istio Architecture

Istio consists of:

  • Data Plane: Lightweight Envoy proxies injected as sidecars to each pod. These proxies handle all ingress and egress traffic for the pod.
  • Control Plane: The Istiod component configures and manages the behavior of Envoy proxies by pushing policies and configuration dynamically.

The overall architecture of an Istio-based application.

Image credits: Istio documentation.

For example, each pod in the Coffee Shop app has a sidecar Envoy proxy that intercepts all traffic. This enables Istio to provide seamless:

  • Traffic routing
  • mTLS encryption
  • Metrics and tracing

An architecture image of coffee shop app

Sidecar

Manually modifying manifests to add sidecars is error-prone and not scalable. Istio supports two approaches:

Manual Sidecar: Use the istioctl CLI to inject the sidecar into your YAML manifests before applying them:

istioctl kube-inject -f deployment.yaml | kubectl apply -f -

Automatic Sidecar: This is the recommended approach for most use cases. Istio uses a Mutating Admission Webhook to inject sidecars into all pods created in a namespace labeled with istio-injection=enabled.

Note: All new pods in that namespace get sidecars injected automatically.

To enable for a namespace, run:

kubectl label namespace coffee-shop istio-injection=enabled
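If you manage namespaces declaratively, the label can live in the Namespace manifest instead. This is a small sketch equivalent to the kubectl label command above:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: coffee-shop
  labels:
    istio-injection: enabled   # tells the injection webhook to add sidecars to new pods
```

Pods that were already running before the label was added keep running without a sidecar until they are recreated.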

Routing Traffic Through Sidecars

Istio uses iptables rules or CNI plugins to transparently route traffic through the Envoy sidecar.

An init container sets up the iptables rules before the application starts. This ensures:

  • Outbound traffic from the app is redirected to the Envoy sidecar
  • Inbound traffic hits the sidecar before reaching the app

For example, a request from order-service to payment-service is transparently routed via sidecars.

An image of request from order-service to payment-service routed via sidecars.

Configuring Envoy Proxies

When an app makes a call to another service, that call is now intercepted by its sidecar. The job of configuring the proxies with all the information they need to handle both incoming and outgoing traffic falls to the Istio control plane.

The Istio control plane configures all Envoy proxies dynamically using xDS APIs:

Feature | Delivered via Istio Control Plane
Route discovery | VirtualService and DestinationRule updates
Load balancing | Weighted or subset-based routing
Retry and timeout | Policy enforcement without app changes
mTLS and security | Dynamic certificate provisioning

Istio automates all sidecar configurations according to the current mesh topology. Each time services are added, removed, or updated, Istio ensures that the latest configurations—network, routing, or security policies—are distributed to the appropriate sidecars.

A flow of data image when order-service wants to call payment-service in coffee shop example

For example, order-service wants to call payment-service:

  • Istio discovers endpoints of payment-service
  • Pushes routing config to order-service’s proxy
  • Applies retries, timeouts, load balancing, and TLS

Traffic Management

Istio's traffic management relies on custom resources such as VirtualService, DestinationRule, and ServiceEntry to define fine-grained routing, resiliency, and fault injection policies.

Ingress and Egress Gateways

Istio uses the Gateway resource to manage how Envoy proxies handle inbound and outbound traffic. Unlike Kubernetes Ingress, Istio Gateways provide richer Layer 7 routing.

Gateway Type | Purpose | Example Scenario
Ingress Gateway | Accepts external traffic into the mesh | Client → Ingress Gateway → order-service
Egress Gateway | Manages outbound traffic to external APIs | payment-service → Egress Gateway → Payment API

Example: Gateway and VirtualService Configuration

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: coffee-ingress
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "order.coffee.com"

VirtualService with Gateway

To route external traffic to internal services, a Gateway must be used in conjunction with a VirtualService. If no VirtualService is bound to the Gateway, the gateway's Envoy returns an HTTP 404, indicating no route has been defined.

A flow of data with virtualService and Gateway in coffee shop app

Create a corresponding VirtualService that binds to the Gateway:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-route
spec:
  hosts:
  - "*"
  gateways:
  - coffee-ingress   # must match the Gateway's metadata.name
  http:
  - route:
    - destination:
        host: order-service.default.svc.cluster.local
        port:
          number: 80

Apply the configuration:

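kubectl apply -f order-route.yaml

Test the route. The Gateway only accepts the host order.coffee.com, so send that Host header:

curl -v -H "Host: order.coffee.com" http://$GATEWAY_IP/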

Note: Usually, istio-ingressgateway service is exposed using the Kubernetes LoadBalancer type, which assigns an external IP to receive HTTP(S) traffic.

How the LoadBalancer Kubernetes service type works depends on how and where you run the Kubernetes cluster.

Platform | LoadBalancer Behavior
AWS, GCP, Azure | Provisions a cloud load balancer and assigns external IP.
Minikube | Requires minikube tunnel to simulate external access.

For example, in the coffee shop app, gateways are essential for exposing services like order-service, payment-service, and inventory-service to the outside world or external systems.

An image displaying gateways for services in coffee shop app.

Traffic Routing and Resiliency

Istio allows flexible traffic routing configurations using VirtualService and DestinationRule resources.

Resource | Purpose
VirtualService | Defines traffic routing rules to one or more destinations.
DestinationRule | Configures policies for routed traffic, such as load balancing and TLS.
ServiceEntry | Adds external services to the mesh registry.

For example, consider a coffee-shop app with these services:

  • Web-frontend: the UI for customers.
  • Customer-service: handles customer profiles.
    • Two versions of customer-service: v1 and v2.

An image of coffee shop Kubernetes pod.

You can define different subsets of a service, typically based on labels in the pod spec. Here the customer-service pods are labeled with version: v1 or version: v2, and the subsets are declared in a DestinationRule:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: customer-service
spec:
  host: customer-service.default.svc.cluster.local
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2

Routing Traffic with VirtualService

In the VirtualService, you can specify the traffic matching and routing rules that decide which destinations traffic is routed to.

Note: To generate some traffic, open a separate terminal window and start making requests to the GATEWAY_IP in an endless loop:

export GATEWAY_IP=$(kubectl get svc -n istio-system istio-ingressgateway -ojsonpath='{.status.loadBalancer.ingress[0].ip}')

while true; do curl http://$GATEWAY_IP/; done

Weight-Based Routing: Distributes traffic across different subsets of the same service based on assigned weights (e.g., 70% to v1 and 30% to v2).

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: customer-service
spec:
  hosts:
  - customer-service.default.svc.cluster.local
  http:
  - route:
    - destination:
        host: customer-service.default.svc.cluster.local
        subset: v1
      weight: 70
    - destination:
        host: customer-service.default.svc.cluster.local
        subset: v2
      weight: 30

Match-Based Routing: Routes traffic based on specific conditions, such as HTTP headers (e.g., User-Agent) or URI paths.

http:
- match:
  - headers:
      user-agent:
        regex: ".*Firefox.*"
  route:
  - destination:
      host: customer-service.default.svc.cluster.local
      subset: v1
- route:
  - destination:
      host: customer-service.default.svc.cluster.local
      subset: v2

Redirect and Rewrite: Redirects traffic (HTTP 301) to a different URI or hostname, or rewrites path prefixes before forwarding. Note that redirect and destination fields are mutually exclusive.

http:
- match:
  - uri:
      exact: /v1/hello
  redirect:
    uri: /v2/hello
    authority: hello.default.svc.cluster.local

Because a redirect is answered by the proxy itself, the rule above has no destination to route to.

Rewrite path prefix:

http:
- match:
  - uri:
      prefix: /v1/api
  rewrite:
    uri: /v2/api
  route:
  - destination:
      host: customer-service.default.svc.cluster.local

Mirroring Traffic: Sends a copy of live traffic to another service version (e.g., mirroring 100% of traffic sent to v1 to v2). This "fire and forget" mechanism is useful for testing and debugging with production traffic.

http:
- route:
  - destination:
      host: customer-service.default.svc.cluster.local
      subset: v1
    weight: 100
  mirror:
    host: customer-service.default.svc.cluster.local
    subset: v2
  mirrorPercentage:
    value: 100.0

Header Manipulation: Allows you to add, set, or remove request and response headers, either for individual destinations or all destinations within a VirtualService.

http:
- headers:
    request:
      set:
        debug: "true"
  route:
  - destination:
      host: customer-service.default.svc.cluster.local
      subset: v2
    weight: 30
  - destination:
      host: customer-service.default.svc.cluster.local
      subset: v1
    headers:
      response:
        remove:
        - x-api-key
    weight: 70

In the above example, the request header debug: true is set on all traffic handled by this rule. The response header x-api-key is removed only for the v1 destination, so responses from subset v1 never include it.

AND Matching: Rules can combine multiple conditions using AND logic (e.g., matching a URI prefix and a specific header).

match:
- uri:
    prefix: /v1
  headers:
    user:
      exact: debug

OR Matching: Rules can combine multiple conditions using OR logic (matching either a URI prefix or a header).

match:
- uri:
    prefix: /v1
- headers:
    user:
      exact: debug

The conditions are evaluated in order: if the URI prefix does not match, Envoy moves on to the second entry and tries to match the header. If you omit the match field on a route, the route always matches.

Note: When using either of the two options, make sure you provide a fallback route if applicable. That way, if traffic doesn’t match any of the conditions, it could still be routed to a “default” route.
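To make the fallback advice concrete, here is a sketch of a VirtualService that combines an AND match with a catch-all route. It reuses the customer-service host and subsets from the earlier examples; sending matched traffic to v2 and everything else to v1 is an illustrative choice, not something the original prescribes.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: customer-service
spec:
  hosts:
  - customer-service.default.svc.cluster.local
  http:
  # Rule 1: URI prefix AND header must both match.
  - match:
    - uri:
        prefix: /v1
      headers:
        user:
          exact: debug
    route:
    - destination:
        host: customer-service.default.svc.cluster.local
        subset: v2
  # Rule 2: no match block, so this acts as the default route.
  - route:
    - destination:
        host: customer-service.default.svc.cluster.local
        subset: v1
```

Rules are evaluated top to bottom, so the catch-all route belongs last.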

Resiliency Patterns

Istio lets you apply resiliency policies at the network layer, reducing the need for application code changes.

Note: Both retries and timeouts happen on the client side.

Timeouts: If a request exceeds the timeout, Envoy stops waiting for the upstream and returns HTTP 504 (Gateway Timeout) to the caller.

http:
- route:
  - destination:
      host: customer-service.default.svc.cluster.local
      subset: v1
  timeout: 5s

Retries: If the first pod fails, Envoy retries with a different healthy endpoint.

retries:
  attempts: 3
  perTryTimeout: 2s
  retryOn: gateway-error,connect-failure,reset
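The timeout and retries fragments sit at the same level inside an http route. A complete resource might look like the sketch below, which simply combines the values shown above for the customer-service host used throughout this section.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: customer-service
spec:
  hosts:
  - customer-service.default.svc.cluster.local
  http:
  - route:
    - destination:
        host: customer-service.default.svc.cluster.local
        subset: v1
    timeout: 5s              # overall budget for the request, retries included
    retries:
      attempts: 3            # up to 3 retries after the initial attempt
      perTryTimeout: 2s      # budget for each individual attempt
      retryOn: gateway-error,connect-failure,reset
```

Keep the overall timeout in mind when sizing perTryTimeout: the request ends when the overall timeout expires, even if retry attempts remain.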

Circuit Breaking with Outlier Detection: This prevents cascading failures by automatically rejecting requests to overloaded or unhealthy services.

Istio implements circuit breaking using outlier detection, a passive health-checking mechanism. Envoy doesn't actively probe services but observes runtime metrics such as failure rate, latency, and connection health.

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: customer-service
spec:
  host: customer-service
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 1
      interval: 1s
      baseEjectionTime: 3m
      maxEjectionPercent: 100

  • consecutive5xxErrors: Number of consecutive 5xx responses before ejection.
  • interval: Time between ejection analysis sweeps.
  • baseEjectionTime: Minimum duration a pod remains ejected. This increases with repeated ejections.
  • maxEjectionPercent: Caps the percentage of pods that can be ejected.

When thresholds are met, Envoy temporarily removes the unhealthy pod from the load-balancing pool. Over time, the pod is gradually reintroduced if it recovers.
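Outlier detection is often paired with connection pool limits, which reject excess requests before a service becomes overloaded. The sketch below adds a connectionPool section to the same DestinationRule; all numeric thresholds are illustrative assumptions and should be tuned against real traffic.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: customer-service
spec:
  host: customer-service.default.svc.cluster.local
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100            # illustrative cap on TCP connections
      http:
        http1MaxPendingRequests: 10    # illustrative cap on queued requests
        maxRequestsPerConnection: 10
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 10s
      baseEjectionTime: 30s
      maxEjectionPercent: 50           # never eject more than half the pods
```

Requests that exceed the pool limits fail fast at the client-side proxy instead of queuing behind a struggling service.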

Failure Injection: This allows you to simulate network failures or delays. This helps validate your service's resilience and fallback mechanisms.

Istio supports two types of fault injection in the VirtualService:

  • Abort: Simulate HTTP errors by terminating requests with a specified status code.
  • Delay: Introduce artificial latency before forwarding requests.

Abort 30% of requests:

fault:
  abort:
    percentage:
      value: 30
    httpStatus: 404

Note: If you omit the percentage field, all matching requests will be aborted.

Inject delay to 5% of requests:

fault:
  delay:
    percentage:
      value: 5
    fixedDelay: 3s

Fault injection only affects traffic matched by the VirtualService rule it is defined on. It does not impact other routes or services.
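The fault fragments above belong inside an http rule, next to the route they affect. As a sketch, a full VirtualService that applies both the delay and the abort to the customer-service v1 subset could look like this:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: customer-service
spec:
  hosts:
  - customer-service.default.svc.cluster.local
  http:
  - fault:
      delay:
        percentage:
          value: 5          # delay 5% of requests
        fixedDelay: 3s
      abort:
        percentage:
          value: 30         # abort 30% of requests
        httpStatus: 404
    route:
    - destination:
        host: customer-service.default.svc.cluster.local
        subset: v1
```

Remove the fault block once the experiment is finished, since it affects every caller that goes through this rule.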

Extending the Istio Mesh

Istio provides mechanisms to bring external services and Virtual Machines (VMs) into the mesh, and to customize Envoy proxies.

Bringing External Services into the Mesh

Istio tracks internal services automatically. To include external or non-Kubernetes services, use the ServiceEntry custom resource. This allows you to manage traffic and apply policies like retries, timeouts, mirroring, and fault injection to external endpoints.

For example, the Coffee Shop microservices application:

  • payment-service needs to call an external payment API (mesh-external)
  • rewards-service communicates with an internal legacy database (mesh-internal)

MESH_EXTERNAL: Used for services outside the mesh (e.g., www.googleapis.com), typically with resolution: DNS.

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: googleapis-svc-entry
spec:
  hosts:
  - www.googleapis.com
  location: MESH_EXTERNAL
  resolution: DNS
  ports:
  - number: 443
    name: https
    protocol: TLS

  • location: MESH_EXTERNAL: Specifies the service is outside the mesh.
  • resolution: DNS: Istio uses DNS to resolve the host.

MESH_INTERNAL: Used for services that are part of the mesh but are not in the Kubernetes registry, such as a legacy database. When such a service has no DNS name to resolve, use resolution: STATIC with explicit endpoint IP addresses. You can also use workloadSelector for endpoint selection.

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: legacy-loyalty-db
spec:
  hosts:
  - legacy-loyalty-db.coffee-shop.internal   # placeholder name; hosts is a required field
  addresses:
  - 192.192.192.192/24
  ports:
  - number: 27018
    name: mongodb
    protocol: MONGO
  location: MESH_INTERNAL
  resolution: STATIC
  endpoints:
  - address: 10.0.0.2
  - address: 10.0.0.3

Note: The ServiceEntry API marks hosts as required, so the example sets a placeholder hostname. For non-HTTP protocols such as MongoDB, traffic is identified by the addresses and port rather than by the host name.

Outbound Traffic Policy: The REGISTRY_ONLY outbound traffic policy can be configured to ensure traffic is only allowed to known services registered in the mesh.

Configure Mesh to Registry-Only:

istioctl install --set profile=demo --set meshConfig.outboundTrafficPolicy.mode=REGISTRY_ONLY

Confirm Configuration:

kubectl get cm -n istio-system istio -o yaml | grep outboundTrafficPolicy
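The same mesh setting can be expressed declaratively. This is a minimal sketch, assuming you install from an IstioOperator file rather than --set flags:

```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  profile: demo
  meshConfig:
    outboundTrafficPolicy:
      mode: REGISTRY_ONLY   # block egress to hosts that have no ServiceEntry
```

With this mode active, calls to hosts that are not in the registry fail until a matching ServiceEntry is added.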

Centralized Egress via Gateway

Use an Egress Gateway to manage and monitor all outbound traffic. This setup enables centralized TLS termination, access control, and observability.

Required resources:

  • AuthorizationPolicy
  • Gateway
  • VirtualService
  • DestinationRule
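As a rough sketch of how these resources fit together, the manifests below route plain HTTP calls to an external payment API through the egress gateway. The host api.payments.example.com and all resource names are hypothetical, the gateway is assumed to be the istio-egressgateway installed by the demo profile in istio-system, and the AuthorizationPolicy is left out for brevity.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: payment-api                      # hypothetical
spec:
  hosts:
  - api.payments.example.com             # hypothetical external host
  location: MESH_EXTERNAL
  resolution: DNS
  ports:
  - number: 80
    name: http
    protocol: HTTP
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: coffee-egress                    # hypothetical
spec:
  selector:
    istio: egressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - api.payments.example.com
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: egressgateway-for-payment-api    # hypothetical
spec:
  host: istio-egressgateway.istio-system.svc.cluster.local
  subsets:
  - name: payment-api
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-api-through-egress       # hypothetical
spec:
  hosts:
  - api.payments.example.com
  gateways:
  - coffee-egress
  - mesh
  http:
  # Hop 1: sidecars send the traffic to the egress gateway.
  - match:
    - gateways:
      - mesh
      port: 80
    route:
    - destination:
        host: istio-egressgateway.istio-system.svc.cluster.local
        subset: payment-api
        port:
          number: 80
  # Hop 2: the egress gateway forwards it to the external host.
  - match:
    - gateways:
      - coffee-egress
      port: 80
    route:
    - destination:
        host: api.payments.example.com
        port:
          number: 80
```

The two match blocks are what make the routing a two-hop path: the reserved gateway name mesh refers to all sidecars, while coffee-egress refers to the gateway itself.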

Onboarding VMs into the Mesh

VMs can join the mesh using the WorkloadEntry and WorkloadGroup resources. Istio treats VMs similarly to Kubernetes pods, assigning identities based on namespace and service account.

The general procedure for onboarding a VM can be summarized by the following steps:

  1. Install the Istio sidecar on the VM using the .deb or .rpm package.
  2. Define a WorkloadGroup:
  apiVersion: networking.istio.io/v1beta1
  kind: WorkloadGroup
  metadata:
    name: barista-vm
    namespace: coffee-shop
  spec:
    metadata:
      labels:
        app: barista-service
    template:
      serviceAccount: barista-account
  3. Configure the east-west gateway: ./samples/multicluster/gen-eastwest-gateway.sh --single-cluster | istioctl install -y -f -
  4. Expose istiod: kubectl apply -n istio-system -f ./samples/multicluster/expose-istiod.yaml
  5. Generate and copy configs: istioctl x workload entry configure -f barista.yaml -o ./output-dir
  6. Place files in the correct locations and start the sidecar on the VM.

Communication between the cluster and VMs:

  • Services in the cluster can reach VMs using DNS.
  • VMs can access services inside Kubernetes using mesh DNS.

An east-west gateway is necessary to enable communication between the sidecar that will be running on the VM and istiod, the Istio control plane (see the Istio documentation).
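Once a VM has joined, it is represented in the mesh by a WorkloadEntry, and a regular Kubernetes Service can select it by label. The sketch below assumes a VM at the hypothetical address 10.0.0.10 serving on port 8080; with auto-registration, Istio may create the WorkloadEntry for you, so treat the first resource as illustrative.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: WorkloadEntry
metadata:
  name: barista-vm-1              # hypothetical
  namespace: coffee-shop
spec:
  address: 10.0.0.10              # hypothetical VM IP
  labels:
    app: barista-service          # same label as the WorkloadGroup
  serviceAccount: barista-account
---
apiVersion: v1
kind: Service
metadata:
  name: barista-service
  namespace: coffee-shop
spec:
  selector:
    app: barista-service          # matches pods and WorkloadEntries alike
  ports:
  - name: http
    port: 80
    targetPort: 8080              # hypothetical port the VM app listens on
```

Services in the cluster can then call barista-service by name, regardless of whether a pod or the VM answers.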

To Install the East-West Gateway and Expose Istiod

  1. Install the east-west gateway: ./samples/multicluster/gen-eastwest-gateway.sh --single-cluster | istioctl install -y -f -
    • If you list the pods in the istio-system namespace, you'll notice the istio-eastwestgateway instance was created.
  2. Expose istiod through the east-west gateway: kubectl apply -n istio-system -f ./samples/multicluster/expose-istiod.yaml

An image displaying flow of data from Coffee Shop pods to VM + Sidecar.

Customizing and Extending Envoy Proxies

Istio automatically generates the Envoy configuration for each proxy. However, for advanced use cases, you can customize this configuration and extend Envoy's functionality.

Envoy's configuration is structured into several key components:

  • Listeners: Network locations (IP and port) where Envoy listens for incoming connections and requests. Istio generates multiple listeners for each sidecar.
  • Filters: Ordered lists of processing logic that a request flows through (Listener, Network, and HTTP filters). The router filter is typically the last HTTP filter and is responsible for routing traffic.
  • Routes: URI/path-based traffic routing rules defined within the route configuration. These rules match incoming requests and specify where traffic should be sent.
  • Clusters: Groups of similar upstream hosts (destinations or servers), analogous to Kubernetes Services, that accept traffic.
  • Endpoints: Concrete IP:port pairs within a cluster, representing the specific addresses where traffic can be sent.

A flow displaying extending envoy proxies for coffee shop app

For example, when coffee-frontend sends a request to barista-service:

  • The sidecar's outbound listener on port 15001 captures the request.
  • Filters inspect and process the request.
  • Routing sends it to the barista-service cluster.
  • One of the barista pods (endpoint) handles the request.

You can inspect the Envoy configuration using the istioctl proxy-config command. For example:

istioctl proxy-config clusters coffee-frontend-xyz --namespace coffee-shop

The EnvoyFilter resource allows you to customize portions of the auto-generated Envoy proxy configuration by patching existing settings. This enables updating values, adding or removing filters, or creating new listeners and clusters.

  • Application Scope: EnvoyFilter resources can be applied at three levels: globally (affecting all proxies in the mesh), per namespace, or to specific workloads.
  • Patch Location (applyTo): You can target specific configuration sections, such as LISTENER, HTTP_FILTER, NETWORK_FILTER, or CLUSTER.
  • Patch Target (match): The scope can be narrowed using context (e.g., SIDECAR_INBOUND, SIDECAR_OUTBOUND, GATEWAY), listener properties, route configuration, or cluster properties.
  • Patch Action (patch): Defines how the patch is applied, with operations like MERGE, ADD, REMOVE, INSERT_BEFORE, or INSERT_AFTER.

Example to patch with EnvoyFilter

- applyTo: EXTENSION_CONFIG
  patch:
    operation: ADD
    value:
      name: custom-metrics
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
        config:
          root_id: metrics-root
          vm_config:
            vm_id: metrics-vm
            runtime: envoy.wasm.runtime.v8
            code:
              remote:
                http_uri:
                  uri: http://wasm-module-uri
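For comparison, here is a sketch of a complete EnvoyFilter that shows where the scope, applyTo, match, and patch fields sit in a full resource. It inserts a small Lua filter before the router filter on inbound traffic to barista-service; the resource name and the x-served-by header are made up for illustration.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: barista-response-header   # hypothetical
  namespace: coffee-shop
spec:
  workloadSelector:               # scope: only barista-service proxies
    labels:
      app: barista-service
  configPatches:
  - applyTo: HTTP_FILTER          # patch location
    match:                        # patch target
      context: SIDECAR_INBOUND
      listener:
        filterChain:
          filter:
            name: envoy.filters.network.http_connection_manager
            subFilter:
              name: envoy.filters.http.router
    patch:                        # patch action
      operation: INSERT_BEFORE
      value:
        name: envoy.filters.http.lua
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
          inlineCode: |
            function envoy_on_response(response_handle)
              response_handle:headers():add("x-served-by", "barista-service")
            end
```

EnvoyFilter patches depend on the generated Envoy configuration, so it is worth re-checking them with istioctl proxy-config after each Istio upgrade.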

Extending Envoy with WebAssembly

Envoy's functionality can be extended using custom filters written in different languages:

  • C++: Offers native, high-performance extensions but requires rebuilding Envoy.
  • Lua: Script-based, suitable for simpler use cases.
  • WebAssembly (Wasm): Enables run-time loaded plugins compiled from languages like Rust, Go, or AssemblyScript. Wasm plugins run in a sandboxed virtual machine (VM), providing isolation and memory safety.

Wasm allows dynamic extensibility of the Envoy data plane without needing to rebuild Envoy or manually modify its configurations. Istio's istio-agent handles the distribution of Wasm plugins, fetching them from registries and mounting them into Envoy's file system.

For example, in the Coffee Shop app: Use a WASM filter to collect metrics on espresso orders handled by barista-service. This plugin runs inside the Envoy proxy and logs telemetry data.

An image displaying flow in coffee shop app using WebAssembly


Wasm Plugin Deployment Workflow:

  1. Compile the plugin
    • Use your SDK to generate the .wasm file
  2. Publish to a registry:
     docker build -t registry.io/barista-metrics:v1 .
     docker push registry.io/barista-metrics:v1
  3. Deploy with WasmPlugin:
apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  name: barista-metrics
  namespace: coffee-shop
spec:
  selector:
    labels:
      app: barista-service
  url: oci://registry.io/barista-metrics:v1
  pluginConfig:
    trackEspresso: true
    debug: true

Plugin Source Options:

  • oci://: OCI registry
  • http:// or https://: Direct HTTP(S) URL
  • file://: Local file path inside the proxy container (for example, file:///path/to/local)

Using WasmPlugin is preferred over EnvoyFilter as it simplifies deployment.

An image displaying OCI registry to Wasm Runtime.

Security

Istio enhances security by enforcing strong authentication (AuthN) and authorization (AuthZ) policies.

Authentication

Istio issues X.509 SPIFFE-compliant certificates to each pod, based on Kubernetes ServiceAccounts.

  • mTLS: Ensures both client and server verify each other’s identities.
  • Certificates: Automatically rotated and managed by Istio agent using SDS.

An image displaying Authentication flow using coffee shop services.

Automated Identity Provisioning: Istio automates workload identity through these components:

Component | Role
Istio Agent | Runs in the sidecar, manages certificates and bootstraps Envoy
SDS | Envoy's Secret Discovery Service; fetches certs dynamically
Istiod | Acts as the Certificate Authority (CA); issues and rotates certs

When a sidecar starts, the Istio agent sends a Certificate Signing Request (CSR) to istiod. Once verified, istiod returns a signed certificate. This identity is used for secure communication between services. Certificates are rotated automatically.

For example, in the coffee shop microservices app:

  • barista-service runs with a sidecar.
  • Istio agent requests a certificate for the barista-service account.
  • CA authenticates the request and returns a signed certificate.
  • barista-service uses this identity for secure communication.

Mutual TLS (mTLS)

Mutual TLS ensures encrypted and authenticated communication. Both client and server validate each other using their certificates. Envoy sidecars handle this process transparently.

PeerAuthentication (Inbound): Configures the mTLS mode for incoming traffic to a service or workload.

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
spec:
  mtls:
    mode: STRICT

DestinationRule (Outbound): Configures the mTLS mode for outgoing traffic from a service or workload. This also applies to outgoing traffic through an egress gateway.

trafficPolicy:
  tls:
    mode: ISTIO_MUTUAL
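The trafficPolicy fragment lives inside a DestinationRule. A full resource might look like the sketch below; the resource name and the payment-service host in the coffee-shop namespace are assumptions based on the running example.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: payment-service-tls       # hypothetical
  namespace: coffee-shop
spec:
  host: payment-service.coffee-shop.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL          # clients use Istio-issued certificates
```

Between sidecars, Istio's auto mTLS usually selects this mode without an explicit rule, so a DestinationRule like this is mostly needed when you want to override that behavior.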

mTLS Modes Overview:

Mode | Configured in | Description
PERMISSIVE | PeerAuthentication | Accepts both plain text and mTLS connections (the default; useful while onboarding workloads)
STRICT | PeerAuthentication | Only mTLS connections are allowed
ISTIO_MUTUAL | DestinationRule, Gateway | Uses Istio-managed certificates for mTLS (recommended default)
SIMPLE | DestinationRule, Gateway | One-way TLS (client verifies server)
MUTUAL | DestinationRule, Gateway | mTLS using custom certificates
PASSTHROUGH | Gateway | Routes encrypted TLS traffic without termination
AUTO_PASSTHROUGH | Gateway | Automatically forwards TLS based on SNI (no VirtualService required)

Note: You can apply PeerAuthentication at mesh, namespace, workload, or port level. The Gateway TLS modes apply to both ingress and egress gateways.

Example: Set STRICT mTLS for payment-service:

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: payment-service-mtls
  namespace: coffee-shop
spec:
  selector:
    matchLabels:
      app: payment-service
  mtls:
    mode: STRICT
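Port-level settings let one port deviate from the workload's mode. The variant below is a sketch that keeps STRICT for payment-service but relaxes a hypothetical metrics port 9090, for example so a scraper outside the mesh can still reach it; the port number is an assumption.

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: payment-service-mtls
  namespace: coffee-shop
spec:
  selector:                 # portLevelMtls requires a workload selector
    matchLabels:
      app: payment-service
  mtls:
    mode: STRICT
  portLevelMtls:
    9090:                   # hypothetical metrics port
      mode: PERMISSIVE
```

Because it uses the same name, applying this manifest would replace the policy from the previous example rather than add a second one.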

An image displaying flow from coffee-order to external payment-service.

Request Authentication (User Authentication)

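Use RequestAuthentication to verify JWTs from end users. A request carrying an invalid token is rejected. A request with no token is still accepted, but it has no authenticated identity, so pair RequestAuthentication with an AuthorizationPolicy to require a token. Valid tokens yield an authenticated principal for policy enforcement.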

Example: Require JWT for customer-service:

apiVersion: security.istio.io/v1
kind: RequestAuthentication
metadata:
  name: customer-jwt
  namespace: coffee-shop
spec:
  selector:
    matchLabels:
      app: customer-service
  jwtRules:
  - issuer: "https://auth.coffeeshop.com"
    jwksUri: "https://auth.coffeeshop.com/.well-known/jwks.json"

Authorization (Access Control)

Use the AuthorizationPolicy resource to enforce fine-grained control over what services or users can access. Policies use service identities (via mTLS) and user identities (via JWT).

For example, only allow authenticated users to call customer-service:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: require-jwt
  namespace: coffee-shop
spec:
  selector:
    matchLabels:
      app: customer-service
  action: ALLOW
  rules:
  - from:
    - source:
        requestPrincipals: ["*"]

An image displaying Authorization in Istio.

Match Conditions: Rules can match requests based on:

Field | Description
from | Source identity: service accounts, IPs, JWT principals
to | Operation match: HTTP methods, ports, paths
when | Additional conditions: headers, claims, IPs

Example: Allow DELETE only from admin-service:

rules:
- from:
  - source:
      # Istio principals use <trust-domain>/ns/<namespace>/sa/<service-account>, without a spiffe:// prefix
      principals: ["cluster.local/ns/coffee-shop/sa/admin-service"]
  to:
  - operation:
      methods: ["DELETE"]
      paths: ["/customers/*"]
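The when field is the one match condition not shown so far. As a minimal sketch, the policy below allows updates only for authenticated users whose JWT carries a specific claim. The claim name role, its value barista, and the policy name are hypothetical and depend on what your identity provider issues.

```yaml
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: baristas-can-update          # hypothetical name
  namespace: coffee-shop
spec:
  selector:
    matchLabels:
      app: customer-service
  action: ALLOW
  rules:
  - from:
    - source:
        requestPrincipals: ["*"]     # any request with a valid JWT
    to:
    - operation:
        methods: ["PUT", "PATCH"]
        paths: ["/customers/*"]
    when:
    - key: request.auth.claims[role] # assumed claim name
      values: ["barista"]            # assumed claim value
```

All three parts of a rule (from, to, when) must match for the rule to apply.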

Action Types in AuthorizationPolicy:

| Action | Purpose |
|--------|---------|
| ALLOW | Permit matching requests |
| DENY | Block matching requests |
| CUSTOM | Delegate evaluation to a custom extension |
| AUDIT | Log matching requests without enforcing access decisions |

Note: Istio evaluates policies in this order: CUSTOM → DENY → ALLOW
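Because DENY is evaluated before ALLOW, a DENY policy wins even when an ALLOW rule also matches the request. The following sketch, with a hypothetical policy name, blocks DELETE calls to customer-service from any workload outside the coffee-shop namespace. It assumes mTLS is enabled, since the source namespace is derived from the peer certificate.

```yaml
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: deny-external-delete         # hypothetical name
  namespace: coffee-shop
spec:
  selector:
    matchLabels:
      app: customer-service
  action: DENY
  rules:
  - from:
    - source:
        notNamespaces: ["coffee-shop"]   # any caller not in this namespace
    to:
    - operation:
        methods: ["DELETE"]
```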

Best Practices

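  • Start with a deny-by-default policy (an ALLOW policy with no rules, which matches nothing) and incrementally open access with targeted ALLOW rules. A catch-all DENY policy does not work for this, because DENY is evaluated before ALLOW.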
  • Assign dedicated ServiceAccounts per workload to ensure identity isolation.
  • Use STRICT mTLS once workloads are mesh-ready.
  • Combine PeerAuthentication, RequestAuthentication, and AuthorizationPolicy for zero-trust enforcement.
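One way to sketch the deny-by-default starting point is an ALLOW policy with an empty spec, followed by a narrow ALLOW rule per caller. The policy names and the order-service ServiceAccount are assumptions made for illustration.

```yaml
# Matches nothing, so every request in the namespace is denied
# unless another ALLOW policy permits it.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-nothing
  namespace: coffee-shop
spec: {}
---
# Open one path: only the order-service identity may call payment-service.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-order-to-payment       # hypothetical name
  namespace: coffee-shop
spec:
  selector:
    matchLabels:
      app: payment-service
  action: ALLOW
  rules:
  - from:
    - source:
        # assumes order-service runs under its own ServiceAccount
        principals: ["cluster.local/ns/coffee-shop/sa/order-service"]
```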

Observability

Observability is essential for understanding and operating microservices in production. Istio provides out-of-the-box observability by capturing telemetry at the network layer through sidecar proxies.

Istio enables deep insights across services by capturing:

  • Metrics: Quantitative measurements such as request latency or error rates.
  • Traces: End-to-end request flow across services.
  • Logs: Context-rich records for debugging.

These signals work together. For example, a spike in latency (metrics) leads you to a specific service call (trace), and the logs explain the failure.

Note: For more information, refer to Understanding Observability with OpenTelemetry and Coffee.

The coffee shop app has three microservices:

  • order-service
  • payment-service
  • inventory-service

Each service includes an injected Envoy sidecar that automatically collects and exposes telemetry.

Setup for Observability

Install Istio using the demo profile to enable full telemetry:

istioctl install --set profile=demo -y

This profile enables 100% trace sampling, which is ideal for development. In production, reduce the sampling rate (for example, to 1%) to limit tracing overhead.
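One way to lower the sampling rate without reinstalling is the Telemetry API. This is a minimal sketch scoped to the coffee-shop namespace; it assumes a tracing provider is already configured for the mesh, and the resource name is hypothetical. On older Istio releases the API version may be telemetry.istio.io/v1alpha1.

```yaml
apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: coffee-shop-sampling         # hypothetical name
  namespace: coffee-shop
spec:
  tracing:
  - randomSamplingPercentage: 1.0    # sample roughly 1 in 100 requests
```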

Envoy sidecars expose Prometheus scrape endpoints. Metrics can also be accessed via each pod’s Envoy admin dashboard.

An image displaying Observability in coffee pod using Istio, SigNoz, Prometheus, and Grafana

Tracing and Logs with SigNoz

SigNoz is an OpenTelemetry-compatible observability tool that integrates seamlessly with Istio:

helm install signoz signoz/signoz -n platform --create-namespace

You can use the SigNoz UI to:

  • Search for traces by service (e.g., order-service)
  • Visualize trace duration and latency
  • Correlate logs and spans

Refer to SigNoz Installation Guide for setup instructions.
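For traces to reach SigNoz, the sidecars need a tracing provider that points at its OpenTelemetry collector. The sketch below shows one possible wiring. The collector service name signoz-otel-collector and port 4317 are assumptions based on a Helm release named signoz in the platform namespace, so confirm them with kubectl get svc -n platform. The provider name signoz-otel and the Telemetry resource name are hypothetical.

```yaml
# Part 1: mesh configuration, applied with `istioctl install -f <file>`.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  profile: demo
  meshConfig:
    enableTracing: true
    extensionProviders:
    - name: signoz-otel                  # hypothetical provider name
      opentelemetry:
        service: signoz-otel-collector.platform.svc.cluster.local  # assumed service
        port: 4317                       # OTLP gRPC port (assumed)
---
# Part 2: a mesh-wide Telemetry resource, applied with `kubectl apply -f <file>`.
apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: trace-to-signoz                  # hypothetical name
  namespace: istio-system
spec:
  tracing:
  - providers:
    - name: signoz-otel
    randomSamplingPercentage: 100        # development setting; lower in production
```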

Optimization and Advanced Deployments

In large meshes, every sidecar receives service discovery updates for all mesh services. This can lead to:

  • Excessive configuration updates
  • Increased startup time for proxies

To limit this, use the Sidecar resource to restrict which services a workload can see.

An image displaying optimization of coffee app mesh

For example, a Sidecar resource can limit the egress configuration of the coffee-frontend workload so that its proxy only knows about order-service and payment-service in its own namespace.

apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: coffee-frontend
  namespace: coffee-shop
spec:
  workloadSelector:
    labels:
      app: coffee-frontend
  egress:
  - hosts:
    # Hosts use the namespace/dnsName format; "./" means the Sidecar's own namespace
    - "./order-service.coffee-shop.svc.cluster.local"
    - "./payment-service.coffee-shop.svc.cluster.local"

This configuration limits the coffee-frontend proxy to configuration for order-service and payment-service, which reduces its memory footprint and the number of configuration updates it receives.
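The same idea can be applied to a whole namespace. As a sketch, a Sidecar without a workloadSelector acts as the default for every workload in coffee-shop and restricts visibility to the local namespace plus the Istio control plane namespace.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default
  namespace: coffee-shop
spec:
  egress:
  - hosts:
    - "./*"              # all services in the coffee-shop namespace
    - "istio-system/*"   # control plane and shared gateways
```

A workload-specific Sidecar such as coffee-frontend takes precedence over this namespace default.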

Multi-Cluster Deployments

Deploying Istio across multiple Kubernetes clusters offers benefits such as high availability, failover capabilities, and organizational separation.

  • Network Models:

    • Single Network: Pods across different clusters can communicate directly.
    • Multiple Networks: East-west gateways are used to facilitate communication between clusters.
      An image displaying multiple networks connected by East-west gateway.
  • Control Plane Models:

    • Single Control Plane: A single Istiod instance manages all clusters. While simpler, it represents a single point of failure.
    • Per-Cluster Control Plane: Each cluster has its own Istiod instance, providing better high availability and isolation.
  • Mesh Models:

    • Single Mesh: A unified trust domain and configuration across all clusters.
    • Multi-Mesh Federation: Separate meshes can share trust bundles (root certificates), define shared ServiceEntry resources, and apply AuthorizationPolicy for secure cross-mesh communication.
  • Tenancy Models:

    • Soft Tenancy: Achieves isolation at the namespace level.
    • Hard Tenancy: Provides isolation at the cluster level with separate meshes.
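These models are selected at install time. As a rough sketch, each cluster in a single mesh with per-cluster control planes and multiple networks would be installed with values like the following. The mesh, cluster, and network names are hypothetical; each cluster gets its own clusterName and network while sharing the same meshID.

```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    global:
      meshID: coffee-mesh            # hypothetical; identical in every cluster
      multiCluster:
        clusterName: cluster-west    # hypothetical; unique per cluster
      network: network-west          # hypothetical; unique per network
```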

Locality-Aware Load Balancing:

Use localityLbSetting to steer traffic based on geographic proximity (region, zone, sub-zone).

Failover Example: If all endpoints in us-west are unavailable, traffic fails over to us-east.

trafficPolicy:
  loadBalancer:
    localityLbSetting:
      enabled: true
      failover:
      - from: us-west
        to: us-east
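In context, the locality settings live under trafficPolicy.loadBalancer in a DestinationRule, and failover only triggers when outlier detection is configured so that Envoy can tell which endpoints are unhealthy. The following is a minimal sketch; the resource name and the outlier detection thresholds are illustrative assumptions, not tuned values.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: payment-service-locality     # hypothetical name
  namespace: coffee-shop
spec:
  host: payment-service.coffee-shop.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        failover:
        - from: us-west
          to: us-east
    outlierDetection:                # required for locality failover to take effect
      consecutive5xxErrors: 5        # illustrative threshold
      interval: 10s
      baseEjectionTime: 30s
```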

Weighted Distribution: This routes 50% of the traffic to local zone us-west1-a, 30% to the neighboring zone, and 20% to a remote zone.

trafficPolicy:
  loadBalancer:
    localityLbSetting:
      enabled: true
      distribute:
      - from: "us-west1/us-west1-a/*"
        to:
          "us-west1/us-west1-a/*": 50
          "us-west1/us-west1-b/*": 30
          "us-east1/us-east1-a/*": 20

Wrapping Up: Istio as Your Mesh Barista

Istio helps transform your distributed microservices from a tangle of complexity into a well-orchestrated system, whether you need fine-grained traffic routing, observability across services, security enforced through mTLS, or a mesh extended with WebAssembly.

Summary image of the coffee shop with WebAssembly.

Like a well-run coffee shop, every component of your system needs to collaborate in real time. Istio acts as the operations manager behind the scenes, ensuring communication flows smoothly, issues are detected early, and only trusted interactions are allowed.
