Istio Service Mesh

The service mesh takes on the responsibility of making service communication resilient to failures by implementing capabilities like retries, timeouts, and circuit breakers. It also copes with evolving infrastructure topologies through service discovery, adaptive and zone-aware load balancing, and health checking.

Since all of the traffic flows through the mesh, operators can control and direct traffic explicitly. For example, if we want to deploy a new version of our application, we may want to expose it to only a small fraction, say 1%, of live traffic. With the service mesh in place, we have the power to do that. The counterpart of control is understanding the mesh’s current behavior. Because the traffic flows through the mesh, we can capture detailed signals about the behavior of the network by tracking metrics such as request spikes, latency, throughput, and failures. We can use this telemetry to paint a picture of what’s actually happening in our system. Lastly, since the service mesh controls both ends of the network communication between applications, it can enforce strong security like transport-level encryption with mutual TLS.

The service mesh provides all of these capabilities to service operators with few or no application code changes, dependencies, or intrusions. Some of the capabilities require minor cooperation from the application code, but we can avoid large, complicated library dependencies. With a service mesh, it doesn’t matter what application framework or programming language you’ve used to build your application; these capabilities are implemented consistently and correctly. They allow our service teams to move quickly, safely, and confidently when implementing and delivering changes to their systems, so they can test their hypotheses and deliver value.
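
To make the 1% traffic example from earlier concrete, here is a minimal Python sketch of weighted routing, the general idea behind sending a small share of requests to a new version. It is an illustration only, not how Istio implements routing: in a real mesh the split is declared in configuration and carried out by the proxies, not written in application code. The names `ROUTES`, `pick_destination`, and `simulate` are hypothetical and chosen for this example.

```python
import random
from collections import Counter

# Hypothetical route table: destination version -> weight in percent.
# "v2" stands for the new version that should see about 1% of live traffic.
ROUTES = {"v1": 99, "v2": 1}


def pick_destination(routes, rng):
    """Pick one destination with probability proportional to its weight."""
    names = list(routes)
    weights = [routes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def simulate(routes, requests, seed=0):
    """Route `requests` simulated requests and count hits per destination."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    return Counter(pick_destination(routes, rng) for _ in range(requests))


if __name__ == "__main__":
    total = 100_000
    counts = simulate(ROUTES, total)
    for name, count in sorted(counts.items()):
        print(f"{name}: {count / total:.2%} of requests")
```

Because each request is routed independently, the observed share for `v2` hovers around 1% rather than matching it exactly. Raising the weight step by step is how a gradual rollout would proceed under this model.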