
From sidecars to sidecarless: Tracing the evolution of service mesh technologies with Istio and Cilium

22.5.2024 | 10 min read

Ever wondered how the technology that seamlessly manages microservices traffic evolved from early implementations to lean, kernel-level solutions? Let's dive into the fascinating journey of service meshes, from Linkerd 1.x to the cutting-edge technologies Istio Ambient and Cilium.

Running microservices-based architectures, especially in cloud-native environments, presents numerous challenges. Issues such as traffic management, resilience, observability, security, and access control are critical to address. This is where the concept of a service mesh comes into play, providing essential capabilities that translate into significant business value.

A brief history

The emergence of the first "service mesh" marked a significant milestone in the evolution of microservices architectures. Linkerd 1.x, which reached its 1.0 release in April 2017, paved the way for managing service-to-service communication in complex distributed systems. Despite its groundbreaking capabilities, Linkerd 1.x encountered several challenges that shaped the subsequent evolution of service mesh technology.

Linkerd 1.x was built in Scala on top of Finagle, relying on the JVM as its runtime environment. While the JVM offered flexibility and compatibility, its memory-intensive nature made the proxy challenging to size properly, leading to potential resource inefficiencies. Deployed as a Kubernetes DaemonSet, Linkerd 1.x operated on a one-proxy-per-node basis. While this approach provided uniformity and simplicity in deployment, it also introduced the noisy-neighbor problem: with all workloads on a node sharing a single proxy, resource contention and performance bottlenecks could arise, impacting overall system performance.

The first sidecar (proxy)

As service mesh technologies evolved to address challenges like the noisy-neighbor issue, a pivotal shift occurred towards sidecar-based architectures. This marked a significant departure from previous deployment models, offering a more granular and efficient approach to managing service-to-service communication.

Moving networking functionality closer to the application with sidecar proxies gave each service its own networking counterpart, enabling smoother communication between services while mitigating the impact of noisy neighbors. Sidecar proxies operated as transparent intermediaries, intercepting and managing traffic between services without requiring changes to the application code. By being part of the application's lifecycle, sidecar proxies were deployed and managed consistently alongside application instances. And because each application got its own single-tenant proxy, the blast radius of potential failures shrank and resources could be sized more precisely.
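To make the deployment model concrete, here is a minimal, purely illustrative Pod manifest with a hand-written Envoy sidecar next to the application container. Names, images, and the referenced ConfigMap are placeholders, and in a real mesh such as Istio this container is injected automatically rather than written by hand:

apiVersion: v1
kind: Pod
metadata:
  name: my-app                            # hypothetical workload
  labels:
    app: my-app
spec:
  containers:
  - name: app                             # the actual application
    image: ghcr.io/example/my-app:1.0     # placeholder image
    ports:
    - containerPort: 8080
  - name: envoy-sidecar                   # proxy sharing the pod's network namespace
    image: envoyproxy/envoy:v1.30-latest  # placeholder tag
    args: ["envoy", "-c", "/etc/envoy/envoy.yaml"]  # start Envoy with the mounted config
    volumeMounts:
    - name: envoy-config
      mountPath: /etc/envoy
  volumes:
  - name: envoy-config
    configMap:
      name: my-app-envoy-config           # hypothetical ConfigMap holding envoy.yaml

Because both containers share the pod's network namespace, the proxy can transparently intercept the application's traffic without the application knowing it is there.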

While sidecar proxies became the de facto standard for service meshes, they introduced their own set of challenges and considerations:

  • Increased costs: Deploying sidecar proxies incurred additional resource overhead and operational costs, particularly in environments with a large number of microservices.
  • Race conditions: Because the application and its sidecar start and stop independently, ordering problems could occur, for example an application container sending traffic before its proxy was ready, leading to failures and potential performance bottlenecks.
  • Maintenance and upgrades: Upgrading sidecar proxies could be challenging, as every workload pod had to be restarted to pick up the new proxy version, potentially impacting service availability and performance.

Introducing the CNI

As service mesh technologies continue to evolve, questions arise about the most efficient and cost-effective way to provide essential capabilities such as access control and traffic management. One alternative approach gaining traction is leveraging Container Network Interface (CNI) solutions, which operate at lower network layers than traditional HTTP proxies.

CNIs operate at layers 3 and 4 of the OSI model, enabling them to handle network-related tasks such as routing, packet filtering, and load balancing. This lower-level integration offers potential performance and efficiency benefits compared to HTTP proxies operating at layer 7. In Kubernetes environments, CNI plugins play a crucial role in enabling networking between containers and pods: they configure network interfaces, assign IP addresses, and enforce network policies that control communication between pods.
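As an illustration of the layer-3/4 enforcement a CNI provides, here is a minimal standard Kubernetes NetworkPolicy (namespace and labels are hypothetical). The API is part of Kubernetes itself, but it is the installed CNI plugin that actually enforces it:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend   # hypothetical policy name
  namespace: demo
spec:
  podSelector:
    matchLabels:
      role: backend                 # the pods this policy protects
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend            # only frontend pods may connect
    ports:
    - protocol: TCP
      port: 8080                    # and only on TCP port 8080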

Understanding eBPF

eBPF (extended Berkeley Packet Filter) represents a groundbreaking advancement in kernel-level technology, offering powerful capabilities for network management and performance optimization. eBPF builds upon the foundation of the Berkeley Packet Filter (BPF), a kernel technology originally developed for packet filtering. While BPF provided a solid framework for filtering network packets, eBPF expands its capabilities to include dynamic code execution and event-driven processing.

Unlike its predecessor, which was essentially limited to packet filtering, eBPF programs can be written and loaded from user space and then run safely inside the kernel, enabling a wide range of applications beyond packet filtering. This versatility has turned eBPF into a general-purpose system-level extension technology with applications in networking, security, and observability.

eBPF programs are event-driven and react to specific trigger points within the kernel or application processes. These trigger points, known as hook points, include network events such as packet reception or transmission. Although they are loaded from user space, eBPF programs execute in a sandboxed virtual machine within the kernel, allowing for tight integration with kernel functions and data structures. This architecture is what enables eBPF to achieve its high levels of performance and efficiency.

Exploring service mesh in the kernel

The idea of embedding service mesh capabilities directly within the kernel opens up new possibilities for optimizing network management and communication in microservices architectures. While sidecar proxies have been instrumental in enabling service mesh functionality, they introduce overhead and complexity due to their deployment model and resource requirements. This prompts the exploration of alternative approaches such as integrating service mesh functionality directly into the kernel. kube-proxy, the internal service proxy used in Kubernetes clusters, can be considered a very early form of service mesh functionality within the kernel. However, its reliance on iptables rules for traffic management limits its capabilities compared to modern service mesh solutions.
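For context, kube-proxy's behavior is set through node-level configuration rather than per-request policy. A minimal KubeProxyConfiguration sketch (values are placeholders) shows how the iptables-based mode is selected:

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "iptables"        # classic default; "ipvs" is an alternative in-kernel mode
iptables:
  syncPeriod: 30s       # how often the generated iptables rules are re-synced
  minSyncPeriod: 1s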

Cilium, the eBPF-powered Kubernetes CNI

Cilium, powered by eBPF, represents a groundbreaking advancement in Kubernetes networking and service mesh solutions. Developed by Isovalent (acquired by Cisco in 2024), Cilium offers a wide range of features and capabilities for managing network communication and security in Kubernetes environments.

Cilium leverages eBPF to provide high-performance, low-overhead networking in Kubernetes clusters. By operating at the kernel level, it offers efficient packet processing and fine-grained control over network traffic. As a CNI, Cilium covers a comprehensive set of networking tasks, including layer-4 load balancing, BGP routing, egress control, and even replacing kube-proxy entirely. This versatility makes Cilium a powerful choice for networking in Kubernetes environments. Beyond pure networking, Cilium provides robust observability and security features, including metrics collection, distributed tracing, service maps, encryption, and network policies, enhancing visibility and control over cluster operations.

In mid-2022, Cilium expanded its capabilities to include service mesh functionality, marking a significant milestone in its development. The introduction of Cilium Service Mesh represents a shift towards sidecar-less service mesh architectures, leveraging the power of eBPF for transparent and efficient communication between services.
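To give a feel for how these capabilities are switched on, here is a hedged sketch of a values file for the public Cilium Helm chart. The key names below exist in recent chart versions, but exact names, defaults, and required fields vary between releases, so treat it as illustrative rather than copy-paste ready:

# values.yaml for the Cilium Helm chart (illustrative)
kubeProxyReplacement: true     # let Cilium's eBPF datapath take over kube-proxy's job
k8sServiceHost: 10.0.0.1       # API server address, placeholder
k8sServicePort: 6443

hubble:
  enabled: true                # flow-level observability
  relay:
    enabled: true
  ui:
    enabled: true              # service map UI

encryption:
  enabled: true                # transparent pod-to-pod encryption
  type: wireguard

bgpControlPlane:
  enabled: true                # BGP routing integration

With kube-proxy replacement enabled, service load balancing is programmed in eBPF rather than in iptables, which is exactly the in-kernel approach discussed above.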

Exploring sidecarless service mesh

A sidecar-less service mesh represents a paradigm shift in how organizations approach microservices communication and management, offering a more streamlined and efficient solution compared to traditional sidecar-based approaches.

Unlike traditional service mesh architectures that rely on sidecar proxies for communication between services, a sidecar-less service mesh operates without the need for additional proxy containers. Instead, it leverages technologies such as eBPF to implement networking functionality directly within the kernel environment. By leveraging eBPF at the kernel level, a sidecar-less service mesh offers full transparency and efficiency in service communication. Without the overhead of sidecar containers, services can communicate directly through optimized networking paths, reducing latency and resource consumption.

It doesn't work without sidecar proxies

While sidecar-less service mesh architectures offer significant benefits in terms of efficiency and resource optimization, they come with inherent limitations, particularly in addressing HTTP-specific functionalities that are traditionally handled by layer 7 proxies.

Functionality | Networking layer
Traffic management (load balancing, retries, timeouts, circuit breaking, JWT validation, traffic splitting, mirroring, ...) | L7
Request-level authorization (headers, JWT claims, path, rate limiting, ...) | L7
Request observability (request count, 5xx status codes, latency, sizes) | L7
mTLS, protocol/port authorization, source/destination authorization | L4
Connection observability | L4
Network policies | L4/L3

While sidecar-less implementations excel at handling tasks at layers 3 and 4 of the OSI model, they fall short in addressing the intricate requirements of layer 7 functionalities. This creates a coverage gap, leaving critical aspects of microservices communication unaddressed. Organizations opting for sidecar-less service mesh architectures must carefully consider the trade-offs involved. While they benefit from reduced resource overhead and simplified deployment, they may sacrifice certain HTTP-specific functionalities and granular control over application-layer communication.
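To make that gap concrete, the following Istio VirtualService sketch (host names and subsets are hypothetical, and the referenced subsets would be defined in a separate DestinationRule) shows the kind of layer-7 behavior, such as header-based routing, retries, and timeouts, that only an HTTP-aware proxy can deliver:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews-routing          # hypothetical name
spec:
  hosts:
  - reviews.default.svc.cluster.local
  http:
  - match:
    - headers:
        x-canary:                # route requests with this header to v2
          exact: "true"
    route:
    - destination:
        host: reviews.default.svc.cluster.local
        subset: v2               # subsets come from a DestinationRule (not shown)
  - route:
    - destination:
        host: reviews.default.svc.cluster.local
        subset: v1
    retries:
      attempts: 3                # automatic retries require parsing HTTP semantics
      perTryTimeout: 2s
    timeout: 5s                  # per-request timeout, not per-connection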

Exploring Istio Ambient – the "hybrid"

Istio Ambient represents a significant evolution in Istio's architecture, introducing a hybrid approach that combines elements of the traditional sidecar-based service mesh with sidecar-less principles. Ambient mode marks a pivotal shift in Istio's design philosophy, offering a simplified operational model built around a lightweight shared node proxy. The recent promotion of Istio Ambient to Beta signals its readiness for wider adoption and shows Istio's commitment to embracing innovative approaches in service mesh architecture.

Istio Ambient comprises several key components that collectively enable its hybrid service mesh architecture:

  • ztunnel: Serving as the backbone of Istio Ambient's data plane, the ztunnel ("zero-trust tunnel") runs as a per-node proxy that secures communication between workloads in the mesh, providing mTLS, authentication, and layer-4 telemetry.
  • Istio CNI: In Ambient mode, the Istio CNI node agent transparently redirects workload traffic to the local ztunnel, simplifying network management without touching the application pods.
  • Waypoint proxy: A Layer 7 Envoy proxy, deployed per namespace or per service account rather than per pod, that complements the ztunnel with HTTP-specific capabilities when they are needed.

The hybrid nature of Istio Ambient is characterized by its ability to dynamically adapt to workload requirements, seamlessly transitioning between layer 4 and layer 7 functionalities as needed.
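As a rough sketch of what adoption looks like in practice (namespace and resource names are hypothetical, and the waypoint manifest mirrors what istioctl generates, so details may differ between Istio versions): enrolling workloads into ambient mode is a matter of labeling their namespace, and layer-7 processing is added later by deploying a waypoint proxy.

# Enroll all workloads in this namespace into ambient mode (ztunnel, L4 only)
apiVersion: v1
kind: Namespace
metadata:
  name: demo
  labels:
    istio.io/dataplane-mode: ambient
---
# Add L7 processing on demand by deploying a waypoint proxy for the namespace;
# workloads opt in to it via the istio.io/use-waypoint label (not shown)
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: waypoint
  namespace: demo
  labels:
    istio.io/waypoint-for: service   # handle traffic addressed to services
spec:
  gatewayClassName: istio-waypoint   # managed by Istio as a waypoint proxy
  listeners:
  - name: mesh
    port: 15008                      # HBONE tunnel port used inside the mesh
    protocol: HBONE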

Working with, not against, each other

While people tend to believe that you must choose between a CNI such as Cilium and a service mesh such as Istio Ambient, in reality you should use them together. Cilium excels at addressing lower-level networking requirements, providing robust solutions for traffic management, network policy enforcement, and security. By leveraging Cilium's capabilities, organizations can establish granular control over network traffic and ensure compliance with security policies.

For example, Cilium's network policies enable organizations to define access control rules based on criteria such as endpoint labels, ports, and protocols. These policies offer a flexible and scalable approach to network segmentation and isolation, ensuring that only authorized services can communicate with each other.

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "l3-rule"
spec:
  endpointSelector:
    matchLabels:
      role: backend
  ingress:
  - fromEndpoints:
    - matchLabels:
        role: frontend
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "l4-rule"
spec:
  endpointSelector:
    matchLabels:
      app: myService
  egress:
    - toPorts:
      - ports:
        - port: "80"
          protocol: TCP

Cilium also offers L7 matching capabilities, but for those scenarios it relies on an Envoy proxy embedded in the Cilium agent (or deployed standalone per node).

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "l7-rule"
spec:
  endpointSelector:
    matchLabels:
      app: myService
  ingress:
  - toPorts:
    - ports:
      - port: '80'
        protocol: TCP
      rules:
        http:
        - method: GET
          path: "/path1$"
        - method: PUT
          path: "/path2$"
          headers:
          - 'X-My-Header: true'

Istio's authorization policies, in turn, add a further layer of security by enforcing access control at the application layer. By defining policies based on SPIFFE identities (optionally issued through SPIRE), organizations can restrict access to sensitive resources and prevent unauthorized interactions between services.

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: httpbin
  namespace: foo
spec:
  action: DENY
  rules:                 # a request matching any rule below is denied
  - from:
    - source:
        principals:      # SPIFFE identity of the calling workload
        - cluster.local/ns/default/sa/my-service-account
  - to:
    - operation:
        methods: ["POST"]
        ports: ["8080"]

Conclusion

The evolution of service meshes represents a transformative journey in microservices networking, offering unprecedented capabilities to modernize and optimize cloud-native architectures. As organizations embark on this journey, it's essential to approach tool selection with careful consideration, recognizing that there is no one-size-fits-all solution.

In navigating the complexities of cloud-native networking, I advocate for a holistic approach that embraces diversity in tooling. For organizations seeking comprehensive networking solutions, the integration of Cilium as a CNI and Istio Ambient as a service mesh exemplifies the power of complementary technologies. Cilium excels at addressing low-level networking requirements, while Istio Ambient provides advanced service mesh capabilities, creating a synergistic relationship that enhances overall network resilience and scalability. However, the decision of which tools to pick ultimately depends on a myriad of factors, including workload characteristics, performance requirements, compliance standards, and operational preferences.
