Implementation of SRv6 uSID in Telefónica VIVO’s Infrastructure
24/01/2024
By Nelson Jose dos Santos Junior, Telecom Specialist
Introduction
Technological evolution does not stop, and telecommunications networks are at the forefront of this transformation. VIVO, one of the giants of the industry, is adopting innovative strategies to optimize its operations and improve the user experience. In this blog post, we will explore the implementation of SRv6 uSID, a revolutionary approach, in Telefónica VIVO’s infrastructure.
What is Segment Routing over IPv6 (SRv6)
Segment Routing over IPv6 is a routing technology that promises to simplify network architecture while providing flexibility and efficiency. Unlike traditional methods, which rely on complex routing tables, Segment Routing over IPv6, also called SRv6, encodes the entire instruction state into the IPv6 packet header, making the fabric stateless. This helps achieve better scalability and hardware efficiency. SRv6 is built on top of IPv6, simplifying the protocol stack as MPLS is no longer required.
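To make the idea of carrying the instructions in the packet header more concrete, here is a minimal Python sketch that builds the bytes of a Segment Routing Header (SRH) with Full SIDs, following the layout defined in RFC 8754. The function name build_srh and the 2001:db8:: example addresses are assumptions for illustration only; this is not VIVO’s configuration, and optional TLVs are left out.

```python
import ipaddress
import struct

SRH_ROUTING_TYPE = 4  # Routing Type assigned to the Segment Routing Header (RFC 8754)


def build_srh(segments, next_header=41):
    """Build an SRH without TLVs (hypothetical helper).

    segments: SIDs in the order the packet should visit them.
    next_header: 41 means the payload is an encapsulated IPv6 packet.
    """
    if not segments:
        raise ValueError("at least one segment is required")
    n = len(segments)
    hdr_ext_len = 2 * n      # length in 8-octet units, excluding the first 8 octets
    segments_left = n - 1    # index of the currently active segment
    last_entry = n - 1       # index of the last element of the segment list
    fixed = struct.pack(
        "!BBBBBBH",
        next_header, hdr_ext_len, SRH_ROUTING_TYPE,
        segments_left, last_entry,
        0,  # flags
        0,  # tag
    )
    # The segment list is stored in reverse order: Segment List[0] is the last hop.
    sid_list = b"".join(
        ipaddress.IPv6Address(sid).packed for sid in reversed(segments)
    )
    return fixed + sid_list


srh = build_srh(["2001:db8::1", "2001:db8::2"])
print(len(srh))  # 8 fixed octets + 2 * 16 octets of SIDs
```

Because the whole path is described in the header, transit routers do not need to keep per-flow or per-path state.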
What is EVPN (Ethernet VPN)
EVPN, or Ethernet Virtual Private Network, is a network control plane protocol designed to provide layer 2 network connectivity over a shared network infrastructure. The main function of EVPN is to provide VPN (Virtual Private Network) services for Ethernet networks.
By combining EVPN with SRv6, you can create a solution that efficiently integrates Layer 3 routing with Layer 2 connectivity over the SRv6 infrastructure. SRv6 provides a segment-based forwarding method, which is useful for optimizing IP traffic routing. When combined with EVPN, which is capable of handling Layer 2 requirements, you get a comprehensive solution for networks that require both Layer 2 and Layer 3 service functionalities.
Benefits for Telefónica VIVO
Routing Efficiency: Reduces the complexity of routing tables, improving efficiency in data transmission. SRv6 enables a simpler and more scalable network architecture, using the concept of source routing. It removes state from the network and reduces network complexity, making it easier to deploy and manage.
Low Latency: The ability to steer traffic more directly reduces latency, improving the quality of services offered.
Greater Traffic Control: Telefónica VIVO gains more control over the traffic route, allowing for a more efficient allocation of resources.
Flexibility in Network Programmability: SRv6 provides a programmable network structure, allowing for more flexibility in defining and adapting network behavior. This programmability can be beneficial for customizing the network according to the specific needs of the business, and for adapting quickly to the growing demand of new market requirements.
Improved Network Resiliency: SRv6 enables topology independent fast reroute (sub-50ms) and offers mechanisms to create highly resilient network paths. This improves the reliability and overall availability of the network infrastructure.
Load Balancing: SRv6 provides native, optimal load balancing, whereas load balancing remains harder in MPLS. In MPLS, the entropy for Equal-Cost Multi-Path (ECMP) selection sits in the inner IP packet, so routers must parse past the MPLS label stack to reach the IP header used for hashing.
In SRv6, the ingress PE computes a hash on the customer packet and writes the result to the Flow Label field of the outer IPv6 header it adds. The rest of the network uses this flow label to perform ECMP selection with just a glance at the outer header, without having to delve into the encapsulation layers.
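The flow label mechanism can be illustrated with a short Python sketch. The hash (SHA-256 truncated to the 20 bits of the Flow Label field) and the helper names flow_label and pick_ecmp_member are assumptions for illustration; real routers use vendor-specific hardware hash functions.

```python
import hashlib
import ipaddress
import struct


def flow_label(src, dst, proto, sport, dport):
    """Derive a 20-bit flow label from the customer 5-tuple (illustrative hash)."""
    key = (
        ipaddress.ip_address(src).packed
        + ipaddress.ip_address(dst).packed
        + struct.pack("!BHH", proto, sport, dport)
    )
    digest = hashlib.sha256(key).digest()
    # The IPv6 Flow Label field is 20 bits wide.
    return int.from_bytes(digest[:4], "big") & 0xFFFFF


def pick_ecmp_member(label, n_paths):
    """Transit node: choose an equal-cost path by looking only at the outer header."""
    return label % n_paths


# Ingress PE: hash the customer packet once and write it to the outer header.
label = flow_label("10.0.0.1", "10.0.0.2", 6, 49152, 443)
# Transit routers: no need to parse the inner packet.
print(pick_ecmp_member(label, 4))
```

All packets of the same customer flow get the same label and therefore follow the same path, while different flows are spread across the available paths.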
Challenges and Solutions
Implementing such a significant change is not an easy task, in either the technical or the procedural sphere: major technological changes usually require restructured workflows, team training, and sustained commitment.
In this blog, we will discuss potential obstacles and offer practical solutions, highlighting Telefónica VIVO’s commitment to operational excellence.
- Journey:
In the sections below, I will briefly describe the journey taken to achieve the deployment of SRv6 uSID in VIVO’s network.
Convinced of the need to deploy Segment Routing in the network, we started many internal discussions about implementation strategies, namely:
- Deploy SR-MPLS initially and migrate to SRv6 later, as future scalability and network modernization needs arise;
- Implement SRv6 with no intermediate phase.
However, changing technology twice in a provider the size of Telefónica VIVO is very traumatic and costly, so we decided to skip a phase and go straight to SRv6. At that time there were many doubts about the maturity of the technology for a massive deployment in the network, but evolving the network directly to its end state was much more advantageous. It allowed us to avoid a double migration, stranded investment in SR-MPLS if the move to SRv6 came quickly, repeated network reconfiguration and duplicated effort, a longer overall process and, above all, the risk of SR-MPLS becoming obsolete more quickly as SRv6 advances.
Given these points, we began a long period of discussion with architects from Cisco, Huawei, and Nokia to ensure that the technology was mature enough for a massive deployment.
- Methodology used
The work methodology came from the Agile model: every day, first thing in the morning, we held a short meeting with the entire project team to align quickly and continuously, and once a week we met for a whole day in the Telefónica VIVO laboratory to share the week’s achievements and discuss the problems encountered in more depth.
The figure below describes how we planned the entire development of the work.
- The Journey
This chapter presents the formal trajectory of the technological evolution of VIVO’s transport network towards Segment Routing over IPv6 (SRv6), starting with the approval of a solution based on FULL Segment Identifiers (SIDs) and evolving to an architecture that incorporates Micro-Segment Identifiers (uSID). uSID provides the ability to compress the SRv6 header, so that a single IPv6 header can carry up to 6 uSIDs in the destination IPv6 address, with no additional headers. This achieves better performance than with VXLAN or MPLS. uSID has become the industry standard solution and is backed by all vendors.
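As a rough illustration of how several uSIDs fit into one destination address, the Python sketch below assumes a 32-bit uSID block followed by 16-bit uSIDs, a layout that leaves room for exactly six uSIDs in 128 bits. The block fcbb:bb00::/32, the uSID values, and the helper names encode_usid_carrier and shift_carrier are hypothetical and are not taken from VIVO’s addressing plan.

```python
import ipaddress
from typing import Sequence

BLOCK_BITS = 32  # assumption: 32-bit uSID block
USID_BITS = 16   # assumption: 16-bit uSIDs
REST_BITS = 128 - BLOCK_BITS
CAPACITY = REST_BITS // USID_BITS  # 6 uSIDs per carrier with these sizes


def encode_usid_carrier(block: str, usids: Sequence[int]) -> ipaddress.IPv6Address:
    """Pack a list of uSIDs into one IPv6 destination address (uSID carrier)."""
    if len(usids) > CAPACITY:
        raise ValueError(f"a carrier holds at most {CAPACITY} uSIDs")
    block_mask = ((1 << BLOCK_BITS) - 1) << REST_BITS
    value = int(ipaddress.IPv6Address(block)) & block_mask
    shift = REST_BITS
    for usid in usids:
        if not 0 < usid < (1 << USID_BITS):
            raise ValueError("uSID out of range (0 marks the end of the carrier)")
        shift -= USID_BITS
        value |= usid << shift
    return ipaddress.IPv6Address(value)


def shift_carrier(carrier: ipaddress.IPv6Address) -> ipaddress.IPv6Address:
    """Consume the active uSID: shift the remaining uSIDs left, zero-fill the tail."""
    value = int(carrier)
    block = value >> REST_BITS << REST_BITS
    rest = (value << USID_BITS) & ((1 << REST_BITS) - 1)
    return ipaddress.IPv6Address(block | rest)


carrier = encode_usid_carrier("fcbb:bb00::", [0x1, 0x2, 0x3])
print(carrier)                 # fcbb:bb00:1:2:3::
print(shift_carrier(carrier))  # fcbb:bb00:2:3::
```

Each node that owns the active uSID shifts the address and forwards the packet based on the new destination, so the path is followed hop by hop using ordinary IPv6 longest-prefix-match lookups.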
Initially, we chose to conduct extensive testing to assess the feasibility of incorporating the essential functionalities required to serve the services currently operating on VIVO’s network. The intention is to ensure that, if it is possible to meet all the requirements of the current network, we can later introduce new functionalities with the aim of improving the transport system. The approach adopted aims to carry out a gradual transition, considering that the replacement of MPLS with another technology is a delicate process. Before we head directly to the end goal, it is imperative to ensure that all existing services work in a consistent manner without disruption to customers. Therefore, we have prepared a comprehensive list of services and service scenarios, highlighting the application of SRv6 technology, as a measure to ensure the feasibility and success of the implementation.
The figure below shows the main activities carried out for the planning and execution of the FOA (First Office Application).
Homologation
Phase 1: Initial Homologation in Full SIDs
- Goals
- Evaluate the equipment compatibility and technical feasibility of the SRv6 solution.
- Approve the implementation of SRv6 with the use of FULL SIDs to ensure the understanding of the technology and its applicability in the existing infrastructure.
- Main Activities
- Equipment Selection: Specification, selection of network devices that support SRv6 and assembly of the test setup.
- Test Environment: Setting up a representative lab environment for simulation and validation of network scenarios.
- Homologation Tests: Execution of rigorous tests to verify the stability, performance and interoperability of the SRv6 solution.
- Expected Outcomes
- Validation of the SRv6 solution with FULL SIDs.
- Technical documentation of configurations, implementation procedures, and test results.
Phase 2: Homologation and Deployment in uSID (Micro-SID)
- Goals
- Integrate the uSID solution, aiming at the optimization of the address space and the simplification of the routing tables.
- Achieve greater granularity and flexibility in route management.
- Main Activities
- Implementation Planning: Development of a detailed plan for the introduction of uSID into the network.
- Equipment Update: Ensure that all network devices are updated to support uSID.
- uSID Configuration and Validation: Implementation of uSIDs in a controlled setup in the Laboratory of the transport network, followed by infrastructure and service validation tests.
Case Study: Implementation in Practice
After the approval of the SRv6 solution, the steps for deployment include the following technical and formal activities:
- Solution Workshop for the Areas Involved:
- Conducting a workshop for all the business areas involved, in order to provide training and deep understanding of the SRv6 solution.
- Documentation Update and Presentation (Aggregation Cluster):
- Review and update of documentation and technical presentation in three levels of hierarchical detail (Aggregation Cluster)
- Software Upgrade of the Equipment of this Layer (Aggregation Cluster):
- Execution of the software upgrade process on the equipment belonging to the Aggregation Cluster layers.
- Security Policy Update:
- Review and update security policies at the layers involved.
- Adequacy of Infrastructure to Support uSID.
- Service Migration:
- Ring 1 Topology:
- Migration of L3VPN and L2VPN services – (L3VPN DualStack SRv6+MPLS).
- Migration of L3VPN and L2VPN services – (L2VPN VPLS/VLL to EVPNL2).
- Ring 2 Topology:
- Migration of L3VPN and L2VPN services – (L3VPN DualStack SRv6+MPLS).
- Migration of L3VPN and L2VPN services – (L2VPN VPLS/VLL to EVPNL2).
- Ring 3 and 4 Topologies:
- Migration of L3VPN and L2VPN services – (L3VPN DualStack SRv6+MPLS).
- Migration of L3VPN and L2VPN services – (L2VPN VPLS/VLL to EVPNL2).
These activities represent a technical and formal framework for the successful implementation of the SRv6 solution after the homologation phase.
- Future Prospects:
The second phase of the project at VIVO consists of integrating SRv6 technology with the SDN and SDTN platforms. In addition to this integration, we will develop traffic engineering case studies and work on products using Flex-Algo. We will also look at extending SRv6 uSID to the Data Center and adding Integrated Performance Measurement (IPM) to have better visibility into the SLA offered to our customers.
- Lessons learned:
Below we can see important points that were solved during the validation in the laboratory and may be useful to other operators deploying SRv6:
- The Locator of a given network domain must always be consistent with the configured SRv6 blocks; otherwise, there will be no SRv6 programming for the service;
- It should be noted that Cisco supports a Full-SID (128-bit structure consisting of 40-bit SRv6 block, 24-bit node identifier, 16-bit function identifier, arguments, padding) or uSID. In a multi-vendor scenario with Full-SID, all instantiated SIDs must be consistent with the same SRv6 block.
- Huawei always uses an IPv6 next-hop for the BGP vpnv4, vpnv6 and EVPN address families in network scenarios with SRv6 with or without compression. We had to bring up all the BGP address families with a new IPv6 session on the Route Reflectors.
- We discovered bugs in Nokia’s 16.0.R18 release that caused major impacts on the equipment, so that version was abandoned for the VIVO project;
- We discovered bugs in Huawei’s V800R021 release: after enabling SRv6, the BGP session dropped due to an unknown subcode in BGP-LS.
- To maintain compatibility of locator bits and function between Cisco and Nokia equipment, it was necessary to configure a dynamic MPLS range on Nokia equipment;
- In L3VPN Gateway scenarios, Nokia strongly recommends using one RT (Route-Target) per technology; otherwise, BGP parameters may be forwarded to elements regardless of whether they support SRv6, which can end up causing problems in the network;
- Cisco IOS XR version 7.3.2 does not natively apply an ISIS tag to Locators, which makes it very difficult to advertise routes using tag-based policies; version 7.7.1 or higher resolves this issue;
- The advertise-ipv6-next-hop command is necessary on Nokia equipment, but it generates many unnecessary routes, so we had to use policies to control route advertisement and keep the tables from becoming inflated;
- For services using EVPN, it is mandatory to use different Route Distinguishers of type 1, in the format <IP, usually the loopback>:<ID> (see RFC 7432);
- Licenses must be installed on Huawei devices to enable SRv6 functionality with and without compression;
- For extended next-hop encoding to work, RFC 8950 support must be enabled on Nokia with “advertise-ipv6-next-hops” and “extended-nh-encoding” (to enable the capability);
- In multivendor scenarios we found differences in the use of OAM: on Nokia it is possible to execute the ping command against the locator block (/64), in addition to the allocated functions/SIDs. In the case of Cisco and Huawei, ping works with the allocated functions/SIDs in “TLV 27” (End.OP SID, “sub-TLV 5”). The Cisco and Huawei End.OP SID needs to be configured manually. On Cisco, the End.OP SID is automatically assigned by concatenating the X:11:X Locator (hexadecimal);
- Huawei does not support BGP transposition/update-packing;
- It was necessary to implement loop prevention mechanisms for Locators in ring topologies;
- In inter-cluster operation, the ASBR summarizes the CORE Locators into Access and vice versa; it also receives a summary of the blocks of each Cluster. Within the local access cluster, all internal loopbacks are advertised and all external routes are summarized;
- Until the SDN controllers are ready, we have chosen to advertise the Locators via IGP (ISIS) and the IPv6 unicast routes via BGP;
- In the SRv6 project, add-path was added for the vpnv4/vpnv6 and EVPN families at all layers.
- We initially chose to advertise the loopback interfaces within ISIS with different metrics for IPv4 and IPv6, with the best metrics for IPv6 and the worst for IPv4, thus facilitating the choice between v4 and v6 routes. This was necessary because in the Dual Connected scenario we have the same routes with an IPv4 next-hop (label) and an IPv6 next-hop (uSID). Cisco always resolves the next-hop of the routes in the FIB table and does not check whether there is an LSP for a given path before installing the route in the table, so services were sometimes resolved in the FIB with MPLS instead of SRv6;
- To set different IPv4 and IPv6 metrics on the loopback interface in ISIS, we had to use policies on both Nokia and Huawei. After a while, we discovered that the algos on Huawei were not being resolved, because the IPv6 router-id information was needed for them to operate. The cause was that Nokia no longer advertised the system interface within the ISIS process but via policy, which made it impossible to add the IPv6 router-id to the ISIS database.
- To solve the problem described above, we went back to advertising the v4 and v6 loopbacks with the same metric value within ISIS, and Nokia again added the system interface to the ISIS process. With this configuration, the IPv4 and IPv6 loopbacks once more have the same cost throughout the network. Nokia and Huawei have a command that makes the equipment always prefer SRv6 uSID over MPLS, while Cisco does not, so we had to manipulate the weight attribute in BGP on the PE session with the route reflector to always prefer IPv6.
- Cisco does not yet support L3 EVPN (RT5) functionality with SRv6 uSID transport in IOS XR version 7.8, which made it impossible to move 100% of network services to EVPN technology. This is supported in later IOS XR versions, but has not been tested.
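The Full-SID layout mentioned in the lessons above (40-bit SRv6 block, 24-bit node identifier, 16-bit function identifier, with the remaining bits used for arguments and padding) and the requirement that all SIDs share the same block can be checked with a small Python sketch. The helper names parse_full_sid and same_block and the 2001:db8:: example SIDs are hypothetical; they only illustrate the kind of consistency check that is useful before a multi-vendor deployment.

```python
import ipaddress
from typing import Iterable, NamedTuple

BLOCK_BITS, NODE_BITS, FUNC_BITS = 40, 24, 16
ARG_BITS = 128 - BLOCK_BITS - NODE_BITS - FUNC_BITS  # arguments and padding


class FullSid(NamedTuple):
    block: int
    node: int
    function: int
    argument: int


def parse_full_sid(sid: str) -> FullSid:
    """Split a 128-bit Full SID into its fields, from right to left."""
    value = int(ipaddress.IPv6Address(sid))
    argument = value & ((1 << ARG_BITS) - 1)
    value >>= ARG_BITS
    function = value & ((1 << FUNC_BITS) - 1)
    value >>= FUNC_BITS
    node = value & ((1 << NODE_BITS) - 1)
    value >>= NODE_BITS
    return FullSid(block=value, node=node, function=function, argument=argument)


def same_block(sids: Iterable[str]) -> bool:
    """Return True when every SID was instantiated from the same SRv6 block."""
    return len({parse_full_sid(sid).block for sid in sids}) <= 1


sid = parse_full_sid("2001:db8:aa00:1:40::")
print(hex(sid.block), sid.node, hex(sid.function))  # 0x20010db8aa 1 0x40
print(same_block(["2001:db8:aa00:1:40::", "2001:db8:bb00:1:40::"]))  # False
```

A check like this can be run against the planned addressing before configuration is pushed, so that block mismatches between vendors are caught in the lab rather than in production.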
Conclusion
The implementation of SRv6 uSID represents a bold step for Telefónica VIVO, highlighting its commitment to innovation. This blog has sought to provide a comprehensive overview of this journey, from the fundamental concepts to practical application. Digital transformation is in full swing, and Telefónica VIVO is at the forefront, shaping the future of telecommunications.
If you are interested in learning more about our SRv6 deployment, you can watch my talk at the Brazilian IPv6 Forum, December 8, 2023 (Part 2).
The views expressed are those of the author of this blog post and do not necessarily reflect the views of LACNIC.