Experience implementing network fabrics based on EVPN VXLAN and Cisco ACI, with a brief comparison


Note the links in the middle of the diagram. We will come back to them below.

At some point you may find that large, complex L2-based networks are terminally ill. First, there are the problems with handling BUM traffic and with the operation of STP. Second, the architecture as a whole is obsolete. The result is downtime and poor manageability.

We had two parallel projects in which the customers soberly weighed the pros and cons of the options and chose two different overlay solutions, and we implemented both.

That gave us a chance to compare the implementation itself. Not the operation: that is worth discussing in two or three years.

So what is an overlay network factory and SDN?

What to do about the long-standing problems of classical network architecture?


Every year new technologies and ideas appear. In practice, an urgent need to rebuild networks did not arise for quite a long time, because everything could still be done by hand using the good old methods. So what if it is the twenty-first century? After all, an admin should be working, not sitting around in his office.

Then the boom in large-scale data center construction began. It became clear that classical architecture had reached its limit, and not only in performance, fault tolerance, and scalability. One way to solve these problems was to build overlay networks on top of a routed underlay.

In addition, as networks grew, managing such fabrics became a serious problem. As a result, software-defined networking solutions began to appear that manage the entire network infrastructure as a whole. And when the network is managed from a single point, it is easier for other components of the IT infrastructure to interact with it, and those interactions are easier to automate.

Almost every major vendor, not only of network equipment but also of virtualization, has such solutions in its portfolio.

All that remains is to figure out what suits which needs. For example, for especially large companies with strong development and operations teams, boxed vendor solutions do not always cover every need, so they develop their own SD (software-defined) solutions. Cloud providers are a typical case: they constantly expand the range of services they offer their customers, and boxed solutions simply cannot keep up.

For medium-sized companies, the functionality a vendor offers in a boxed solution is enough in 99 percent of cases.

What is Overlay Networking?


What is the idea behind overlay networks? In essence, you take a classic routed network and build another network on top of it to get more features. Most often this means distributing load efficiently across equipment and links, raising the scalability limit significantly, improving reliability, and gaining a set of security perks (thanks to segmentation). On top of that, SDN solutions offer very convenient, flexible administration and make the network more transparent to its consumers.

In general, if local networks were being invented in the 2010s, they would look nothing like what we inherited from the military in the 1970s.

As for the technologies used to build fabrics with overlay networks, there are currently many vendor implementations, RFCs, and Internet-Drafts (EVPN + VXLAN, EVPN + MPLS, EVPN + MPLSoGRE, EVPN + Geneve, and others). Yes, there are standards, but different vendors may implement them differently, so when you build such a fabric, getting rid of vendor lock-in completely is still possible only on paper.

With SD solutions things are even more confusing, because each vendor has its own vision. There are completely open solutions that, in theory, you can extend yourself, and there are completely closed ones.

Cisco offers its own SDN option for data centers: ACI. Naturally, this is a 100% vendor-locked solution as far as the choice of network equipment goes, but it is fully integrated with virtualization, containerization, security, orchestration, load balancers, and so on. In essence, though, it is still something of a black box, with no full access to all internal processes. Not every customer accepts this, since you depend entirely on the quality of the solution's code and its implementation. On the other hand, the vendor has some of the best technical support in the world and a dedicated team that deals only with this solution. Cisco ACI was chosen for the first project.
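To make "a network on top of a network" more concrete, here is a minimal sketch of the VXLAN part of that encapsulation in Python, assuming the header layout from RFC 7348 (8 bytes: a flags byte, 24 reserved bits, a 24-bit VNI, 8 reserved bits). The function names are illustrative and do not come from either project; the outer UDP/IP headers that the underlay actually routes are left out.

```python
import struct

VXLAN_UDP_PORT = 4789         # IANA-assigned UDP destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08   # "I" flag: the VNI field carries a valid value


def build_vxlan_header(vni: int) -> bytes:
    """Return the 8-byte VXLAN header for a 24-bit VNI (illustrative helper)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, 24 reserved bits.
    # Word 2: VNI in the top 24 bits, 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag is not set")
    return vni_word >> 8


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend the VXLAN header to an inner Ethernet frame.

    The result would be carried as the UDP payload between two tunnel
    endpoints; building the outer headers is out of scope here.
    """
    return build_vxlan_header(vni) + inner_frame
```

The point of the sketch is that the tenant's L2 frame travels untouched as payload, while the 24-bit VNI identifies the segment, which is where the much higher segment count compared with 12-bit VLAN IDs comes from.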

A Juniper solution was chosen for the second project. This vendor also has its own SDN for the data center, but the customer decided not to implement SDN. An EVPN VXLAN fabric without centralized controllers was chosen as the technology for building the network.

Why you need it


Building a fabric gives you an easily scalable, fault-tolerant, reliable network. The leaf-spine architecture takes the specifics of data centers into account (traffic paths, minimizing latency and network bottlenecks). SD solutions in data centers let you manage such a fabric conveniently, quickly, and flexibly, and integrate it into the data center ecosystem.

Both customers needed to build backup data centers for fault tolerance. In addition, traffic between the data centers had to be encrypted.

The first customer had already considered solutions without a fabric as a possible standard for their networks, but in testing they ran into STP compatibility problems between several hardware vendors. There was downtime that caused services to drop, and for this customer that was critical.

Cisco was already the customer's corporate standard. They looked at ACI and other options and decided this solution was worth taking. They liked the one-button automation through a single controller: services are configured faster and managed faster. They decided to encrypt traffic by running MACsec between the IPN (Inter-Pod Network) and spine switches. This avoided a bottleneck in the form of crypto gateways, saved the cost of buying them, and made the full bandwidth available.

The second customer chose a controllerless solution from Juniper, because their existing data center already had a small EVPN VXLAN fabric installation. It was not fault-tolerant, though (a single switch was used). They decided to expand the infrastructure of the main data center and build a fabric in the backup data center. The existing EVPN was not used to its full extent: VXLAN encapsulation was not actually in use, because all hosts were connected to one switch, all MAC addresses and /32 host addresses were local, the same switch was their gateway, and there were no other devices to build VXLAN tunnels to. They decided to encrypt traffic with IPsec between firewalls (the firewalls' performance was sufficient).

They also evaluated ACI, but decided that because of the vendor lock-in they would have to buy too much hardware, including replacing recently purchased equipment, and that this simply made no economic sense. Yes, the Cisco fabric integrates with everything, but only Cisco's own devices can be used inside the fabric itself.

On the other hand, as mentioned earlier, you cannot simply mix an EVPN VXLAN fabric with any other vendor's equipment, because the protocol implementations differ. It is like combining Cisco and Huawei in the same network: the standards seem to be common, but you still have to do a lot of fiddling to make it work. Since the customer is a bank and compatibility tests would take a very long time, they decided it was better to buy from the same vendor now and not get carried away with functionality beyond the basics.

Migration plan


Two ACI-based data centers:

Organizing interaction between the data centers. A Multi-Pod solution was selected, with each data center acting as a pod. The scaling requirements for the number of switches and the latency requirement between pods (RTT under 50 ms) were taken into account. A Multi-Site solution was ruled out for ease of management (Multi-Pod uses a single management interface, whereas Multi-Site would mean two interfaces or would require a Multi-Site Orchestrator) and because geographic redundancy of the sites was not required.



For migrating services off the legacy network, the most transparent option was chosen: gradually moving the VLANs that correspond to particular services.
For the migration, a corresponding EPG (endpoint group) was created in the fabric for each VLAN. First, the network was stretched between the old network and the fabric at L2. Then, once all hosts had been migrated, the gateway was moved to the fabric, and the EPG communicated with the existing network through an L3Out, with the interaction between the L3Out and the EPG described by contracts. Sample diagram:

The typical structure of most ACI fabric policies is shown in the figure below. The whole configuration is built on policies nested inside other policies, and so on. At first it is very hard to make sense of, but as practice shows, network administrators get used to this structure in about a month, and only then do they come to appreciate how convenient it is.
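The contract idea used in the migration (an EPG may talk to the L3Out only if a contract says so) can be illustrated with a small model. This is a deliberately simplified sketch, not the APIC object model or API: the class, field names, and EPG names are hypothetical, contracts are treated as one-directional, and traffic inside a single EPG is assumed to be permitted.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Contract:
    """Hypothetical, simplified stand-in for a contract between two groups."""
    name: str
    consumer: str   # group that initiates the traffic (EPG or L3Out)
    provider: str   # group that offers the service (EPG or L3Out)
    protocol: str
    port: int


def is_allowed(contracts: Iterable[Contract], src: str, dst: str,
               protocol: str, port: int) -> bool:
    """Allow-list check: traffic between groups passes only if a contract matches."""
    if src == dst:
        # Simplifying assumption: endpoints in the same group can talk freely.
        return True
    return any(
        c.consumer == src and c.provider == dst
        and c.protocol == protocol and c.port == port
        for c in contracts
    )


# Example: a migrated EPG reaching the legacy network through an L3Out.
contracts = [
    Contract("web-to-legacy", consumer="epg-web", provider="l3out-legacy",
             protocol="tcp", port=443),
]
print(is_allowed(contracts, "epg-web", "l3out-legacy", "tcp", 443))
print(is_allowed(contracts, "epg-web", "l3out-legacy", "tcp", 22))
```

The design choice worth noticing is that policy is attached to groups rather than to addresses or ports on individual switches, which is why new hosts added to an existing EPG need no extra rules.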



Comparison


The Cisco ACI solution required buying more equipment (separate switches for inter-pod communication, plus APIC controllers), which made it more expensive. Juniper's solution did not require controllers or additional components, and it was possible to partially reuse the customer's existing equipment.

Here is the architecture of the EVPN VXLAN fabric for the two data centers of the second project:




With ACI you get a ready-made solution: no need to tinker, no need to optimize. When the customer is first getting to know the fabric, no developers are needed, and neither are people to maintain code and automation. It is quite simple to operate, and many settings can be done entirely through a wizard, which is not always a plus, especially for people who are used to the command line. In any case, it takes time to rewire your brain for configuration through policies and for working with many policies nested inside each other. In addition, a clear naming structure for policies and objects is highly desirable. If there is a problem in the controller's logic, it can only be solved through technical support.

With EVPN you get a console. Suffer or rejoice. It is a familiar interface for the old guard. Yes, there are reference configurations and guides. You will have to read the manuals. There are different designs, and everything is documented clearly and in detail.

Naturally, in both cases it is better to start the migration with less critical services, such as test environments, and move on to production only after catching all the bugs. And do not make changes on a Friday night. Do not take the vendor's word that everything will be fine; it is always better to play it safe.

You pay more for ACI, although Cisco is currently promoting this solution actively and often gives good discounts on it, but you save on operation and maintenance. Managing and automating an EVPN fabric without a controller requires investment and regular spending on monitoring, automation, and rolling out new services. At the same time, the initial launch of ACI takes 30-40 percent longer, because it takes longer to create the whole set of profiles and policies that will be used later. But as the network grows, the amount of configuration required shrinks: you reuse the policies, profiles, and objects that already exist. You can configure segmentation and security flexibly and manage centrally the contracts that permit particular interactions between EPGs, so the volume of work drops sharply.

With EVPN, every device in the fabric has to be configured individually, so the probability of error is higher.

While ACI was slower to deploy, EVPN took almost twice as long to debug. With Cisco you can always call a support engineer and ask about the network as a whole (because it is covered as a solution). From Juniper Networks you buy only hardware, and only the hardware is covered. Did the packets leave the device? Fine, the rest is your problem. You can raise a question about choosing a solution or designing the network, but then you will be advised to buy professional services for an additional fee.

ACI support is very good because it is separate: a dedicated team works on nothing else, and it includes Russian-speaking specialists. The guides are detailed and the designs are predefined. They review and advise. Designs are validated quickly, which is often important. Juniper Networks does the same thing, but several times slower (that was our experience; rumor has it things should be better now), which forces you to do everything yourself where a solution engineer could have advised you.

Cisco ACI supports integration with virtualization and containerization systems (VMware, Kubernetes, Hyper-V) and centralized management. Network and security services are available: load balancing, firewalls, WAF, IPS, and more. Micro-segmentation works well out of the box. In the second solution, integrating network services takes a lot of fiddling, and it is best to read the forums beforehand for reports from people who have already done it.
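One inexpensive way to reduce that risk is to check per-device settings against each other before pushing them. Below is a minimal sketch that assumes each leaf's VLAN-to-VNI mapping has already been collected into plain Python dictionaries; the leaf names, the data layout, and the function itself are hypothetical and not tied to any vendor's tooling.

```python
from collections import defaultdict
from typing import Dict, List


def find_vni_conflicts(
    leaf_configs: Dict[str, Dict[int, int]],
) -> Dict[int, Dict[int, List[str]]]:
    """Return VLANs that are mapped to different VNIs on different leaves.

    leaf_configs maps a leaf name to its {vlan_id: vni} table.
    The result maps each conflicting VLAN to {vni: [leaves using that vni]}.
    """
    seen: Dict[int, Dict[int, List[str]]] = defaultdict(lambda: defaultdict(list))
    for leaf, vlan_to_vni in leaf_configs.items():
        for vlan, vni in vlan_to_vni.items():
            seen[vlan][vni].append(leaf)
    return {
        vlan: {vni: sorted(leaves) for vni, leaves in by_vni.items()}
        for vlan, by_vni in seen.items()
        if len(by_vni) > 1  # more than one VNI for the same VLAN is a conflict
    }


# Example: leaf2 has a typo in the VNI for VLAN 200.
configs = {
    "leaf1": {100: 10100, 200: 10200},
    "leaf2": {100: 10100, 200: 10201},
}
print(find_vni_conflicts(configs))
```

A check like this does not replace a controller, but it catches the most common class of hand-configuration mistakes, where one device out of many disagrees with the rest.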

Summary


The solution has to be chosen for each specific case, based not only on the cost of the equipment but also on future operating costs, the main problems the customer faces now, and their plans for developing the IT infrastructure.

ACI came out more expensive because of the additional equipment, but it is a finished solution that needs no further work. The second solution is more complex and more costly to operate, but cheaper to buy.

If you want to discuss how much it might cost to implement a network fabric with different vendors and what kind of architecture you need, we can meet and talk. We will show you a rough draft of the architecture (enough to estimate budgets) for free; a detailed design is, of course, paid.

Vladimir Klepce, corporate networks.
