Why it is worth installing a 64-bit OS on the Raspberry Pi4

One of the benefits of working for a software company is that you often have the opportunity to test new hardware prototypes. However, not in this case - I bought a Raspberry Pi4 because it is very cheap!

The Raspberry Pi4 has a quad-core ARM Cortex-A72, up to 4 GB of memory and a gigabit Ethernet port, all for only $35.

Raspbian (based on Debian) is available for the Raspberry Pi4, along with a library of ready-made software, so I wrote it to an SD card to get the board booting quickly. Looking through syslog, I noticed that both the kernel and all user programs were compiled as armv7, that is, as 32-bit.

I knew that the Raspberry Pi4 supports 64-bit, so I did not want to run a 32-bit OS on it. I took another memory card and installed Debian on it: a minimal Debian with nothing superfluous, compiled as aarch64, which means 64-bit.
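A quick way to confirm which variant is actually running is sketched below in Python. This is only an illustration (the author read this information from syslog), and the function name describe_bitness is made up for the example.

```python
import platform
import struct


def describe_bitness():
    """Return (machine name reported by the kernel, pointer width of this process)."""
    machine = platform.machine()             # e.g. "armv7l" or "aarch64"
    pointer_bits = struct.calcsize("P") * 8  # 32 or 64, depending on the userland build
    return machine, pointer_bits


if __name__ == "__main__":
    machine, bits = describe_bitness()
    print(f"kernel reports {machine}, this process uses {bits}-bit pointers")
```

On a 32-bit Raspbian install this should report something like armv7l with 32-bit pointers, and on an aarch64 Debian install aarch64 with 64-bit pointers.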

Having booted the 64-bit OS, I wanted to know how much better it performs than the 32-bit one, so I ran several tests.

Synthetic speed tests


The first thing that occurred to me was the old dhrystone test, which has existed since the beginning of time. The program dates back to the 1980s (the widely used version 2 is from 1988) and performs simple integer calculations. It is unlikely to resemble a modern workload, so it is useful only for keeping some comparability with old hardware and software.



A modern application is better simulated by computing digests (hashes), so I wanted to run a test with SHA1. Unfortunately, the sha1sum utility was compiled without support for libssl or the kernel crypto functions, so I had to compile it from source.

To avoid I/O bottlenecks, I computed the hash of a 2 GB file created with truncate -s 2GB. Such a file is sparse, so reading it returns zeros without any input or output from the card.
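The same idea can be sketched in Python. This is not the author's test (which used a rebuilt sha1sum); the function name sha1_sparse_file and the chunk size are assumptions made for the example. It creates a sparse temporary file, hashes it, and reports the throughput.

```python
import hashlib
import tempfile
import time


def sha1_sparse_file(size_bytes, chunk_size=1 << 20):
    """Hash a sparse file of size_bytes zeros; return (hex digest, MB per second)."""
    digest = hashlib.sha1()
    with tempfile.NamedTemporaryFile() as f:
        f.truncate(size_bytes)  # like `truncate -s`: sets the length, writes no data
        start = time.perf_counter()
        while chunk := f.read(chunk_size):
            digest.update(chunk)
        elapsed = time.perf_counter() - start
    return digest.hexdigest(), size_bytes / 1e6 / elapsed


if __name__ == "__main__":
    # truncate's "2GB" suffix means 2 * 1000**3 bytes
    digest, rate = sha1_sparse_file(2 * 1000**3)
    print(f"{digest}  {rate:.0f} MB/s")
```

Because the file has no data blocks, the measured rate reflects the hashing code and the memory path, not the SD card.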



SHA1 is a more realistic test than dhrystone, because this algorithm is used in a large number of applications - torrents, git, etc.

RAM


A 64-bit system can access memory 8 bytes at a time per read or write. I wrote a simple program that allocates a large buffer, writes to it, and then reads it back. To guarantee that the memory is really allocated, I used mlock(). In this test the buffer size is 2 GB: a 3 GB buffer worked in 64-bit mode, but in 32-bit mode it failed with an “out of memory” error.
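The author's program is not shown, so the following is only a rough sketch of the same allocate, lock, write, read sequence, written in Python with ctypes and assuming a Linux system. The function name memory_bandwidth is hypothetical, and mlock() may fail if the RLIMIT_MEMLOCK limit is lower than the buffer size, in which case the sketch carries on without locking.

```python
import ctypes
import mmap
import time


def memory_bandwidth(size):
    """Return (write MB/s, read MB/s, locked) for an anonymous buffer of `size` bytes."""
    libc = ctypes.CDLL(None, use_errno=True)  # C library of the running process
    buf = mmap.mmap(-1, size)                 # anonymous, zero-filled mapping
    view = ctypes.c_char.from_buffer(buf)
    addr = ctypes.addressof(view)
    # mlock() pins the pages in RAM; returns 0 on success
    locked = libc.mlock(ctypes.c_void_p(addr), ctypes.c_size_t(size)) == 0

    start = time.perf_counter()
    ctypes.memset(addr, 0xFF, size)           # write pass over the whole buffer
    write_s = time.perf_counter() - start

    start = time.perf_counter()
    found = buf.find(b"\x00")                 # read pass: scans every byte, finds nothing
    read_s = time.perf_counter() - start
    assert found == -1

    if locked:
        libc.munlock(ctypes.c_void_p(addr), ctypes.c_size_t(size))
    del view                                  # release the buffer export before closing
    buf.close()
    megabytes = size / 1e6
    return megabytes / write_s, megabytes / read_s, locked


if __name__ == "__main__":
    write_rate, read_rate, locked = memory_bandwidth(256 * 1024 * 1024)
    print(f"write {write_rate:.0f} MB/s, read {read_rate:.0f} MB/s, locked={locked}")
```

The absolute numbers from an interpreter-driven test like this are not comparable to a C program, but the relative difference between a 32-bit and a 64-bit userland can still be observed.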



Audio encoding


I noticed that many Raspberry Pi4 users use the computer as a media center, so I ran an audio encoding task with the two most popular codecs.

I encoded Pink Floyd's “Echoes” because it is a rather long track, long enough to give measurable timings. To avoid I/O delays, the source and destination files were stored on ramfs.





Network speed measurements


Another use for the Raspberry Pi4 is as a VPN server or firewall. I do not recommend using such systems for these purposes, but many people still have a slow Internet connection (less than 100 Mbit/s), so they may not notice that the Raspberry Pi4 is slow.

First question: how much traffic can the Raspberry Pi4 handle? We need to measure the raw networking capability of the machine, without the limits of the physical interfaces, so I ran an iperf3 session between two containers. However, containers exchange data through a veth pair, and veth speeds up traffic by means of fake offloads.

Checksum offload is done simply by not computing the IP checksum, and TCP segmentation offload by not segmenting and reassembling the traffic: a large 64K chunk of data is simply passed through memory as is.
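To give a sense of the per-packet work that a real checksum involves, and that the fake offload skips, here is a minimal sketch of the Internet checksum in the style of RFC 1071. The function name is illustrative, and the kernel uses optimized routines rather than anything like this loop.

```python
def internet_checksum(data: bytes) -> int:
    """Ones'-complement sum of 16-bit big-endian words, complemented (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"                           # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # add each 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


if __name__ == "__main__":
    sample = bytes.fromhex("0001f203f4f5f6f7")    # example words from RFC 1071
    print(hex(internet_checksum(sample)))         # prints 0x220d
```

Every byte of every packet has to pass through a sum like this when the offloads are disabled, which is why turning them off gives a more honest measure of CPU cost.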

To avoid these effects, I disabled the offloads with the command

ethtool -K veth0 tx off rx off tso off gro off gso off



Firewall


The fastest thing network equipment can do with traffic is drop it, and the fastest way to do that is with a TC rule. To push the packet rate as high as possible, I used the minimum Ethernet frame size, 64 bytes.



Although neither system reached the maximum rate (about 1.5 million packets per second), the 64-bit kernel was slightly faster than the 32-bit one. If you want to use the Raspberry Pi4 as a firewall, be sure to use a 64-bit kernel.
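The theoretical ceiling for minimum-size frames on gigabit Ethernet follows from simple arithmetic: on the wire, each 64-byte frame is accompanied by a 7-byte preamble, a 1-byte start-of-frame delimiter and a 12-byte inter-frame gap. A small sketch of the calculation (the function name is illustrative):

```python
def max_frames_per_second(link_bits_per_second, frame_bytes):
    """Upper bound on the Ethernet frame rate for a given link speed and frame size."""
    wire_overhead = 7 + 1 + 12  # preamble + start-of-frame delimiter + inter-frame gap
    bits_per_frame = (frame_bytes + wire_overhead) * 8
    return link_bits_per_second // bits_per_frame


if __name__ == "__main__":
    print(max_frames_per_second(10**9, 64))  # prints 1488095
```

That is roughly 1.49 million packets per second, which is the limit the firewall test is measured against.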

VPN


Another common use for the Raspberry Pi4 is as a VPN server, usually OpenVPN. I prefer WireGuard, so I tested both programs, since both are easy to install.



As expected, OpenVPN is 10 times slower than WireGuard. What was not expected was that OpenVPN runs at the same speed on 32 and 64 bits. WireGuard almost saturates the gigabit port in both cases, so perhaps we have reached the NIC limit.

To find out whether WireGuard could go even faster, I ran another test with two containers that did not use physical Ethernet. The only problem was that both the iperf3 client and the iperf3 server were running on the Raspberry Pi4, loading two cores.



As expected, OpenVPN and 32-bit WireGuard, limited by the CPU, performed worse, and 64-bit WireGuard did better.

Conclusions


I often read statements like “it's not worth it”, “you will only gain a few milliseconds”, etc., simply because the Raspberry Pi4 is not a very powerful computer. This is not true! As anyone who works with embedded systems knows, software optimization matters even more on slow hardware than on fast hardware.

I already knew that a 64-bit OS would work better on the Raspberry Pi4, but I did not know how much better. So I ran all these tests. I hope you enjoyed them!
