ATA over Ethernet and Network Block Device performance tests
The purpose of these tests was to assess the feasibility of moving disk storage away from the computer that uses it and connecting the two over ethernet at the block device level (as opposed to just using a network file system such as NFS). There are essentially two Linux solutions for this: the Network Block Device (NBD) and ATA over Ethernet (AoE). NBD exports a block device over TCP, whereas AoE operates at the link level and was originally designed for hardware targets, though software targets are possible using the vblade program.
Before the tests it was expected that AoE might perform better, as TCP throughput is not always good over high-speed links. On the other hand, the TCP offload engines on the network cards might compensate for this.
Setup
The test computers were abidal (Asus L1N64-SLI WS, two dual-core AMD 2212 CPUs, 4 GB RAM, 4/12 x SATA in RAID0, Nvidia gigabit ethernet, Ubuntu 7.10 amd64) as the server and juliano (Asus P5WD2-E Premium, Core 2 Duo 3.2 GHz, 2 GB RAM, sky2 gigabit ethernet, Debian Etch x86) as the client. Both had a Chelsio N320E-CX dual-CX4 10 gbit ethernet card. The connections, both 1 gbit and 10 gbit, went directly from network card to network card. A PlayStation 3 was also used as a client, but its tests were limited to one configuration for both AoE and NBD because the PS3 ethernet chip does not support jumbo frames.
The file system exported over the network was XFS with a 4096-byte block size, on top of a RAID0 with a 1-megabyte chunk size using either 4 or 12 disks. Originally the plan was to use only the 12-disk configuration, but that was too large for a 32-bit NBD client, so the same disks were used to build a smaller 4-disk RAID0, which was then used for the other tests as well.
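As a point of reference, a storage stack along these lines could be built roughly as follows, assuming Linux software RAID (md); the device names are illustrative, not the ones actually used:

# 4-disk RAID0 with a 1 MB (1024 kB) chunk size
mdadm --create /dev/md0 --level=0 --raid-devices=4 --chunk=1024 /dev/sd[bcde]
# XFS with a 4096-byte block size
mkfs.xfs -b size=4096 /dev/md0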
AoE exporting was done with vblade (version 14) and NBD exporting with nbd-server. The client used the NBD kernel module shipped with its kernel and a self-compiled AoE module, version 55 (the kernel shipped with version 32).
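For illustration, the exports and client attachments were of roughly this form; the shelf/slot numbers, interface names, TCP port, host name, and device and mount paths below are examples rather than the exact values used:

# Server: export the RAID device over AoE as shelf 0, slot 0, on eth1
vblade 0 0 eth1 /dev/md0
# Server: export the same device over NBD on TCP port 2000
nbd-server 2000 /dev/md0

# Client: attach the AoE target (it appears as /dev/etherd/e0.0)
modprobe aoe
mount /dev/etherd/e0.0 /mnt/aoe
# Client: attach the NBD target
modprobe nbd
nbd-client server-host 2000 /dev/nbd0
mount /dev/nbd0 /mnt/nbd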
Tests
The test program was bonnie++, which should give usable results on sustained filesystem/disk performance. The data size was 20 GB to minimize the effects of the buffer cache (4 GB for the PS3 local disk, as the PS3 has only 256 MB of memory and only about 5 GB of free disk space). Only block transfers were used. The bonnie++ command line was:
bonnie++ -d . -f -s 20480
Different MTUs were used on the ethernet interfaces to see how much the frame size affects performance. 1500 bytes is the standard ethernet MTU, 9000 is the de facto maximum jumbo frame size, 4132 (for AoE) is 4096 bytes of data plus 36 bytes of AoE headers, and 4608 (4.5 kB) is 4096 plus "enough" room for headers.
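Changing the MTU is a one-line operation on each end of the link; for example (eth1 is an illustrative interface name):

ifconfig eth1 mtu 9000
# or, with iproute2:
ip link set dev eth1 mtu 9000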
AoE results
4-disk RAID0
Speed | MTU | Read MB/s | Write MB/s | Rewrite MB/s |
---|---|---|---|---|
Local disk | N/A | 288 | 256 | 106 |
1 gbps | 1500 | 47 | 23 | 23 |
1 gbps | 4132 | 106 | 114 | 51 |
1 gbps | 9000 | 107 | 120 | 54 |
10 gbps | 1500 | 100 | 24 | 53 |
10 gbps | 4132 | 231 | 248 | 112 |
10 gbps | 9000 | 257 | 82 | 113 |
12-disk RAID0
Speed | MTU | Read MB/s | Write MB/s | Rewrite MB/s |
---|---|---|---|---|
Local disk | N/A | 577 | 394 | 208 |
1 gbps | 1500 | 47 | 22 | 23 |
1 gbps | 4132 | 72 | 107 | 42 |
1 gbps | 9000 | 72 | 112 | 44 |
10 gbps | 1500 | 95 | 23 | 52 |
10 gbps | 4132 | 221 | 228 | 105 |
10 gbps | 9000 | 250 | 82 | 126 |
4-disk RAID0, PS3 client
The gigabit ethernet chip in the PS3 (gelic) does not support jumbo frames.
Speed | MTU | Read MB/s | Write MB/s | Rewrite MB/s |
---|---|---|---|---|
PS3 local disk | N/A | 23 | 21 | 11 |
1 gbps | 1500 | 38 | 23 | 19 |
The AoE results are nevertheless better than those of the PS3 local disk, which is probably slowed down by the hypervisor guarding access to it.
NBD results
NBD does not support very large block devices on a 32-bit client, so only the 4-disk RAID0 was used for NBD tests.
4-disk RAID0
Speed | MTU | Read MB/s | Write MB/s | Rewrite MB/s |
---|---|---|---|---|
Local disk | N/A | 288 | 256 | 106 |
1 gbps | 1500 | 71 | 106 | 39 |
1 gbps | 4608 | 106 | 119 | 53 |
1 gbps | 9000 | 109 | 121 | 54 |
10 gbps | 1500 | 109 | 193 | 66 |
10 gbps | 4608 | 113 | 120 | 61 |
10 gbps | 9000 | 117 | 216 | 69 |
4-disk RAID0, PS3 client
Speed | MTU | Read MB/s | Write MB/s | Rewrite MB/s |
---|---|---|---|---|
PS3 local disk | N/A | 23 | 21 | 11 |
1 gbps | 1500 | 71 | 58 | 33 |
The PS3 ethernet performance was measured with iperf to be about 830 Mbps (TX) and 750 Mbps (RX), which gives a theoretical maximum of about 90 to 100 MB/s for data transfers.
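For reference, a raw TCP throughput measurement of this kind can be made with iperf roughly as follows (the host name is illustrative); running the client on each machine in turn gives both the TX and RX figures:

# On one end (here the PS3):
iperf -s
# On the other end, for a 30-second test:
iperf -c ps3-host -t 30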
Conclusions
The immediately obvious conclusion from the results is that jumbo frames are a must for AoE. NBD, using TCP, copes better with the standard 1500-byte frames, but it too benefits from a larger MTU. Over 1 gbps links the performance of the two seems to be on par, but NBD reads struggle over a 10 gbps link even though the Chelsio boards have TCP offloading.
AoE performance behaved somewhat strangely when the local disk performance was increased by moving from the 4-disk RAID0 to the 12-disk RAID0. This may be because vblade was used, which can introduce all kinds of timing and caching issues since the data passes through the OS. (These issues may also affect the 10 gbps NBD performance.) Using actual AoE drives may give different results.
In general AoE seems very feasible, but performance-critical applications should be tested with the exact setup and configuration that will be used, as there are probably many variables (especially when using vblade) that affect the outcome. For gigabit links NBD is just as good, but there are no hardware NBD targets and, being TCP-based, it works at a higher level in the kernel, which may not be desirable.
Juha Aatrokoski 2007-12-10