Lab notes on 10 Gbit/s network tests
(C) 2007 Jan Wagner, Guifre Molera


27 June 2008 - SATA Port Multiplier (PM) tests.

A single 750 GB SpinPoint F1 disk behind the port multiplier and the ADSA3GPX8-4E PCI-Express SATA controller gives an hdparm performance of 72 MB/s.

SATA port                | PCI-Express | native
hdparm single disk       | 72 MB/s     | 72.4 MB/s
wr-nexgen single disk    | 611 Mb/s    | 611 Mb/s
wr-nexgen multiple disks | 1620 Mb/s   | 2337 Mb/s
hdparm RAID              | 232 MB/s    | 286 MB/s
wr-nexgen RAID           | 1818 Mb/s   | 2341 Mb/s

01 July 2008 - SATA controller tests

By default the native SATA ports cannot detect the port multiplier; updating the BIOS or installing the latest sata_nv driver might help. The tests therefore continued with the SATA host controller (driver sata_sil24), which detects the PMP automatically after booting the PC. All data on the new Samsung 750 GB disks was erased so that we could run the same tests on them. Past tests showed a large difference between old and new disks connected to the native SATA ports.

Mode                        | PCI-E raid-disks=5 | PCI-E raid-disks=4
hdparm -t /dev/md0          | 305.9 MB/s         | 294 MB/s
wr-nexgen RAID disk         | 2355 Mb/s          | 2453 Mb/s
wr-nexgen to multiple disks | 2379 Mb/s          | 2505 Mb/s

As seen, the wr-nexgen write rate with 4 disks is higher than with an extra fifth disk added to the PMP. Probably the chip cannot correctly handle more than 4 disks.

21 July 2008 - SATA controller tests

./wr-nexgen 750000000 32768 1000000 32768 1 /dev/md0 1000000

HP SC44Ge / LSISAS1068E controller tests: all 4 internal + 4 external SATA disks are detected, provided that the correct SFF multi-lane to SATA converter is used. It turns out that port multipliers did not work; the controller still sees only one disk per PMP. Log files. Performance with 10 disks configured together, 6 behind the HP and 4 behind the nForce680i native ports: 2780 Mbps average. With 8 disks behind the HP in hardware RAID, performance is around 1.8 Gbps.

24 July 2008 - SATA controller tests

The Abidal nForce 680a was used for the following tests. Two Addonics PMs were connected via the Addonics ADSA3GPX8-4E 4xeSATA card in a PCIe x8 slot. We did some cross-set testing over different ports, using wr-nexgen with a 32 kB block size, 128 MB of RAM and a 20312.50 MB file, writing to raw /dev/md0 and finally to an XFS file system. For the wr-nexgen tests see the log file. Results:

eSATA port #1/4      |  eSATA port #3/4

sdb1 sdc1 sdd1 sde1  |  sdf1 sdg1 sdh1 sdi1
Took 248.825253 seconds, 684.792230 Mbits(dec)/s : /dev/sdX : log

sdb1 sdc1 sdd1 sde1  |  sdf1 sdg1 sdh1 sdi1
Took 242.481281 seconds, 702.708264 Mbits(dec)/s : /dev/sdX : log

sdb1 sdc1 sdd1 sde1  |  sdf1 sdg1 sdh1 sdi1
Took 122.601060 seconds, 1389.821590 Mbits(dec)/s : /dev/md0 : log

sdb1 sdc1 sdd1 sde1  |  sdf1 sdg1 sdh1 sdi1
Took 276.815855 seconds, 615.548557 Mbits(dec)/s : /dev/sdX : log

sdb1 sdc1 sdd1 sde1  |  sdf1 sdg1 sdh1 sdi1
Took 119.279701 seconds, 1428.52154 Mbits(dec)/s : /dev/md0 : log

sdb1 sdc1 sdd1 sde1  |  sdf1 sdg1 sdh1 sdi1
Took 82.755253 seconds, 2059.006458 Mbits(dec)/s : /dev/md0 : log

sdb1 sdc1 sdd1 sde1  |  sdf1 sdg1 sdh1 sdi1
Took 82.314915 seconds, 2070.020963 Mbits(dec)/s : /dev/md0 : log

sdb1 sdc1 sdd1 sde1  |  sdf1 sdg1 sdh1 sdi1
Took 85.464052 seconds, 1993.745863 Mbits(dec)/s : /raid XFS : log

It is curious to see the ADSA3GPX8-4E performance level off at around 2 Gbps. Perhaps the PCI-X to PCIe bridge on the card causes the low performance, or maybe the eSATA ports can perform at only 1 Gbps each. This can be checked later by connecting 4 instead of 2 PMs to the eSATA ports and distributing the same 8 disks over the 4 PMs. Update: we re-ran the last test and managed to get around 4 Gbps when writing to a raw /dev/md0 that was not XFS-formatted and not mounted.
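The Mbits(dec)/s figures in the result lines above can be reproduced from the elapsed time alone. The sketch below is not taken from wr-nexgen itself; it assumes that the 20312.50 MB file size is counted in binary megabytes (2**20 bytes) and that "Mbits(dec)" means decimal megabits (10**6 bits), which is the combination that matches the logged numbers.

```python
# Recompute the "Mbits(dec)/s" figure from file size and elapsed time.
# Assumptions (not confirmed by the wr-nexgen source): the file size is in
# binary megabytes (2**20 bytes), the rate is in decimal megabits (10**6 bits).

FILE_SIZE_MIB = 20312.50  # test file size quoted in the notes


def mbits_dec_per_s(elapsed_s: float, size_mib: float = FILE_SIZE_MIB) -> float:
    """Average write rate in decimal Mbit/s for a file of size_mib MiB."""
    bits = size_mib * 2**20 * 8
    return bits / 1e6 / elapsed_s


if __name__ == "__main__":
    # Elapsed times copied from three of the result lines above.
    for elapsed in (248.825253, 122.601060, 82.314915):
        print(f"{elapsed:11.6f} s -> {mbits_dec_per_s(elapsed):9.3f} Mbit/s")
```

With these assumptions the single-disk, first /dev/md0 and fastest /dev/md0 runs come out within rounding of the logged rates.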

Some figures.

Some thoughts for 4 Gbps: the 20 GByte file at an externally pushed rate of 4 Gbps means 40 seconds at 512 MB/s. Abidal has 3800 MB of free RAM, which is about 4 seconds of data if the OS takes the other half for buffering. So the wr-nexgen output should show current interval-average rates in <2 second intervals. If any of the <2 s interval-average rates drops below 4 Gbps, then 4G network-to-disk recording will not work without loss.
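The buffering estimate above can be written out as a small calculation. This is only a sketch under our reading of the paragraph: "4 Gbps" is taken as 4096 Mbit/s (512 MB/s), and half of the 3800 MB of free RAM is assumed to be usable as write buffer. The names are illustrative and not part of wr-nexgen.

```python
# Back-of-the-envelope RAM buffering budget for a 4 Gbps recording.
# Assumptions: 4 Gbps = 4096 Mbit/s = 512 MB/s, and only half of the free
# RAM is available as write buffer (the other half goes to the OS).

RATE_MB_S = 4096 / 8      # 512 MB/s pushed in from the network
FILE_MB = 20 * 1024       # the 20 GByte test file
FREE_RAM_MB = 3800        # free RAM on Abidal


def seconds_to_fill(buffer_mb: float, rate_mb_s: float = RATE_MB_S) -> float:
    """Time until a buffer of buffer_mb overflows if the disks stall completely."""
    return buffer_mb / rate_mb_s


transfer_s = FILE_MB / RATE_MB_S                    # duration of the whole transfer
stall_budget_s = seconds_to_fill(FREE_RAM_MB / 2)   # tolerable full stall

print(f"transfer time : {transfer_s:.1f} s")
print(f"stall budget  : {stall_budget_s:.2f} s")
```

The stall budget of a few seconds is why the rate has to be watched on intervals shorter than 2 seconds: a 30 second average can hide a dip long enough to overflow the buffer.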

25 July 2008 - SATA controller tests

Most of the previous tests used 2, 4 or 8 disks. The following tests tried to push more than 4 Gbps to larger arrays. As we do not yet have more than 2 PMPs compatible with the PCI-E host controller, the disks were connected to the mobo native ports, the PCI-E host controller and/or the HP controller in turn. The following table shows the number of disks, the kind of connection used and the rates achieved:

Disks              | hdparm   | wr-nexgen
8*native           | 425 MB/s |
8*native + 4*PCI-e | 413 MB/s | 4535 Mbits/s
4*native + 4*PCI-e | 428 MB/s | 3800 Mbits/s
8*PCI-e            | 455 MB/s | 3815 Mbits/s
4*native + 8*PCI-e | 440 MB/s | 4719 Mbits/s
8*native + 8*PCI-e | 481 MB/s | 4600 Mbits/s

An attempt to use 16 disks, 8 connected to the PCI-E card and 8 to the native ports, did not clearly solve the problem. The speed is still a bit over 4 Gbps, far from the theoretical 8 Gbps. Total capacity of the RAID is 10.2 TB.

05 August 2008 - RocketRaid 2522

Got a HighPoint RocketRaid 2522 (2 x miniSAS, ads claim PMP support up to 40 disks). First impressions: the Linux driver seems to come in mixed binary and source format, and modinfo rr2522 shows the license as Proprietary. So far the Linux side sees no individual disks, but it does see the RAID array(s) configured in the RocketRaid BIOS. PMP support works as advertised: the RocketRaid BIOS detects all disks behind a PMP.

For the first test, we used one mini-SAS to SFF cable connected to an AD4SAML (SFF-to-SATA). Each of these four SATA ports was hooked to one AD5SAPM 5-port PMP. Eleven (11) SpinPoint F1 750GB disks were available. The theoretical peak rate would be 100MB/s*8bit*11 = 8800 Mbit/s. For the assumed best per-link bandwidth utilization, the disks were scattered over the four PMPs (theoretical rate 4 x 300MB/s*8bit = 9600 Mbit/s) in a 3 + 3 + 3 + 2 configuration. The disks were combined in the HighPoint BIOS into one single RAID-0. The chunk size was neither configurable nor displayed.
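The two theoretical limits quoted above (disk-side and link-side) and the 3 + 3 + 3 + 2 spread can be checked with a short calculation. This is a sketch only; the 100 MB/s per disk and 300 MB/s per SATA link are the round figures used in the text, and the function names are made up for illustration.

```python
# Theoretical peak rates for N disks spread over several port multipliers.
# Assumptions: 100 MB/s sustained per disk and 300 MB/s per SATA link,
# the round figures used in the notes.

def disk_limit_mbit_s(n_disks: int, per_disk_mb_s: int = 100) -> int:
    """Aggregate rate if every disk streams at its full sequential speed."""
    return n_disks * per_disk_mb_s * 8


def link_limit_mbit_s(n_links: int, per_link_mb_s: int = 300) -> int:
    """Aggregate rate if every SATA link to a PMP is saturated."""
    return n_links * per_link_mb_s * 8


def spread_disks(n_disks: int, n_pmps: int) -> list[int]:
    """Distribute disks as evenly as possible over the PMPs."""
    base, extra = divmod(n_disks, n_pmps)
    return [base + 1 if i < extra else base for i in range(n_pmps)]


layout = spread_disks(11, 4)
print("layout     :", " + ".join(map(str, layout)))
print("disk limit :", disk_limit_mbit_s(11), "Mbit/s")
print("link limit :", link_limit_mbit_s(4), "Mbit/s")
print("bottleneck :", min(disk_limit_mbit_s(11), link_limit_mbit_s(4)), "Mbit/s")
```

With at most three disks per PMP, no single link is asked for more than 300 MB/s, so in this idealized picture the eleven disks rather than the four links set the ceiling.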

The quick test hdparm -t /dev/sdb : 190 MB/sec doesn't give the full picture. With wr-nexgen writing a file to an xfs-formatted /dev/sdb:/raid/testfile in 32kB blocks from a 1024MB RAM buffer, the write rate (measured in ~30 second intervals) hovers between 3893 Mbit/s and 4093 Mbit/s. CPU core loads were quite varying, 90%:25%:20%:20%sy (155%sy) typical peak and minimum 35%:30%:30%:30%sy (125%sy).

With a wr-nexgen writing onto the raw blank /dev/sdb device, the rate was initially between 4078 Mbit/s and 4966 Mbit/s, but then had long dips down to 3500 Mbit/s. CPU core loads were quite steady, 90%:43%:25%:0%sy (158%sy).

The second test: 12 disks, 3 behind each of the 4 PMPs. A hardware RAID-0 was created for each 3-disk PMP. These were then combined into an mdadm md0 with chunk size 512k. The wr-nexgen write rate to raw /dev/md0 improved marginally: peak 4900 Mbit/s, low 4260 Mbit/s. CPU core loads were quite steady, 85%:50%:25%:0%sy (160%sy), and the 25% CPU has additional 15%hi 20%si loads. With chunk size 64k the performance is lower, only 3700 Mbit/s right from the beginning, with 50%:35%:15%:0%sy (100%sy) CPU. Using a 2048k chunk size and performing 256kB writes, the rate is 5090 Mbit/s peak and 4540 Mbit/s low, with steady CPU 90%:40%:25%:0% (155%sy), plus the 25% CPU has 15%hi 19%si. Writing to XFS on the same RAID gives 4450 Mbit/s peak and 3960 Mbit/s low, with highly varying CPU 75%:40%:25%:25%sy (165%sy) and one CPU additionally at 8%hi 2%si.

The third test: the same 12 disks in the 3 disks/PMP grouping, but only one pair of PMPs is taken into the software RAID, i.e. just 6 disks. Chunk size 1024k. Curiously, this 2x3-disk write rate is 3400 Mbit/s peak and 3130 Mbit/s low. Yet earlier, with the 4x3-disk and 1x12-disk (1x11...) setups, the rate was not even close to double this.

One possibility for the low rate is that, since the RocketRaid 2522 has two mini-SAS connectors each with its own PCI-X controller and we are using only one of the miniSAS, the on-board PCI-X bus or CPU may be saturated at around 4-5 Gbps.

Still, around 4300 Mbit/s to XFS is nearly possible, using 4 3-disk hardware RAIDs combined into 1 software RAID with 1024/2048k chunk size and with large (256kB) writes.

RocketRaid config | wr-nexgen Mbit/s peak | wr-nexgen Mbit/s low | Average
theoretical 12 disk perf | 100MB/s*12 = 9600 Mbit/s | 4 SATA * 300MB/s = 9600 Mbit/s |
11 disks, 4 PMP 3+3+3+2, 1 HW RAID, /dev/sdb | 4093 Mbit/s XFS, 4966 Mbit/s raw | 3893 Mbit/s XFS, 3500/4078 Mbit/s raw |
12 disks, 4 PMP 3+3+3+3, 4 HW RAIDs, /dev/md0 512k | 4900 Mbit/s raw | 4260 Mbit/s raw |
12 disks, 4 PMP 3+3+3+3, 4 HW RAIDs, /dev/md0 64k | 3700 Mbit/s raw | ... |
12 disks, 4 PMP 3+3+3+3, 4 HW RAIDs, /dev/md0 2048k | 5090 Mbit/s raw if 256k writes, 4450 Mbit/s XFS in 256k writes | 4540 Mbit/s raw 256k, 3960 Mbit/s XFS in 256k writes |
6 disks, 4 PMP 3+3+*+*, 2 HW RAIDs, /dev/md0 1024k | 3400 Mbit/s raw | 3130 Mbit/s raw |
5 disks, 1 PMP, 1 HW RAID, /dev/sdb | 1856 Mbit/s raw | 1736 Mbit/s raw | 1794 Mbps
4 disks, 1 PMP, 1 HW RAID, /dev/sdc | 1942 Mbit/s raw | 1644 Mbit/s raw | 1771 Mbps

Log files are in 05082008.

06 August 2008

ADSA3GPX8-4E 4xeSATA

Plugged the ADSA3GPX8-4E 4xeSATA PCIe card into the Abidal nForce680i. The SATA PMPs could be connected with eSATA-to-SATA converter cables. Replaced the old wr-nexgen on Abidal with the newer wr-nexgen.c, which reports the rate in 0.5 s intervals and has XFS real-time support. Updated the disktune.sh script: it can disable NCQ, disable fancy drive I/O schedulers (CFQ->NOOP), tune sector/write size, and do other interesting things. In the end, however, all the tunings gave essentially zero benefit in write performance.

Configuration     | Mbit/s(dec) avg | Mbit/s(dec) max | Mbit/s(dec) min | Write config                 | Ctrl-C'ed after |
2 PMP : 5+5 disks | 3734            | 3843            | 3733            | 32k writes                   | 93s             |
3 PMP : 5+5+2     | 4374            | 5110            | 3962            | 32k writes                   | 566s            |
3 PMP : 5+5+2     | 4521            | 5490            | 4313            | 128k writes                  | 142s            |
3 PMP : 5+5+2     | 4553            | 5335            | 4402            | 128k writes, NOOP scheduler  | 107s            |
3 PMP : 4+4+4     | 4717            | 5285            | 3939            | 128k writes                  | 1778s           | log plot
3 PMP : 4+4+4     | 3985            | 4403            | 2615            | 128k writes, XFS             | 507s            | log plot
4 PMP : 3+3+3+3   | 4870            | 5334            | 3871            | 128k writes                  | 844s            | log plot

Entire screenlog.

Revisiting the RocketRaid 2522

The amug.org site has a review of the 2522 controller. They state average writing at 699 MB/s (5600 Mbit/s) with a 4-PM 16-disk setup. Their 2-PM 10-disk setup does 427 MB/s (3420 Mbit/s). Yesterday's 4-PM 11-disk RAID0 was 3500..4966 Mbit/s (438..620 MB/s). Yesterday's 4-PM 12-disk 4-HW-RAID0 with software /dev/md0 was 4500..5000 Mbit/s (562..625 MB/s).
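Since the review quotes MB/s and our logs use Mbit/s, a tiny conversion helper makes the comparison above easy to re-check. This is a sketch with illustrative names; it uses a plain factor of 8 and ignores any decimal versus binary megabyte difference.

```python
# Convert between MB/s (review figures) and Mbit/s (wr-nexgen figures).
# Assumption: a plain factor of 8, ignoring decimal vs. binary megabytes.

def mb_s_to_mbit_s(mb_s: float) -> float:
    return mb_s * 8


def mbit_s_to_mb_s(mbit_s: float) -> float:
    return mbit_s / 8


# Review figures in MB/s, as quoted in the paragraph above.
review = {"4 PM, 16 disks": 699, "2 PM, 10 disks": 427}
# Our measured ranges in Mbit/s (low, high).
ours = {"one HW RAID0": (3500, 4966), "4 HW RAID0 + md0": (4500, 5000)}

for name, mb_s in review.items():
    print(f"review {name:17s}: {mb_s} MB/s = {mb_s_to_mbit_s(mb_s):.0f} Mbit/s")
for name, (low, high) in ours.items():
    print(f"ours   {name:17s}: {mbit_s_to_mb_s(low):.1f}..{mbit_s_to_mb_s(high):.1f} MB/s")
```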

A RocketRaid 2522 setup with 2 PMs at each of the 2 miniSAS connectors and 3+3+3+3 disks configured into 4 HW-RAID0s: strongly fluctuating rates of 3470..5180 Mbit/s when using 1 wr-nexgen that writes blocks across the 4 RAIDs. With 4 wr-nexgen instances writing independently to the 4 RAIDs, each of them runs at a quite stable 1540 Mbit/s (6160 Mbit/s aggregate).

A RocketRaid 2522 setup with 4 PMs behind a single miniSAS connector and 3+3+3+3 disks configured into 4 HW-RAID0s: no noticeable difference compared to the dual-miniSAS case. The four writers run at 1530 Mbit/s each. See the corresponding 06aug2008 log and combined plot. About CPU load: three cores report 50%sy 30%wa 0%hi 0%si. The fourth reports 25%sy 2%wa 25%hi 45%si. Eight 'pdflush' instances are running.

The single-miniSAS and dual-miniSAS rates are essentially identical. Clearly the processor or PCI-X buses on the RocketRaid are not the bottleneck. Something else limits the single-writer performance to about 2 Gbps below the expected 4 * 1530 Mbps = 6.1 Gbps. The 12 on-board SATA ports of the nForce680a achieved around 5.5 Gbps earlier.

Assembling the 4 hardware RAID0s behind the single miniSAS into /dev/md0: the initial write rate is high but quickly drops to 3550 Mbit/s. Interestingly, there is only 1 'pdflush' instance. The rate occasionally peaks for several tens of seconds at 5300 Mbit/s, and in this case 'pdflush' consumes 100% CPU (6%us 54%sy 14%wa 26%si). When 'pdflush' drops idle, the rate drops to 3550 Mbit/s.

There is a good paper, Extreme Linux Performance Monitoring Part II. There are statistics for the 4 x hardware RAID0 in 1 software RAID configuration (the run at the end of the 06aug2008 log), and combined statistics of vmstat and other output for the single wr-nexgen test.

12 August 2008

Some tests on Abidal AMD2212, varying the CPU frequency, the HyperTransport frequency and the number of CPU cores ('nosmp' or disabling individual cores via /sys/devices/), to see whether it makes sense to use POSIX AIO. Configurations: 2 PMPs with 5+5 disks and 3 PMPs with 5+5+2 disks. The rate shown is the average rate after 140 seconds.

The screenlog is summarized in the 12aug2008_summary.txt.

bgchdiejfk
     | 1GHz | normal | HT   | DDR2 | nosmp | 2MB
xfs  | 2652 | 3761   | 3780 | n/a  | 3566  | n/a
raw  | 3156 | 3809   | 3794 | 3802 | 3557  | 3802

bcdefghijk
xfs  2618  3776
raw  2980  3807

bgchldiejfkm
xfs  n/a
raw  4530

1GHz: CPU clock reduced to half; HT: 1 GHz CPU-CPU, 600 MHz CPU-SB1/SB2; DDR2: 667 MHz upped to 800 MHz; nosmp: no multiprocessor, just 1 core for OS+apps; 2MB: 2 MByte write size

9 September 2008

Useful commands:

$ watch -n2 iostat -d -m /dev/md0 /dev/sd{a,b,c,d} 1 2 
$ vmstat 5

11 September 2008

Short summary of the three motherboards tested over the last few days: Asus Striker II Extreme, Asus Rampage Extreme and P5Q PRO.

The first mobo to fall into our hands was the P5Q Pro, and the main purpose of this board was to test the functionality of its new ICH10R SB chip. Even though the specification does not clearly claim PMP support, we still gave Intel a chance ;). As the R flag shows, the board has PMP support, BUT it is not FIS-based, so it does not help at all. The performance of a single disk (114 MB/s) is better than that of 4 disks (104 MB/s).

Oops, I forgot to mention, we got new disks: Samsung F1 1 TB. According to the benchmarks, their performance is higher than that of the previous 750GB model. Returning to the P5Q board, 6 disks give a total output rate of 425 MB/s. With a bit of tweaking we can get 540 MB/s, but not continuously (Disabling #NCQD -> see file disktune.sh). Other tests, such as the 10Gbps board + Silicon PCI-E, gave very good results. So it is a cheap option that gives good results for a 4Gbps recorder system, but with limitations.

Next to be tested was the Striker II. We had a few memory problems with the DDR3 modules in both mobos; after struggling for a while we realized that the memtest version was too old for the new RAM modules. Once we upgraded memtest to the latest version, the problems disappeared.

Single disk 112 MB/s (900 Mbps), 6 disks 451 MB/s (4800 Mbps/4675 Mbps)

10 disks - 6+4 - 618 MB/sec 6991/6550 Mbps

9 disks - 3+3+3 - 630 MB/sec 5160 Mbps

9 disks - 3+2+2+2 - 618 MB/sec 5150 Mbps

10 disks - 3+3+2+2 - 666 MB/sec 5270 Mbps

12 disks - 3+3+3+3 - 681 MB/sec 5519 Mbps

Chart summary

Boards: L1N 64, P5Q Pro, Striker II, Rampage
single disk: 114 MB/s, 114 MB/s, 114 MB/s
six disks: 425 MB/s, 451 MB/s

SATA sustained recording rates vs amount of disks

Sustained recording rate in RAID disks for different disk controllers.

07 July 2009

Did some tests with newish kernel versions like Ubuntu Karmic 2.6.30. In addition as these new kernels now include support for the ext4 file system, I benchmarked that one as well. The kernels were from http://kernel.ubuntu.com/~kernel-ppa/mainline/ and installed on the Abidal 2xAMD2212 computer still running Ubuntu Intrepid.

The tests used a 20 x 1TB disk RAID0 (4G-EXPReS) connected to the Addonics 4*eSATA controller. Because current ext4 tools are limited to a ≤16TB file system size, the 20TB were GPT-partitioned to contain only a 14TB partition. The power draw of the diskpack has increased over time for unknown reasons, and the +5V supply to the PMP boards inside the diskpack is now only around 4.8V when idle and 4.65V under load. This seems to have caused some block I/O outages and writing 'pauses' seen in the logs.

The ext4 was formatted with supposedly optimal settings for 20-disk 1024kB-chunksize systems: -b 4096 -E stride=256,stripe-width=$((20*256)) -i 131072
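The stride and stripe-width values above follow from the RAID geometry: stride is the chunk size in file system blocks, and stripe-width is stride times the number of data disks. A minimal sketch of that arithmetic is shown below; the helper names are ours, and the bytes-per-inode option (-i 131072) is independent of the geometry and therefore left out.

```python
# Derive the ext4 "-E stride=...,stripe-width=..." values for a RAID0.
# stride       = chunk size / file system block size (in blocks)
# stripe-width = stride * number of data disks (all disks for RAID0)
# Helper names are illustrative, not part of any tool.

def raid_geometry(n_disks: int, chunk_kib: int, block_bytes: int = 4096) -> tuple[int, int]:
    stride = chunk_kib * 1024 // block_bytes
    stripe_width = stride * n_disks
    return stride, stripe_width


def mke2fs_options(n_disks: int, chunk_kib: int, block_bytes: int = 4096) -> str:
    stride, stripe_width = raid_geometry(n_disks, chunk_kib, block_bytes)
    return f"-b {block_bytes} -E stride={stride},stripe-width={stripe_width}"


# The 20-disk array with 1024 kB chunks from the notes.
print(mke2fs_options(20, 1024))
```

For the 20-disk, 1024 kB chunk array this gives the same stride=256 and stripe-width=5120 as the shell expression $((20*256)).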

The resulting files can be found under matlab_logplot.

File system | 2.6.27        | 2.6.30   | 2.6.31
Raw         | over 5.0 Gbps | 4.5 Gbps | 4.3 Gbps
XFS         | 5.0 Gbps      | 3.5 Gbps | not fully tested, seems 3.6 Gbps
Ext4        | not supported | 3.2 Gbps | 3.2 Gbps

This work has received financial support under the EU FP6 Integrated Infrastructure Initiative contract number #026642, EXPReS.