As there were many interesting answers, I will quote all of them. Basically, I can expect close to 100 Mbytes/s with jumbo frames on a SunFire (this does not take the disk backend into account; memory-to-memory transfer only). Thank you all.

--------------------------------------------------------------

Joe Fletcher's answers:

On a V880 8x900 we did some basic tests using ftp which gave us about 45 MB/s. This put about a 10-15% overhead on the machine (i.e. it takes about a whole UltraSPARC III CPU to drive the card in any serious sense). This is dumping data from an FC array down separate HBAs to another array volume.

...

Just checked some old results from another site I used to run. Probably not very interesting to you, but on an Alpha ES40 4xEV6 serving a group of Intel clients we managed to get about 80 MB/s. The Alpha was linked into a 3Com switch via gigabit, with the clients each on a 100FD port of the same switch. Each client was transferring a different set of files, some via ftp, some via the SMB server software (ASU). We could get similar results using two Alphas with memory filesystems mounted, which allowed us to take the storage out of the picture. Not particularly representative of the real world, but we just wanted to see how fast it was capable of going. I suspect file caching helped quite a lot where the PC clients were concerned.

--------------------------------------------------------------

Christophe Dupre's answer:

What ethernet card do you have in your server? Sun has at least two chipsets used in gigabit cards: GEM (ge0 interface) and Cassini (Sun GigaSwift, ce0 interface). The GEM is older, pretty much all the processing is done by the CPU, and the throughput isn't that great. The Cassini is much better and offloads some processing (IP and TCP checksums) to the card, yielding much better throughput. Note that GEM is 1000Base-SX only, while Cassini does both fiber and copper.

What do you use to compute the throughput? I use iperf, and between two servers (both UltraSPARC-II 400MHz, both dual CPU), each with a GEM-based card connected to a Cisco 4506 switch, I get 85 Mbit/s for a single connection and an aggregate of 94 Mbit/s, with about 40% kernel time according to top. This is using an MTU of 1500 (the GEM and the Cisco switch don't do jumbo frames). The TCP window size was 64 KByte.

By comparison, iperf runs between a Sun UltraSPARC-III machine with a Sun GigaSwift and a Dell PowerEdge 2650 with a Broadcom 1000TX card, connected through the same Cisco Catalyst and using 48 KByte TCP windows, yield 480 Mbit/s.

So before upgrading the CPU you should make sure you have a card that offloads the CPU, like the GigaSwift. Next, jumbo frames don't matter much - support is not standardized, not much equipment supports it, and you can get pretty good performance without it. I'm not sure how much CPU speed is needed, though. I'll install a GigaSwift in an UltraSPARC-II box soon; I can tell you the performance difference then.

--------------------------------------------------------------

Jason Santos's answer:

I would suspect that your bottleneck on the E10K would be the SBus interface, not CPU speed. With a GEM or GigaSwift PCI card in a 750MHz 6800, we get about 60 MB/s over NFS with a single thread. Raw UDP or TCP throughput would be much higher, although I never tested it. Let me test now, stand by...
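For reference, a raw memory-to-memory ttcp run like the one below is normally started with the receiver in sink mode on one host and the transmitter sourcing a test pattern from memory on the other. The buffer sizes and host name here simply mirror the output that follows; exact option handling varies a little between ttcp builds:

    ttcp -r -s                               # receiving host: discard whatever arrives (default port 5001)
    ttcp -t -s -l 32768 -n 32768 nbmaster    # sender: 32768 buffers of 32768 bytes (1 GByte) to host nbmaster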
This is a quick test from a 4x750MHz 6800 to a 4x1200MHz V880 (no network tuning, single thread):

ttcp-t: buflen=32768, nbuf=32768, align=16384/0, port=5001 tcp -> nbmaster
ttcp-t: socket
ttcp-t: connect
ttcp-t: 1073741824 bytes in 23.59 real seconds = 44441.28 KB/sec +++
ttcp-t: 1073741824 bytes in 23.16 CPU seconds = 45275.30 KB/cpu sec
ttcp-t: 32768 I/O calls, msec/call = 0.74, calls/sec = 1388.79
ttcp-t: 0.1user 23.0sys 0:23real 98% 0i+0d 0maxrss 0+0pf 3756+261csw
ttcp-t: buffer address 0x74000

The fastest Gigabit transfers I have ever seen were from an IBM x345 (dual Intel Xeon 2.4GHz) over NFS to a NetApp FAS960; I was able to get over 100 MB/sec, which is 80% of the theoretical maximum of 125 MB/sec.

--------------------------------------------------------------

Paul Theodoropoulos's answer:

Sun's rule of thumb from the UltraSPARC-II era was that you should have 300MHz of UltraSPARC-II horsepower per gigabit adapter. That's 'dedicated' horsepower - if you had one 300MHz CPU and one gigabit adapter, you'd have no horsepower to spare for your applications. In practice, of course, the gigabit link gets throttled down and the horsepower is shared. But I would expect approximately the same requirements with UltraSPARC-III, frankly.

--------------------------------------------------------------

Alex Madden's answer:

http://www.sun.com/blueprints/0203/817-1657.pdf

--------------------------------------------------------------

JV's answer:

#2) Throughput may depend more on the underlying storage architecture's ability to READ. You will get better results with hardware RAID 0/1 than with software RAID like DiskSuite or VxVM.

#3) Copper or optical GigE? I use optical, but I just got V240s last month, so I am beginning to experiment with their ce interfaces.

#4) On optical ge, with 14-column Veritas stripes, on large-ish dbf files (1.5-2GB), 6x336MHz CPUs, I can get 45 MB/sec with 35% sys. I haven't had a chance to tune and test my 10-12 CPU UltraSPARC-II (optical) or 2-CPU UltraSPARC-III V240 (copper ce) boxes.

--------------------------------------------------------------

Tim Chipman's answer:

You might want to use the "ttcp" utility to test TCP bandwidth. It is more likely to represent the "best case scenario" throughput that is in keeping with statements like "gig-ether can do 100 Mbytes/sec" :-)

We did a bit of testing here a while back, and I'm appending the info below as a general reference, for what use it may be. Test boxes were an Athlon MP running either Solaris x86 or Linux, and an UltraSPARC-II running Solaris 8. Note, based on my experience, it seems unlikely you will ever get "real world data transfer" much above 50-55 Mbytes/sec over gig-ether. "ttcp" benchmarks are one thing, but real-world protocols are another.

NOTE: testing done here using two dual-Athlon systems, identified as follows:

wulftest = Red Hat 8 (dual 1800MHz, 1 GB RAM, 64-bit PCI)
wulf2 = Red Hat 8 (dual 2000MHz, 1 GB RAM, 64-bit PCI)
thore = Solaris 8 x86 (dual 2000MHz, 1 GB RAM, 64-bit PCI)
(note - wulf2 and thore are actually the same system with two different HDDs to boot the alternate OSes)
ultra5 = 270MHz Ultra5 (NB: 32-bit PCI bandwidth only)

Gig-ether NICs being tested are all 64-bit PCI / Cat5 cards:

SysKonnect SK-9821
3Com 3C996B-T (Broadcom chipset)

(Note: we had 2 x SysKonnect NICs and 1 x 3Com on hand, so we didn't test 3Com<->3Com performance.)
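Jumbo frames (MTU 9000 vs. 1500) are one of the parameters tuned in the results that follow. Purely as an illustration - the interface names here are hypothetical and each driver has its own way of enabling jumbo support - the larger MTU is typically applied per interface along these lines:

    ifconfig eth0 mtu 9000    # Linux (Red Hat 8 era); assumes the gig-ether NIC shows up as eth0
    ifconfig ce0 mtu 9000     # Solaris with a Cassini/GigaSwift ce interface; the driver must
                              # also be configured to accept jumbo frames before this takes effect

Both ends of the link (and any switch in the path) have to support the larger MTU, otherwise oversized frames simply will not get through.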
Software used for testing was (1) TTCP and (2) NetBackup (for info on TTCP, see http://www.pcausa.com/Utilities/pcattcp.htm).

Parameters tuned include jumbo frames (MTU of 1500 vs 9000) and combinations of NIC<->NIC and system<->system. The connection between NICs was made with a crossover cable, appropriately wired (all strands) such that gig-ether was operational.

Note these numbers are NOT "comprehensive", i.e. NOT every combination of tunable parameters has been attempted / documented here. Sorry about that. Hopefully "something is better than nothing".

[TTCP results]

SysKonnect <-> SysKonnect = 77 MB/s
* wulftest with SysKonnect (Red Hat 8)
* thore with SysKonnect (Solaris x86)
* Jumbo frames don't affect speed, but offload the systems by around 20-40% of CPU load.

SysKonnect <-> 3Com = 78 MB/s
* wulftest with SysKonnect (Red Hat 8)
* wulf2 with 3Com (Red Hat 8)
* MTU = 1500

SysKonnect <-> 3Com = 97 MB/s
* wulftest with SysKonnect (Red Hat 8)
* wulf2 with 3Com (Red Hat 8)
* MTU = 9000

ultra5 <-> wulftest tests with TTCP (SysKonnect <-> SysKonnect NICs):
* with jumbo frames: 25% CPU load on the Ultra5, 29 MB/s
* without jumbo frames: 60% CPU load on the Ultra5, 17 MB/s

[NetBackup results]

Large ASCII file (5 GB) = 50 MB/s
* wulftest with SysKonnect (Red Hat 8)
* thore with 3Com (Solaris x86)
* MTU 1500

System backup (OS files, binaries) = 11 MB/s
* wulftest with SysKonnect (Red Hat 8)
* thore with 3Com (Solaris x86)
* MTU 1500

--------------------------------------------------------------

ORIGINAL QUESTION:

Hi,

The basic question is: what effective throughput can I expect on a Gigabit Ethernet link with UltraSPARC-III CPUs, with or without jumbo frame support, with or without multithreaded transfer?

I ask this because with UltraSPARC-II CPUs (E10K) and a GE link (without jumbo frame support) we couldn't get more than:

- 15 Mbytes/s with monothreaded transfer
- 55 Mbytes/s with multithreaded transfer (the best rate was reached with 10 threads)

(We measured application throughput, that is to say TCP throughput.)

As you can see, the CPU overhead with the 1500-byte MTU was so high (truss showed 80% kernel time) that we had to multithread the transfer to reach the best throughput (55 Mbytes/s). Unfortunately we were far from the theoretical limit (100 Mbytes/s?), even though there were still free CPU resources (50%), and I can't determine whether this was caused by the small MTU, the poor US-II throughput, or both.

I think jumbo frames could increase the throughput and lower the CPU overhead, but by how much? Will the US-III throughput help much? Is there any chance of reaching the 100 Mbytes/s limit?

Thanks for your feedback, I will summarize.

---
Sebastien DAUBIGNE
sdaubigne@bordeaux-bersol.sema.slb.com - (+33)5.57.26.56.36
SchlumbergerSema - SGS/DWH/Pessac
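A closing note on the question above: the single-stream vs. multithreaded comparison is easy to reproduce with iperf rather than an application-level copy. The host name, window size, duration and stream count below are placeholders, not the original test setup:

    iperf -s -w 64K                          # on the receiving host
    iperf -c e10k-host -w 64K -t 30          # single TCP stream, 64 KByte window
    iperf -c e10k-host -w 64K -t 30 -P 10    # 10 parallel streams, matching the multithreaded case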