Last week I asked why FTP's to localhost on Solaris2.9 machines were noticeably slower than on Solaris2.8 machines with similar hardware/configuration. This was actually part of a bigger problem of slow/erratic throughput over Gigabit links during testing.

I did not hear back from anyone ... but I figured out the problem myself, so here's my summary back to the list. In a nutshell, the Solaris2.9 in.ftpd appears to be much slower than the 2.8 version - possibly because it is using wu-ftpd stuff now (?)

I also came across what I think is a minor bug in the ce drivers; I will send a separate Email about that.

I'll see if someone here can formally submit both of these to Sun as bug reports, but I can't promise that will happen, so if someone else on the list wants to submit them to Sun, that would be great since these should be fixed.

Thanx,
alek
http:/www.komar.org/

[Report below is sanitized ... I'll use beer names for the 2.9 machines]

Subject: GigE is fine ... it's the *&^%$#@! Solaris2.9 in.ftpd ...

OK ... after **ALL** that work, I'm fairly certain the problem is with the Solaris2.9 in.ftpd, and the data clearly supports this.

Recall that (for the same hardware), 2.9->2.9 GigE ftp *PUTS* were very erratic in performance (often slower than 100baseT). Also, recall that ftp's to localhost were noticeably slower on 2.9 than 2.8. And the summary of what I "learned" from spending time with Dave on Friday is that the network gear looked clean, but the netstat -k output seemed to indicate a window size reduction going on. And rcp data seemed pretty decent all around.

So I searched SunSolve (for the bazillionth time) and started looking for stuff like "in.ftpd", "tcp", and other stuff (since I was thinking it's not the network driver itself, but something in the network stack). Basically zippo that seems to relate to this problem.
But I kept thinking about that darn difference between FTP's to localhost, and finally hit on the idea of trying the Solaris2.8 ftp & in.ftpd on a Solaris2.9 host - BINGO!!!!! ftp (the client side program) made no difference ... but when I put the Solaris2.8 /usr/sbin/in.ftpd in place on the 2.9 hops machine - BAMM; everything ROCKED!!!

Attached are some results that clearly indicate that the 2.9 in.ftpd appears to be the culprit ... and only for ftp PUTS ... GETS are mostly OK. Since we rarely use ftp in production, I don't see a need to patch our systems (eventually Sun will figure this out and we'll pick it up in the next recommended patch set), so I'd say based on this, we are SOLID with GigE.

I realize we spent a bit of time on this, but we did learn some other "good/applicable" things such as the speed/duplex autoneg, pause stuff, and infinite_burst needing to be set to 1 to get on both interfaces (I'm convinced this is a Solaris bug). I also like "proving" a "good" benchmark to test/debug other GigE implementations, as I'm sure we'll have more of that in the future.

alek

First, let's review the machines/setup with GigE connections:

HOST        Private IP
SUN-2.8     192.168.1.22
HP-11.11    192.168.1.20
barley      192.168.1.23
hops        192.168.1.24
malt        192.168.1.25
h2o         NONE

The PRIVATE network is a copper GigE SWITCH with some other machines connected that are not shown here. There is probably "some" traffic on it from other machines, but probably little impact.

HP-11.11 is a "big" HP running HP-UX11.11. barley, SUN-2.8, hops, & malt are Sun280R's that are fairly "quiet" right now. SUN-2.8 is running Solaris2.8 and has 2 GBytes of RAM. Barley, hops, & malt are running Solaris2.9 and have 4 GBytes of RAM. All are dual-CPU.

As alluded to above, the latest GigE patches have been recently applied. The only tweak to the driver (see /kernel/drv/ce.conf) is to turn off auto-negotiation/all speeds/duplex and force 1000FULL.
For my tests, I did the following:

- Create a 100 MByte file by cat'ing the netscape executable - /tmp/alek-100
- Fire up FTP from the source to the target machine - use binary mode
- Starting from /tmp on the source, do a PUT/GET to /dev/null - i.e.:

        $ cd /tmp
        $ ftp TARGET
        (login)
        ftp> binary             (ensure binary mode)
  PUT   ftp> lcd /tmp
  PUT   ftp> cd /dev
  PUT   ftp> put alek-100 null
  GET   ftp> lcd /dev
  GET   ftp> cd /tmp
  GET   ftp> get alek-100 null

- Repeat MULTIPLE TIMES (>5), looking for repeatability and also making sure that the 100 MByte file is truly RAM resident. I'm pretty certain that FTP itself does not do any caching (whereas NFS does). I threw out the slowest and fastest times as outliers.

Since /dev/null is a "bit bucket", this should be a reasonable test of the fastest possible performance, since no writes are required on the target side, and (assuming the file is RAM resident on the source) the limits are the ability of the network/wire and the machine/IP stacks.

Here is a table showing the results I got. The "TARGET" column shows what the FTP connection went *TO* (or where the file was PUT), with the source machine being where the FTP originated from. The numbers show throughput (as reported by FTP) in MBytes/second. Absolute theoretical max would be on the order of 100 MBytes/second. Note that this implies almost a *ONE* second transfer time for the 100 MByte file, so for further testing, I could use larger files if need be - fast machines/networks require BIG BIG files!!!!

Barley: "stock" in.ftpd from build CD's (I think 2002/12?)
Hops:   Solaris2.8 in.ftpd
Malt:   Solaris2.9 in.ftpd patch 114564-01 (security related)

                                      SOURCE MACHINE
TARGET                SUN-2.8     HP-11.11    Barley      Hops        Malt
PUT-LOOPBACK          133-135*2   331-336     67-77       146-148*2   67-76
GET-LOOPBACK          122-123*2   360-364     147-150*1   132-133     142-147*1
PUT-PRIVATE-SUN-2.8   N/A         67-68       56-57       57-58       57-59
GET-PRIVATE-SUN-2.8   N/A         59-60       57-58       57-58       56-58
PUT-PRIVATE-HP-11.11  62-63       N/A         70-70       70-70       70-70
GET-PRIVATE-HP-11.11  71-72       N/A         52-58       54-57       53-57
PUT-PRIVATE-barley    52-56       51-53       N/A         4-52*3      7-54*3
GET-PRIVATE-barley    59-60       67-68       N/A         63-67*1     65-67*1
PUT-PRIVATE-hops      44-47*4     68-69       51-56*2     N/A         63-66*2
GET-PRIVATE-hops      59-60       68-68       61-66*2     N/A         65-66*2
PUT-PRIVATE-malt      58-61       52-54*4     8-52*3      6-56*3      N/A
GET-PRIVATE-malt      59-60       67-68       65-67*1     65-67*1     N/A

*1: Note that GETS are noticeably quicker than PUTS on 2.9 in.ftpd
*2: Solaris 2.8 in.ftpd is noticeably quicker for PUTS than 2.9!
*3: Example of Solaris2.9 in.ftpd showing erratic PUT performance.
*4: These data points aren't terrible, but a little odd ...

Note that with the Solaris2.8 in.ftpd, performance is rock-solid consistent/good/fast on the PRIVATE network.

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Mon May 12 13:59:55 2003
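
For anyone who wants to repeat this kind of test, here is a rough Python sketch of the bookkeeping described in the report: throw out the slowest and fastest run, report the remaining low-high range, and work out how long the 100 MByte file takes at a given rate. This is not what was used for the numbers above (those were read off the ftp client by hand). The function names, the sample run values, and the 2x "erratic" cutoff are all assumptions made up for illustration; only the 100 MByte file size and the 4-52 and 146-148 ranges come from the report.

```python
FILE_MBYTES = 100  # size of the test file used in the report


def trimmed_range(rates):
    """Discard the single slowest and single fastest run; return (low, high) of the rest."""
    if len(rates) < 3:
        raise ValueError("need at least 3 runs to discard both extremes")
    kept = sorted(rates)[1:-1]
    return kept[0], kept[-1]


def transfer_seconds(rate_mbytes_per_s, size_mbytes=FILE_MBYTES):
    """Seconds needed to move the test file at the given throughput."""
    return size_mbytes / rate_mbytes_per_s


def is_erratic(low, high, ratio=2.0):
    """Assumed rule of thumb (not from the report): the high end of the
    range is more than `ratio` times the low end."""
    return high > ratio * low


if __name__ == "__main__":
    made_up_runs = [55, 57, 56, 58, 12, 70]  # illustrative values, NOT measured data
    print(trimmed_range(made_up_runs))       # (55, 58)
    print(transfer_seconds(100))             # 1.0 second at the theoretical max
    print(is_erratic(4, 52), is_erratic(146, 148))  # True False
```

With a cutoff like this, the *3 entries (4-52, 7-54, 8-52, 6-56) get flagged while the Solaris2.8 in.ftpd ranges do not. The one-second result at 100 MBytes/second is the reason larger test files would be worth using on fast links.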