From: Drchemp@aol.com
Date: Sun, 30 Sep 2001 23:52:34 -0400 (EDT)
Subject: Message Passing in Linux 2.4.2 Kernel
To: nwchem-users@emsl.pnl.gov

Following an upgrade to the 2.4.2 kernel in RH Linux 7.1 to eliminate NaNs, the ratio of CPU time to wall-clock time has become very poor: wall-clock time now greatly exceeds CPU time. This reminded me of the TCP Performance Fix for Short Messages, a patch to the ipv4 routines from Josip Loncaric at www.icase.edu, which worked very well when applied to the 2.2 kernel (which suffers from the NaNs). Now, however, according to Josip, the TCP stack was thoroughly changed in the 2.4 kernel, and I have seen this for myself in my trouble trying to paste the patched 2.2 kernel ipv4 routines into a recompilation of the new 2.4.2 kernel.

I'm running NWChem 4.0, recompiled under a running 2.4.2 SMP kernel, on a pair of dual PIII Xeon 550 MHz boxes with Fast Ethernet and one GB of RAM per box. That is, four processors in total.

In usr/src/linux-2.4/net there is a reference to TUNABLE in the RH 7.1 install, apparently relating to some optimization capability.

Does anyone have any suggestions on how to return the message passing to better efficiency, assuming I have not overlooked the real cause of the poor ratio of CPU to wall time?

Thank you,
Michael Oberlander
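
For anyone wanting to reproduce the measurement, the CPU-to-wall ratio described above can be checked for a single process with a short script. This is only a minimal sketch and not part of NWChem: the names cpu_wall_ratio, waiting_job and busy_job are made up for illustration, the sleep merely stands in for a process blocked on message passing, and only the CPU time of the calling process is counted.

```python
import time


def cpu_wall_ratio(work):
    """Run work() once and return (cpu_seconds, wall_seconds, cpu/wall)."""
    wall_start = time.perf_counter()   # elapsed (wall-clock) time
    cpu_start = time.process_time()    # CPU time of this process only
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    ratio = cpu / wall if wall > 0 else float("nan")
    return cpu, wall, ratio


def waiting_job():
    # Hypothetical stand-in for a process stalled on slow message passing.
    time.sleep(0.2)


def busy_job():
    # Hypothetical stand-in for a compute-bound process.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


if __name__ == "__main__":
    for name, job in (("waiting", waiting_job), ("busy", busy_job)):
        cpu, wall, ratio = cpu_wall_ratio(job)
        print(f"{name}: cpu={cpu:.3f}s wall={wall:.3f}s ratio={ratio:.2f}")
```

A ratio near 1 suggests the process is compute-bound; a ratio well below 1 means most of the elapsed time is spent waiting, which would be consistent with (but does not prove) a communication bottleneck.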