From: "zoran matovic"
To: Jörg Saßmannshausen
Subject: Re: [NWCHEM] Problems with parallel running
Date: Sun, 3 Feb 2008 12:26:21 +0100

Dear Jörg,

Thanks for your answer. First, I am not very familiar with Linux, and almost
all the settings were done by a colleague from the university center who
unfortunately (as usual) doesn't have free time. So I have started learning
in order to be able to fix things myself.

I am using PBS Torque (I don't know whether it is OpenPBS or not; it was
simply installed by the university centre). I think I understand how to run
NWChem from the command line (I haven't tried yet), as I work in a similar
way with PC GAMESS. However, I wish to use ECCE, as it is a suitable way to
run jobs and analyze the results file.
How do you view and analyze the NWChem output file if you are not using ECCE?

So you think that my PBS Torque is good enough for NWChem 5.1, is that right?
If yes, which download of NWChem 5.1 should I install? If not, what should I
do to upgrade?

Thanks once again for your answer.

Have a nice day,
Zoran

----- Original Message -----
From: "Jörg Saßmannshausen"
Sent: Saturday, February 02, 2008 4:28 PM
Subject: Re: [NWCHEM] Problems with parallel running

> Dear Zoran,
>
> I am not using ECCE but NWChem.
> How did you submit the job to the queue?
> If you want to use one node with 4 CPUs, with PBS-type schedulers you
> would use:
>
> qsub -l nodes=1:ppn=4 ...
>
> where nodes is the number of nodes (physical enclosures) and ppn is the
> number of workers (i.e. CPUs) per node.
>
> So if you want to use all 16 CPUs it would be:
>
> qsub -l nodes=4:ppn=4 ...
>
> As for MPICH: I would recommend MPICH2 or OpenMPI. MPICH2 uses SHM for
> processes within a box/node and not the TCP stack (as far as I understand
> it). OpenMPI has the option 'processor affinity' which, at least from
> what I gathered from other mailing lists, can improve the speed
> significantly. However, as always, your mileage might vary. Furthermore,
> for OpenMPI the mpiexec is adapted to the PBS-type scheduler, whereas for
> MPICH2 you need to install a separate mpiexec. The mpiexec from
> MPICH/MPICH2 does not work with the PBS scheduler in a straightforward
> way! Unfortunately, there are many programs around called mpiexec, which
> adds to the confusion.
>
> Let me know if you have any more problems with PBS, at least here I can
> help ;-)
>
> By the way, have you got OpenPBS or really PBS? If you have the former, I
> would recommend upgrading to Torque, as OpenPBS is no longer supported.
>
> As for MPICH: you can have as many different MPICH versions floating
> around as you like, you only have to make sure that they are in different
> locations.
> You might work with LD_LIBRARY_PATH at one point so you get the right
> libraries, but that is a minor issue.
>
> Best wishes from Graz
>
> Jörg
>
> On Saturday, 2 February 2008 at 15:39, zoran matovic wrote:
>> Dear NWCHEM users,
>>
>> Last year I installed NWChem 5.0 and ECCE 4.02 (as GUI) on our 16-CPU
>> Beowulf cluster (4 Intel Core 2 Quad machines; Scientific Linux 4.5; PBS
>> batch system; MPICH). The same thing has happened to me since the
>> beginning: when I want to work with more than 4 CPUs, the program starts
>> either on only one machine (4 CPUs) or with one machine executing as
>> many processes as the number of requested processors. I have now
>> installed ECCE 4.5.1, but unfortunately I installed it twice in two
>> different directories, so when I change the adf_set_env.sh path to the
>> new path and try to start ECCE, it gives a warning that port 8080 is
>> already occupied. So I need help with two problems: first, how to force
>> parallel mode on all 4 units (16 CPUs), and second, how to fix the
>> situation with ECCE. Third, I wish to ask which version of the new
>> NWCHEM 4.5.1 I should use to run NWChem in parallel on this cluster
>> (note that I use MPICH but not MPICH2), and, if I must install MPICH2,
>> how difficult is it and will it interfere with other parallel programs
>> that work with MPICH?
>>
>> HELP PLEASE.
>>
>> Thanks in advance
>>
>> Zoran
>>
>> web-site http://www.pmf.kg.ac.yu/~zmatovic/index.htm
>>
>> Dr Zoran D. Matovic
>> professor of BioInorganic chemistry
>> Department of Chemistry
>> Faculty of Science
>> University of Kragujevac
>> 34000 Kragujevac
>> Serbia
>>
>> Tel/Fax + 381 34 336 223 / 381 34 335 040
>
> --
> *************************************************************
> Jörg Saßmannshausen
> Institut für chemische Technologie organischer Stoffe
> TU-Graz
> Stremayrgasse 16
> 8010 Graz
> Austria
>
> phone: +43 (0)316 873 8954
> fax: +43 (0)316 873 4959
> homepage: http://sassy.formativ.net/
>
> Please avoid sending me Word or PowerPoint attachments.
> See http://www.gnu.org/philosophy/no-word-attachments.html
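
The qsub resource request described in the quoted reply can be sketched as a small shell snippet. This is only an illustration under assumptions: the script name job.sh and the variable names NODES, PPN, TOTAL, RESOURCE and SLOTS are hypothetical, and the values simply mirror the 4 x 4-CPU cluster from this thread. The qsub line is printed rather than executed, so nothing is submitted. PBS_NODEFILE is the node list that PBS/Torque provides inside a running job; the check is skipped when it is absent.

```shell
#!/bin/sh
# Sketch only: build the PBS resource request for 4 nodes with 4 CPUs each.
NODES=4   # number of machines (physical enclosures)
PPN=4     # processes per node (one per CPU core here)

TOTAL=$((NODES * PPN))
RESOURCE="nodes=${NODES}:ppn=${PPN}"

# Print the submission command instead of running it.
# job.sh is a hypothetical job script name.
echo "qsub -l ${RESOURCE} job.sh"
echo "processes requested: ${TOTAL}"

# Inside a running PBS/Torque job, PBS_NODEFILE lists one line per
# allocated processor slot. Outside a job this block does nothing.
if [ -n "${PBS_NODEFILE:-}" ] && [ -f "${PBS_NODEFILE}" ]; then
    SLOTS=$(wc -l < "${PBS_NODEFILE}")
    echo "slots allocated by PBS: ${SLOTS}"
fi
```

If the slot count reported inside a job matches the request but all processes still start on one machine, the MPI launcher may not be reading the PBS node list, which would fit the mpiexec caveat in the quoted reply.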