From owner-nwchem-users@emsl.pnl.gov Tue Feb 5 01:04:47 2008
From: Jörg Saßmannshausen
Organization: TU-Graz, ICTOS
To: nwchem-users@emsl.pnl.gov
Subject: Re: [NWCHEM] Problems with parallel running
Date: Tue, 5 Feb 2008 10:04:25 +0100

Dear Zoran,

as Dunyou already pointed out, you need to install NWChem from source in order to use _your_ MPI interface. That explains your problems with the parallel runs.

Torque is fine as a scheduler; we are using it as well. As for your MPI, MPICH would be fine, though personally I tend to use MPICH2. One or the other needs to be installed on your system.

As for the installation, if you follow the instructions to the letter, not much can go wrong. As you are using an Intel system, I would recommend getting a copy of the Intel MKL library. It may well be installed on your system already; I would check /opt for it or ask your system administrator. Alternatively, you could use ATLAS. I would not recommend the generic BLAS which comes with NWChem, as it is usually not optimized and your calculations will therefore be slow (very slow if you are unlucky).

Assuming you are using a 64-bit distribution, you really need to do the 64_to_32 conversion as mentioned in the installation script, and run a

  make clean

after that step. Unfortunately, this is not explicitly mentioned (maybe it has been corrected in the latest version), and that is where I went wrong. You do not need Torque for the installation itself, only for running NWChem through the queueing system.
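To give you an idea of what such a build could look like: below is a minimal sketch of the environment I would set before compiling against MPICH2 and MKL on a 64-bit Linux box. All the paths (the source tree under /home/zoran, /usr/local/mpich2, /opt/intel/mkl) and the exact MPI/MKL link lines are only examples for illustration and will be different on your cluster, so treat it as a starting point rather than a recipe:

  # environment for building NWChem from source with MPI and MKL (adjust all paths!)
  export NWCHEM_TOP=/home/zoran/nwchem-5.1      # where the source was unpacked
  export NWCHEM_TARGET=LINUX64
  export NWCHEM_MODULES=all
  export USE_MPI=y
  export USE_MPIF=y
  export MPI_LOC=/usr/local/mpich2
  export MPI_INCLUDE=$MPI_LOC/include
  export MPI_LIB=$MPI_LOC/lib
  export LIBMPI="-lmpich -lpthread"             # check 'mpif77 -show' for the real list
  export USE_64TO32=y                           # 32-bit integer BLAS on a 64-bit machine
  export BLASOPT="-L/opt/intel/mkl/lib/em64t -lmkl -lguide -lpthread"

  cd $NWCHEM_TOP/src
  make nwchem_config
  make 64_to_32
  make clean        # the step that is easy to miss
  make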
Don't get too frustrated when things do not go smoothly at the beginning; that is normal, we have all been there at some point. If you run into problems, check the archive of the list (some questions come up periodically); if you cannot find an answer there, ask the list.

I hope this helps a bit.

Jörg

On Sunday, 3 February 2008 12:26, zoran matovic wrote:
> Dear Jörg,
> Thanks for the answer.
> First, I am not very familiar with Linux, and almost all the settings were
> done by a colleague from the university computing centre who unfortunately
> (as usual) has no free time. So I have started to learn, to be able to fix
> things up myself.
> I am using PBS Torque (I don't know whether it is OpenPBS or not; it was
> simply installed by the university centre).
> I think I understand how to run NWChem from the command line (I haven't
> tried it yet), as I work in a similar way when PC GAMESS is involved.
> However, I wish to use ecce, as it is a convenient way to run NWChem and
> analyse the result files. How do you view and analyse an NWChem output
> file if not using ecce? So you think that my PBS Torque is good enough for
> NWChem 5.1, is that true? If yes, which download of NWChem 5.1 should I
> install? If not, what do I need to do to upgrade?
>
> Thanks for the answer once again.
>
> Have a nice day
> Zoran
>
> ----- Original Message -----
> From: "Jörg Saßmannshausen"
> Sent: Saturday, February 02, 2008 4:28 PM
> Subject: Re: [NWCHEM] Problems with parallel running
>
> > Dear Zoran,
> >
> > I am not using ecce but NWChem.
> > How did you submit the job to the queue?
> > If you want to use one node with 4 CPUs, with PBS-type schedulers you
> > would use:
> >
> >   qsub -l nodes=1:ppn=4 ...
> >
> > where nodes is the number of nodes (physical enclosures) and ppn is the
> > number of workers (i.e. CPUs) per node.
> >
> > So if you want to use all 16 CPUs it would be:
> >
> >   qsub -l nodes=4:ppn=4 ...
> >
> > As for MPICH: I would recommend MPICH2 or OpenMPI. MPICH2 uses shared
> > memory for processes within a box/node rather than the TCP stack (as far
> > as I understand it). OpenMPI has the 'processor affinity' option which,
> > at least from what I gathered from other mailing lists, can improve the
> > speed significantly. However, as always, your mileage may vary.
> > Furthermore, for OpenMPI the mpiexec is adapted to the PBS-type
> > scheduler, whereas for MPICH2 you need to install a separate mpiexec.
> > The mpiexec from MPICH/MPICH2 does not work with the PBS scheduler in a
> > straightforward way! Unfortunately, there are many programs around
> > called mpiexec, which adds to the confusion.
> >
> > Let me know if you have any more problems with PBS; at least there I can
> > help ;-)
> >
> > By the way, have you got OpenPBS or really PBS? If you have the former,
> > I would recommend upgrading to Torque, as OpenPBS is no longer
> > supported.
> >
> > As for MPICH: you can have as many different MPICH versions floating
> > around as you like; you only have to make sure that they are in
> > different locations. You might have to work with LD_LIBRARY_PATH at
> > some point so that you pick up the right libraries, but that is a minor
> > issue.
> >
> > Best wishes from Graz
> >
> > Jörg
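To make the qsub example above concrete: a Torque job script for your cluster could look roughly like the sketch below. The script name, the locations of mpiexec and of the nwchem binary, and the input file name are placeholders you will have to adapt, and it assumes an mpiexec that reads the node list from Torque itself (the separate mpiexec mentioned above, or OpenMPI's):

  #!/bin/bash
  #PBS -N nwchem-test
  #PBS -l nodes=4:ppn=4             # 4 boxes with 4 CPUs each = 16 workers
  #PBS -j oe                        # merge stdout and stderr into one file

  cd $PBS_O_WORKDIR                 # directory the job was submitted from
  NPROC=$(wc -l < $PBS_NODEFILE)    # number of CPU slots Torque handed to the job

  /usr/local/bin/mpiexec -n $NPROC \
      /opt/nwchem-5.1/bin/LINUX64/nwchem input.nw > input.out

Submit it with 'qsub job.pbs' and check with 'qstat -n' that the job really spans all four nodes.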
> > On Saturday, 2 February 2008 15:39, zoran matovic wrote:
> >> Dear NWCHEM users,
> >>
> >> Last year I installed NWChem 5.0 and ecce 4.0.2 (as GUI) on our 16-CPU
> >> Beowulf cluster (4 Intel Core 2 Quad machines; Scientific Linux 4.5;
> >> PBS batch system; MPICH). From the beginning the same thing has been
> >> happening to me: when I want to work with more than 4 CPUs, the program
> >> starts either on only one machine (4 CPUs) or on one machine executing
> >> the same number of processes as the number of requested processors.
> >> Now I have installed ecce 4.5.1, but unfortunately I installed it twice
> >> in two different directories, so when I change the adf_set_env.sh path
> >> to the new path and try to start ecce it gives a warning that port 8080
> >> is already occupied. So I need help with two problems: one, how to
> >> force parallel mode on all 4 units (16 CPUs), and two, how to fix the
> >> situation with ecce. And third, I wish to ask which version of the new
> >> NWChem 5.1 I should use to run NWChem in parallel on the given cluster
> >> (note that I use MPICH but not MPICH2), and if I must install MPICH2,
> >> how difficult is it and would it interfere with other parallel programs
> >> that work with MPICH?
> >>
> >> HELP PLEASE.
> >>
> >> Thanks in advance
> >>
> >> Zoran
> >>
> >> web-site http://www.pmf.kg.ac.yu/~zmatovic/index.htm
> >>
> >> Dr Zoran D. Matovic
> >> Professor of Bioinorganic Chemistry
> >> Department of Chemistry
> >> Faculty of Science
> >> University of Kragujevac
> >> 34000 Kragujevac
> >> Serbia
> >>
> >> Tel/Fax + 381 34 336 223 / 381 34 335 040
> >
> > --
> > *************************************************************
> > Jörg Saßmannshausen
> > Institut für chemische Technologie organischer Stoffe
> > TU-Graz
> > Stremayrgasse 16
> > 8010 Graz
> > Austria
> >
> > phone: +43 (0)316 873 8954
> > fax: +43 (0)316 873 4959
> > homepage: http://sassy.formativ.net/
> >
> > Please avoid sending me Word or PowerPoint attachments.
> > See http://www.gnu.org/philosophy/no-word-attachments.html

--
*************************************************************
Jörg Saßmannshausen
Institut für chemische Technologie organischer Stoffe
TU-Graz
Stremayrgasse 16
8010 Graz
Austria

phone: +43 (0)316 873 8954
fax: +43 (0)316 873 4959
homepage: http://sassy.formativ.net/

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html