[Xd1-kernel] RE: RapidArray programming interface

Ron Westfall westfall at cray.com
Mon Jun 20 17:03:17 CDT 2005


Hi Troy

I would be surprised if you found any relationship between RapidArray and
OpenIB.  If OpenIB existed at the time we developed RapidArray, it was
probably just emerging.  Certainly the Gen2 stack did not exist at the time.
It is possible that they are independently recreating some of the ideas we
developed for RapidArray.

With respect to RDMA memory registration ... The first thing to understand
is that we turn what is usually a problem of physical memory management into
a standard virtual memory management problem.  We use the Opteron's GART
subsystem to give the RapidArray communications processor a pointer to
virtual memory rather than physical memory.  The GART maps the contiguous
virtual memory addresses of the RDMA buffer used by the communications
processor onto discontiguous physical pages of memory.  The mapping is very
similar, if not identical, to the virtual-physical mapping seen by the MPI
application.  The GART capability is typically used by games and other
graphics software to pass output data efficiently to a graphics card for
rendering without the game designer having to worry about
virtual-physical mapping.
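
As a rough illustration of the remapping idea only (this is not the GART
hardware interface or any RapidArray code), the following user-space C
sketch models a small remapping table.  Every name in it (gart_sim,
gart_sim_map, gart_sim_translate, the 8-page aperture, 4 KB pages) is
hypothetical and chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All sizes and names below are assumptions made for illustration. */
#define SIM_PAGE_SHIFT 12
#define SIM_PAGE_SIZE  ((uint64_t)1 << SIM_PAGE_SHIFT)
#define SIM_GART_PAGES 8

/* Entry i holds the physical frame number backing page i of the
 * contiguous range that the device addresses. */
struct gart_sim {
    uint64_t base;                    /* first device-visible address */
    uint64_t pfn[SIM_GART_PAGES];     /* physical frame per page */
    int      valid[SIM_GART_PAGES];   /* 1 if the entry is mapped */
};

static void gart_sim_init(struct gart_sim *g, uint64_t base)
{
    assert((base & (SIM_PAGE_SIZE - 1)) == 0);  /* base must be page aligned */
    g->base = base;
    for (size_t i = 0; i < SIM_GART_PAGES; i++) {
        g->pfn[i] = 0;
        g->valid[i] = 0;
    }
}

/* Point page idx of the contiguous range at an arbitrary physical frame. */
static int gart_sim_map(struct gart_sim *g, size_t idx, uint64_t pfn)
{
    if (idx >= SIM_GART_PAGES)
        return -1;
    g->pfn[idx] = pfn;
    g->valid[idx] = 1;
    return 0;
}

/* Translate a device-visible address to a physical address.
 * Returns 0 on success, -1 if the address is outside the range or unmapped. */
static int gart_sim_translate(const struct gart_sim *g, uint64_t addr,
                              uint64_t *phys)
{
    if (addr < g->base)
        return -1;
    uint64_t off = addr - g->base;
    uint64_t idx = off >> SIM_PAGE_SHIFT;
    if (idx >= SIM_GART_PAGES || !g->valid[idx])
        return -1;
    *phys = (g->pfn[idx] << SIM_PAGE_SHIFT) | (off & (SIM_PAGE_SIZE - 1));
    return 0;
}
```

Mapping pages 0 and 1 to unrelated frames and then translating two
adjacent addresses shows the point: the device walks one contiguous
range while the data lives in scattered physical pages.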

The following info is relayed from the developers, so I may not have it
quite right ... Having arranged for the communications processor to work
with virtual memory, the same as the MPI application, the RapidArray code
"uses the standard page reference counting mechanism in the Linux kernel
MMU code" to keep the kernel from moving or swapping out the underlying
physical memory.  Hopefully I have conveyed this last bit faithfully enough
to let you find what you need.
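
To make the reference counting idea concrete, here is a toy user-space C
model, not kernel code and not taken from the rapk driver.  The names
page_sim, pin_pages, unpin_pages and try_reclaim are hypothetical.  In a
real Linux driver the references would typically be taken with something
like get_user_pages(), but which call the RapidArray code actually uses
is not stated here, so treat that as an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a kernel page descriptor (hypothetical). */
struct page_sim {
    int refcount;   /* extra references held on behalf of the device */
    int resident;   /* 1 while the page is in physical memory */
};

/* Registration: make each page resident and take a reference on it. */
static void pin_pages(struct page_sim *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        pages[i].resident = 1;
        pages[i].refcount++;
    }
}

/* Deregistration: drop the references taken by pin_pages(). */
static void unpin_pages(struct page_sim *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(pages[i].refcount > 0);
        pages[i].refcount--;
    }
}

/* Reclaim path: a page with outstanding references is left alone.
 * Returns 1 if the page was evicted, 0 if it was skipped. */
static int try_reclaim(struct page_sim *p)
{
    if (p->refcount > 0 || !p->resident)
        return 0;
    p->resident = 0;
    return 1;
}
```

While the buffer is pinned, try_reclaim() refuses to evict any of its
pages, so a device-side mapping to them stays valid; once the references
are dropped, the pages become reclaimable again.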

Ron

Ron Westfall
Cray Canada Inc.

Email: westfall at cray.com
Phone: 604-484-2249
FAX: 604-484-2221




-----Original Message-----
From: Troy Benjegerdes [mailto:troy at scl.ameslab.gov] 
Sent: Monday, June 20, 2005 1:36 PM
To: Ron Westfall
Cc: 'Ricky A. Kendall'; 'Mike Allen'; 'Ed Wahl'; fchism at cray.com; 'Ted
Packwood'; xd1-kernel at scl.ameslab.gov; 'Auji Atwal'
Subject: Re: RapidArray programming interface



>As you probably know by now, you are getting all the source code for 
>RapidArray except for the VHDL logic in the RapidArray communications 
>processor (aka Hobit or Hobbit).  Just in case you have not found it 
>all yet, the RapidArray code appears in the following source RPMs 
>working from the top down:
>
>mpich-1.2.6-20.src.rpm
>mrai-1.2-24.src.rpm
>kernel-2.6.5_H_01_02-25.src.rpm
>
>The mpich RPM contains an rai (RapidArray interface) MPI device in the 
>mpid/rai directory.  The rai MPI device works with the rapl (rap = 
>RapidArray processor) user level library in the mrai RPM.  The rapl 
>library sends large messages via RDMA using the rapk driver in the 
>kernel.  The rapl library interfaces directly with the RapidArray 
>communications processor (hobit) for short messages.  The rapk driver 
>in turn works with the hobit driver in the kernel.  Both the rapk and 
>hobit drivers are located in the kernel RPM in the xd1 drivers 
>directory.
>
>The rapl library is the closest the XD1 gets to having a low level 
>RapidArray API.  While formally documenting this library's interface 
>has not been given a high priority, there is some information in the 
>/usr/local/include/rapl.h header file.
>
>It should be noted that a subset of the rapl API was also added later 
>to the rapk driver to support RapidArray users (e.g. Lustre) inside the 
>kernel.
>  
>
Some of this (and some of the code) seems very similar to the OpenIB 
Gen2 InfiniBand stack.

One thing I'm quite curious about (and has been a continuing issue with 
InfiniBand in general) is RDMA memory registration. How does RAI deal 
with virtual->physical mappings, and locking that memory so the hobbit 
has a valid physical address for the virtual address the application 
process registered?




