Portals 4 Reference Implementation ---------------------------------- * About * Building Portals utilizes the Autoconf/Automake/Libtool build system. The standard GNU configure script ane make system is used to build Portals 4. To build: % autoreconf (only once, or when the build system has been updated) % ./configure <options> % make % make check % make install The "make check" step is not strictly necessary, but is a good idea. Options to configure include: --prefix=<DIR> Install implementation in <DIR> * Portals 4 Implementation Notes There is currently one Portals 4 implementation, ib. The ib implementation supports multiple transports, including InfiniBand, UDP (experimental), and SHMEM. - Infiniband: this transport has multiple options, including not actualling using IB: * Infiniband, selected by default, can be disabled with --disable-transport-ib * shared memory, not selected by default, can be enabled with --enable-transport-shmem. It has the option of using KNEM for the large transfer (used if --with-knem=xxxx is present), or to use an internal slower shared memory protocol. Both Infiniband and shmem can be used at the same time. In that case, Portals will use infinband between nodes, and shmem intra-node. If only shmem is used, then portals will not run between nodes (obviously). On top of that, 2 different environments can be generated: * the "fat library" which contains all the portals API. Each process has its own progress thread which leads to the use of too many cores on a machine. This is the default. * the Portals Progres Engine (PPE) and the light library. The PPE is a daemon that regroups the progress threads (one by default) for all the portals process on the host. The portals application links to the "light library" which is only a pass through between the application and the PPE. Very litlle processing is done in the light library; most of it is done inside the PPE. This environment is selected with --enable-ib-ppe. The XPMEM driver must be present, and so it is mutually exclusive with the shmem transport. The daemon must be started prior to running a Portals4 application: <PATH>/p4ppe p4ppe has a set of options. Run with the --help optins to see them. Both the fat and light library are called the same, thus only one type can exist on a machine at the same time; use LD_LIBRARY_PATH and/or LD_PRELOAD to work around. Pre-requisites * libev-4.0.4 (or similar), available at http://software.schmorp.de/pkg/libev.html * The KNEM driver from http://runtime.bordeaux.inria.fr/knem/. Will be used for shared memory if --with-knem=<path> is set. Ensure that /dev/knem is readable/writable by the user running the portals software. * The XPMEM driver from https://code.google.com/p/xpmem/. Mandatory if the PPE is selected, unused otherwise. The current driver has crashing issues and doesn't compile on recent kernel versions. * The ummunotify driver from http://support.systemfabricworks.com/downloads/ummunotify/ummunotify-v2.tar.bz2 (required for correctness with the IB transport). Ensure that /dev/ummunotify is readable/writable by the user running the portals software. WARNING: Not using ummunotify may result in program incorrectness.. Build: Note that the paths to mpi may not be necessary or may need to be changed depending on your linux distribution. * minimal, no shared memory: ./configure * optimized, with shared memory and KNEM support: ./configure --enable-transport-shmem --with-knem=/opt/knem --enable-fast * PPE, no IB transport (so local node only): ./configure --enable-ib-ppe --disable-transport-ib Then type "make". Test: * Type "make check" in the main directory. This will run a series of checks on the local node, with either 1 or 2 ranks. Configure will attempt to figure out the proper way to launch parallel processes. The TEST_RUNNER environment variable can be used to set the proper test runner (such as TEST_RUNNER='yod -np $(NPROCS)'). Environment variables: * PTL_ENABLE_MEM=[0|1] will deactivate/activate the local memory transport, if one is compiled in. * PTL_DEBUG=1 will activate the tracing * PTL_LOG_LEVEL=[0|1|2|3] will set the trace level. * PTL_IFACE_NAME allows for the explicit naming of the network interface for example ib0, eno1, etc. * PTL_DISABLE_MEM_REG_CACHE=[0|1] deactivates/activates the IB memory registration cache. Disabling it no longer requires ummunotify, and the implementation does not keep a registered memory cache. For instance: PTL_LOG_LEVEL=3 PTL_DEBUG=1 yod -n 1 ./spam Building an RPM: * make dist --> this will create portals-4.0.tar.gz * rpmbuild -ta portals-4.0.tar.gz Limitations: * A program may not open more than one NID in its lifetime. Why Portals may not run properly: * The user cannot pin enough memory. By default linux systems allow about 32KB. Check with "ulimit -l". This is for IB only. Edit /etc/security/limits.co onf and change/add the following 2 lines (including the stars): * hard memlock unlimited * soft memlock unlimited (This setting allow every user to pin large amounts of memory. Replace the stars with a user name for a tighter setting). * IB is not up and running (check with "ibv_devinfo") or the default IPoIB interface, ib0, doesn't have an IP address. All interfaces should be on the same IP range. * slow performances can be caused by cpuspeed. Disable it ("service cpuspeed stop"). * uneven performance between repeats can be caused by the threads being on different cores. Pin them. * When shmem is used, under the wrong circumstances, the files /dev/shm/portals4-shmem-* are not removed. If a subsequent run tries to re-use the same name, it will fail and the application may exit.