Symptom
MPI communication between the host and the coprocessor is slow on systems with no InfiniBand* HCA. If you run with I_MPI_DEBUG=2 or higher, you see one of the following messages, which indicates that the TCP fabric has been selected:
[0] MPI startup(): tcp data transfer mode
[0] MPI startup(): shm and tcp data transfer modes
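One way to check for these messages without reading the whole log is to filter the startup output. The following is a minimal sketch, not part of the Intel tooling: the function name check_fabric is hypothetical, and the application name and rank count in the usage comment are placeholders.

```shell
# Classify Intel MPI startup output read from stdin.
# Prints "tcp" if either TCP message from the Symptom section is present,
# otherwise prints "other".
check_fabric() {
    if grep -q 'MPI startup(): .*tcp data transfer mode'; then
        echo tcp
    else
        echo other
    fi
}

# Hypothetical usage (./my_app and the rank count are placeholders):
# I_MPI_DEBUG=2 mpirun -n 2 ./my_app 2>&1 | check_fabric
```

A result of "tcp" means the job fell back to the TCP fabric and the steps under Resolution apply.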
Resolution
Ensure that you are using at least MPSS Version 2.1, and that the OFED stack for MPSS is installed and running. The OFED stack improves host-coprocessor communication even on systems with no InfiniBand* HCA. Refer to the instructions in Configuring Intel® Xeon Phi™ Coprocessors Inside a Cluster for details.
Fabric selection will default to DAPL* over Symmetric Communications Interface (SCIF), using the provider ofa-v2-scif0. There are two exceptions to this:
- If you are using IP addresses to specify hosts
- If you are using a version of the Intel® MPI Library earlier than Version 4.1 Update 1
If either of these applies, set the provider manually. Either export the environment variable before launching:
export I_MPI_DAPL_PROVIDER=ofa-v2-scif0
or add the following to the mpirun arguments:
-genv I_MPI_DAPL_PROVIDER ofa-v2-scif0
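As an illustration, the -genv form can be wrapped in a small helper that also turns on startup debugging, so that the selected fabric is visible in the output. This is a sketch under assumptions: the function name run_with_scif is hypothetical, and the application and rank count in the usage comment are placeholders.

```shell
# Launch an MPI job with the SCIF DAPL provider set explicitly and
# I_MPI_DEBUG=2 enabled so the startup lines report the selected fabric.
# "$@" holds the remaining mpirun arguments (rank count, application, ...).
run_with_scif() {
    mpirun -genv I_MPI_DAPL_PROVIDER ofa-v2-scif0 \
           -genv I_MPI_DEBUG 2 \
           "$@"
}

# Hypothetical usage (./my_app and the rank count are placeholders):
# run_with_scif -n 2 ./my_app
```

If the provider is picked up, the tcp messages listed under Symptom should no longer appear in the startup output.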