Parallelization

Force Field X commands are launched using the following syntax:

ffxc [-D<property=value>] <command> [-options] <PDB|XYZ>

Parallelization of the computational work is accomplished using three approaches:

  1. Shared memory parallelization across CPU cores (i.e. multiple threads).
  2. MPI message passing parallelization across compute nodes.
  3. Offload to GPUs via OpenMM.

In many cases, more than one parallelization strategy can be combined. Each approach is discussed below, both separately and in combination.

Shared Memory Parallelization

By default, each FFX process will use all available threads for calculation of the system energy and forces. To explicitly set the number of threads, the syntax is:

ffxc -Dpj.nt=X [-D<property=value>] <command> [-options] <PDB|XYZ>

where X is the number of threads (e.g. 1, 2, etc). Any command that evaluates a system energy and/or forces will usually benefit. The abbreviation pj.nt stands for Parallel Java Number of Threads.
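For example, to limit a single-process calculation to 4 threads (the Energy command and the file protein.pdb here are illustrative placeholders):

ffxc -Dpj.nt=4 Energy protein.pdb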

MPI Message Passing Parallelization

By default, FFX executes only a single process. However, for algorithms such as Many-Body rotamer optimization and Thermodynamics, multiple processes can cooperate. To enable this, a "Scheduler" is needed to launch the processes and to organize communication between them. The command to start the Scheduler is:

ffxc Scheduler -p X > scheduler.log & sleep 30s

where X is the number of threads per process (e.g. 1, 2, etc.). The number of cores on each node must be evenly divisible by X (e.g. if there are 40 cores, X can be 1, 2, 4, 5, 8, 10, 20 or 40). The "sleep" command gives the Scheduler time to start up before the actual FFX command is executed:

ffxc -Dpj.nt=X -Dpj.nn=Y [-D<property=value>] <command> [-options] <PDB|XYZ>

where X must be the same for the Scheduler and the subsequent FFX command. The flag pj.nn stands for Parallel Java Number of Nodes. The number of requested nodes Y must be less than or equal to the number of nodes controlled by the Scheduler.
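Putting the two steps together, a hypothetical run using 10 threads per process across two nodes would look like:

ffxc Scheduler -p 10 > scheduler.log & sleep 30s
ffxc -Dpj.nt=10 -Dpj.nn=2 [-D<property=value>] <command> [-options] <PDB|XYZ>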

For jobs launched by Sun Grid Engine (SGE), the Scheduler will read the file pointed to by the environment variable PE_HOSTFILE to determine the IP addresses of nodes where jobs can be scheduled. For example, if the PE_HOSTFILE indicates that SGE has allocated two 40-core nodes for the job, and the number of threads per process was set to 10, then the FFX job could request anywhere between 1 and 8 processes across the two nodes.
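As a sketch, an SGE submission script for this scenario might resemble the following (the job name, parallel environment name, and slot count are site-specific assumptions):

#!/bin/bash
#$ -N ffx-parallel
#$ -pe mpi 80
ffxc Scheduler -p 10 > scheduler.log & sleep 30s
ffxc -Dpj.nt=10 -Dpj.nn=2 [-D<property=value>] <command> [-options] <PDB|XYZ>

Because SGE sets the PE_HOSTFILE environment variable for parallel environment jobs, the Scheduler discovers the allocated nodes automatically.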

Usage: ffxc Scheduler [-hv] [-e=PE_HOSTFILE] [-m=2G] [-p=all] [-P=20617] [-W=8080]
The Scheduler runs parallel jobs over nodes.
  -e, --hostfile=PE_HOSTFILE  Environment variable that points to the host file.
  -h, --help                  Print this help message.
  -m, --memory=2G             String value of -Xmx to pass to worker nodes.
  -p, --pj.nt=all             The number of processor cores (threads) per process.
  -P, --port=20617            Set the port the front end server will listen on.
  -v, --verbose               Turn on verbose back end logging.
  -W, --webport=8080          Set the port the server will serve a webpage to.

GPU Parallelization

FFX uses the OpenMM API to accelerate energy evaluations, optimization algorithms, dynamics simulations and the calculation of free energy differences. Currently, OpenMM 7.3.1 is included with FFX and depends on CUDA version 9.2. To use OpenMM acceleration on a single node (i.e. in the absence of MPI message passing across multiple nodes), example commands include:

ffxc [-D<property=value>] MinimizeOpenMM [-options] <PDB|XYZ>
ffxc [-D<property=value>] DynamicsOpenMM [-options] <PDB|XYZ>
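For instance, to minimize a structure on a single GPU-equipped machine (the file protein.pdb is an illustrative placeholder):

ffxc MinimizeOpenMM protein.pdb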

FFX can also make use of multiple GPUs in combination with message passing for Many-Body rotamer optimization and for Thermodynamics (i.e. the Monte Carlo OST algorithm). For Many-Body rotamer optimization across nodes, after starting the Scheduler as described above, the command is:

ffxc -Dpj.nn=Y -Dplatform=omm [-D<property=value>] ManyBody [-options] <PDB|XYZ>

If there are multiple GPUs per physical node, an additional flag is needed:

ffxc -Dpj.nn=Y -Dplatform=omm -DnumCudaDevices=Z [-D<property=value>] ManyBody [-options] <PDB|XYZ>

where Z is the number of NVIDIA GPUs per node. Finally, the command for GPU-accelerated Thermodynamics is:

ffxc -Dpj.nn=Y -Dplatform=omm -DnumCudaDevices=Z [-D<property=value>] Thermodynamics --mc [-options] <PDB|XYZ>

Note the use of the --mc flag.
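As a final sketch, consider two 40-core nodes (Y=2), each with two GPUs (Z=2), and a hypothetical input file protein.pdb. Starting the Scheduler with 20 threads per process yields two process slots per node, one per GPU:

ffxc Scheduler -p 20 > scheduler.log & sleep 30s
ffxc -Dpj.nn=2 -Dplatform=omm -DnumCudaDevices=2 Thermodynamics --mc protein.pdb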