Running multiple simulations on a cluster
If you have many simulations to run, you can use bin/run_jobs.py to run them efficiently on a cluster using Dask, either locally or on a SLURM or PBS cluster. The parallel running option requires the dask-jobqueue package to be installed.
Follow steps 1-4 from Run a simulation with MPET for each of the simulations you want to run. Then create a text file in your working directory listing the system parameter configuration files for which you want to run a simulation, one file name per line. For example, if all your parameter files are saved in the configs directory, create configs/parallel_configs.txt containing the lines:

params_system.cfg
params_system_XX.cfg
params_system_YY.cfg
etc.
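The step above can be scripted; a minimal sketch, assuming the configs directory and the example file names shown:

```shell
# Create the configs folder in the working directory and write the
# list of system parameter files to simulate, one file name per line.
mkdir -p configs
cat > configs/parallel_configs.txt <<'EOF'
params_system.cfg
params_system_XX.cfg
params_system_YY.cfg
EOF
```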
Run multiple simulations on a cluster using run_jobs.py. The simplest way to run it is to run the script on the login node. Pass the text file containing the system parameter files (e.g. configs/parallel_configs.txt) and the cluster arguments:

-s : scheduler type. Options: slurm, pbs, and local. Default is slurm.
-t : maximum walltime per job (hh:mm:ss format). Not used with a local cluster.
-n : number of CPU cores and instances of MPET per job. Not used with a local cluster.
-m : maximum memory usage per job (e.g. 2GB). With a local cluster it sets the memory limit per worker process.
-q : queue to use. Not used with a local cluster.
-d : port for the Dask dashboard (default 4096).
--min_jobs : minimum number of jobs to launch. Default is 1. Not used with a local cluster.
--max_jobs : maximum number of jobs to launch. Default is 1. Not used with a local cluster.
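Putting the arguments together, an invocation on a SLURM cluster might look like the following sketch (the queue name mypartition and the numeric values are illustrative, not defaults):

```
python bin/run_jobs.py configs/parallel_configs.txt \
    -s slurm -t 01:00:00 -n 4 -m 2GB -q mypartition \
    --min_jobs 1 --max_jobs 10
```

For a quick test on your own machine, the same script can be run with -s local, in which case the walltime, core-count, queue, and job-count arguments are ignored.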
The simulation output is the same as described above. A separate output folder is created in the history folder for each simulation.