The flexible mesh based engines combine the OpenMP, MPI and GPU computing approaches in a hybrid manner, as described below.
When launching a flexible mesh based engine, the user can specify the number of subdomains and the number of threads per subdomain with which the simulation should be run. Dividing the model domain into a number of smaller subdomains distributes the computational work among a number of processors, as described in the MPI computing approach. The number of threads per subdomain specifies the number of OpenMP threads to use for each subdomain. Hence, this is a hybrid OpenMP + MPI parallelization technique.
Furthermore, if the computer has a supported graphics card, it is possible to use the GPU computing approach. When the "Use GPU" checkbox is selected, the user is presented with a list of the supported graphics cards on the computer. If more than one graphics card is available and ticked, the simulation will run using multiple GPUs. Hence, a hybrid OpenMP + MPI + GPU parallelization technique is used.
Currently, only the computationally intensive hydrodynamic calculations and the k-ε based turbulence calculations are performed on the GPU. The remaining calculations are performed on the CPU for each subdomain and are parallelized using the OpenMP computing approach.
All flexible mesh models, with MIKE 21 Spectral Waves FM and MIKE 3 Wave Model FM being the only exceptions, are capable of using the GPU computing approach. However, only one inundation map covering the whole domain is supported, and the k-ε based turbulence formulation supports only first-order calculations in time and space.
Generally, neither the specified number of subdomains nor the number of threads per subdomain (nor the product of these) should exceed the number of cores available on the PC, since using more processes or threads than there are cores will decrease performance.
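The rule above can be checked before launching a simulation. The following is a minimal sketch, not part of the MIKE software; the function name check_launch_config and its parameters are hypothetical and only illustrate the comparison against the number of cores.

```python
def check_launch_config(subdomains: int, threads_per_subdomain: int, cores: int) -> list[str]:
    """Return warnings for a launch configuration that would exceed the cores.

    Hypothetical helper: it only encodes the rule that neither value,
    nor their product, should exceed the number of available cores.
    """
    if subdomains < 1 or threads_per_subdomain < 1 or cores < 1:
        raise ValueError("all arguments must be positive integers")

    warnings = []
    if subdomains > cores:
        warnings.append(f"{subdomains} subdomains exceed {cores} cores")
    if threads_per_subdomain > cores:
        warnings.append(f"{threads_per_subdomain} threads per subdomain exceed {cores} cores")
    total = subdomains * threads_per_subdomain
    if total > cores:
        warnings.append(f"{subdomains} x {threads_per_subdomain} = {total} threads exceed {cores} cores")
    return warnings


if __name__ == "__main__":
    # 2 subdomains with 4 OpenMP threads each fit exactly on 8 cores.
    print(check_launch_config(2, 4, 8))
    # 4 subdomains with 4 threads each need 16 threads on 8 cores.
    print(check_launch_config(4, 4, 8))
```

An empty list means the configuration stays within the available cores.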
The output is independent of the number of subdomains and the number of threads per subdomain used for the simulation. If more than one subdomain is used, intermediate output data is generated for each subdomain during the simulation. At the end of the simulation these output files are automatically merged into one file covering the entire domain. If only one subdomain is used, no merging is necessary.
For each subdomain a temporary log file containing the processed information for the given subdomain will be created. These log files are named by combining the specified setup name with the number of the given subdomain, e.g. SimA_p2.log. By default they are deleted when the simulation finishes. The log file for the first subdomain has no subdomain number, e.g. SimA.log, and also includes statistics for the overall simulation. This log file remains after the simulation ends.
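The log file naming described above can be illustrated as follows. This is a sketch only: the function name subdomain_log_names is hypothetical, and it assumes that subdomains are numbered from 1 so that the second subdomain yields the suffix _p2, which the text shows only by a single example.

```python
def subdomain_log_names(setup_name: str, subdomains: int) -> list[str]:
    """List the expected log file names for a run with the given subdomains.

    Assumption: subdomains are numbered 1, 2, 3, ... and only subdomains
    2 and up carry a "_p<number>" suffix, as in the example SimA_p2.log.
    """
    if subdomains < 1:
        raise ValueError("at least one subdomain is required")

    # The first subdomain's log has no number and is kept after the run.
    names = [f"{setup_name}.log"]
    # The remaining logs are temporary and deleted by default.
    names += [f"{setup_name}_p{i}.log" for i in range(2, subdomains + 1)]
    return names


if __name__ == "__main__":
    print(subdomain_log_names("SimA", 3))
```

Under this assumption, all entries except the first are the temporary files that are deleted by default when the simulation finishes.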
PLEASE NOTE:
For the GPU computing approach it is recommended that the number of subdomains does not exceed the number of GPUs used for running the simulation. Only if the AD modules are the time-consuming part of the simulation might it be beneficial to specify more subdomains than available GPUs.
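The recommendation in this note can be expressed as a small decision function. This is an illustrative sketch; the name gpu_subdomain_advice, the ad_dominated flag and the returned strings are hypothetical and do not correspond to any setting in the software.

```python
def gpu_subdomain_advice(subdomains: int, gpus: int, ad_dominated: bool = False) -> str:
    """Classify a GPU run according to the recommendation in the note.

    ad_dominated is a hypothetical flag meaning that the AD modules are
    the time-consuming part of the simulation.
    """
    if subdomains < 1 or gpus < 1:
        raise ValueError("subdomains and gpus must be positive integers")

    if subdomains <= gpus:
        return "ok"
    if ad_dominated:
        # More subdomains than GPUs might pay off when AD dominates.
        return "may be beneficial"
    return "not recommended"


if __name__ == "__main__":
    print(gpu_subdomain_advice(2, 2))
    print(gpu_subdomain_advice(4, 2))
    print(gpu_subdomain_advice(4, 2, ad_dominated=True))
```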
PLEASE NOTE:
Some computers have Hyper-Threading. The engines that are part of the flexible mesh modelling system do not benefit from having this enabled. If a Hyper-Threaded computer has 4 cores and thus 8 threads, the engines should be launched with 4 threads.
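The thread count in this note follows from dividing the number of logical processors by the number of hardware threads per core. The sketch below assumes two hardware threads per core, as in the example above; the function name threads_to_launch is hypothetical. Note that Python's os.cpu_count() reports logical processors, so on a Hyper-Threaded machine its result would need this correction.

```python
def threads_to_launch(logical_processors: int, hyper_threading: bool,
                      threads_per_core: int = 2) -> int:
    """Return the number of threads to launch the engines with.

    Assumption: with Hyper-Threading enabled, each physical core exposes
    threads_per_core logical processors (2 in the example from the text).
    """
    if logical_processors < 1 or threads_per_core < 1:
        raise ValueError("arguments must be positive integers")

    if hyper_threading:
        # Use one thread per physical core, never fewer than one.
        return max(1, logical_processors // threads_per_core)
    return logical_processors


if __name__ == "__main__":
    # The example from the note: 4 cores exposed as 8 hardware threads.
    print(threads_to_launch(8, hyper_threading=True))
```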