How to run the thermal solver on clusters?

This article explains two methods for running the thermal solver on clusters for thermal-flow and multiphysics analyses.

Introduction

Running jobs on a cluster provides enhanced processing power, efficient resource management, parallel processing capabilities, scalability, remote access, and monitoring. The thermal solver uses a Distributed Memory Parallel (DMP) implementation to run jobs in parallel on multiple machines. You can manage DMP processing through a process scheduler, which organizes the queues and resources needed to perform the analysis in parallel. To run on clusters, you need the Simcenter 3D Thermal/Flow DMP license.

There are two methods to run the thermal solver on a cluster:

  • Using a custom script that executes the thermal solver in parallel mode.
  • Using the dedicated ND argument in the TMG Executive Menu, specifically designed for cluster execution.

Running through a custom script

The most common method to run the thermal solver is through a custom script tailored to the specific scheduler. Typically, when you submit a custom script requesting specific computing resources, it generates a job script with the requested resources and submits it to the job manager. Once the job manager executes the job, it launches the job script. This job script identifies the nodes and CPUs allocated for the job by the job manager and uses this information to generate the parallel configuration file with the correct number of cluster nodes. Finally, it launches the thermal solver using the input file and the generated parallel configuration file.

The TMG thermal-flow solver installation contains simple scripts for both Slurm (Simple Linux Utility for Resource Management) and PBS (Portable Batch System). You can find these scripts in the tmg/if/scripts directory and use them to launch simulations on Linux clusters. These scripts start the TMG Executive Menu with the tmgnx.com executable.

Note:
The following example shows how to use the Slurm script. The PBS script uses different but equivalent commands.
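Below is a minimal sketch of a Slurm job script along these lines. The resource values, the configuration file format, and the tmgnx.com arguments shown here are illustrative assumptions, not the documented syntax; refer to the scripts shipped in tmg/if/scripts for the exact names and options used by your installation.

  #!/bin/bash
  #SBATCH --job-name=thermal_dmp
  #SBATCH --nodes=4                 # number of cluster nodes requested
  #SBATCH --ntasks-per-node=8       # processes per node

  # Build a parallel configuration file from the nodes that Slurm allocated.
  # The "hostname cores" format used here is assumed for illustration only.
  CONFIG=parallel.cfg
  : > "$CONFIG"
  for host in $(scontrol show hostnames "$SLURM_JOB_NODELIST"); do
      echo "$host $SLURM_NTASKS_PER_NODE" >> "$CONFIG"
  done

  # Launch the TMG Executive Menu with the input file and the generated
  # configuration file. These argument names are assumptions.
  tmgnx.com -i model.dat -p "$CONFIG"

You would then submit such a script to the scheduler, for example with sbatch, and Slurm allocates the requested nodes before the job script runs.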

Running through the dedicated ND argument in the TMG Executive Menu

The second method allows you to run jobs through the TMG Executive Menu. In this mode, the monitor submits a series of cluster jobs that execute sequentially for each solver module. Additionally, parallel modules such as VUFAC and ANALYZ can be executed as separate parallel jobs.

The primary difference between the two methods lies in how resources are allocated and how the parallel configuration file is formed. In the first method, the custom script handles resource allocation and forms the parallel configuration file, all within a single job. In contrast, when using the ND argument, the thermal solver creates the configuration file with the node names and launches separate cluster jobs for each TMG module. Running a simulation through a custom script offers more flexibility; however, it requires more scripting skills.
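In both cases, the result is a parallel configuration file that lists the allocated cluster nodes. Its exact format is installation-specific; purely as an assumed illustration, with one node name and core count per line, it might look like this:

  node001 8
  node002 8
  node003 8
  node004 8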

Note:
You cannot run multiphysics thermal-structural analyses in parallel on a cluster using the TMG Executive Menu with the ND argument.