Konsti
Structural
- May 11, 2021
- 23
Hello,
Does anyone here have high-performance computing experience? I am running a large Abaqus/Explicit model on a cluster using the following command:
bsub -J Abaqus_solve -W 4:00 -n 5 -R "rusage[mem=4096]" abaqus job=3Dframe input=3Dframe.inp cpus=5
The Abaqus memory setting is 80% by default, so I'm not defining it explicitly. Note that because of the LSF queue system I have to request cores and memory from the HPC scheduler first (-n, -R), and then from Abaqus (cpus). Also note that the LSF memory request is per core, so the command above reserves 5 x 4096 MB = 20480 MB in total.
When I use one core (-n 1 and cpus=1) the analysis runs without issues. However, when I use parallel processing I sometimes (about one run in ten) get a memory error: apparently Abaqus is asking for 25% more memory than I have allocated with the bsub option (-R). This is very bizarre and I cannot reproduce it at will; it seems random. I have tried setting memory="90%" on the abaqus command line, but that did not change anything. I will now try the following options:
bsub -J Abaqus_solve -W 4:00 -n 5 -R "rusage[mem=4096]" abaqus job=3Dframe cpus=5 parallel=domain domains=5 dynamic_load_balancing=off mp_mode=threads memory="20480"
Changes from the last attempt: not specifying the input file (it is picked up automatically from the job name), setting domain parallelisation with mp_mode=threads and the number of domains equal to the number of cores, turning off dynamic load balancing (since my analysis has predefined fields, which DLB does not support), and setting memory explicitly to 20 GB (5 cores x 4 GB each).
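To keep the numbers consistent between the LSF request and the Abaqus options, the command above can be sketched as a small wrapper script. This is only an illustration: the variable names (NCORES, MEM_PER_CORE_MB, TOTAL_MEM_MB, CMD) are my own, and the script only prints the command rather than submitting it. It assumes, as stated above, that the LSF memory request is per core.

```shell
#!/bin/sh
# Sketch: derive every core/memory figure from two variables so that
# -n, cpus, domains and memory cannot drift apart between edits.
# Variable names are illustrative, not part of LSF or Abaqus.
NCORES=5                # used for bsub -n, and for abaqus cpus= and domains=
MEM_PER_CORE_MB=4096    # rusage[mem=...] is per core on this cluster

# Total memory reserved by LSF across all cores (5 * 4096 = 20480 MB).
TOTAL_MEM_MB=$((NCORES * MEM_PER_CORE_MB))

# Build the same command as in the post; echo it instead of running it.
CMD="bsub -J Abaqus_solve -W 4:00 -n ${NCORES} -R \"rusage[mem=${MEM_PER_CORE_MB}]\" abaqus job=3Dframe cpus=${NCORES} parallel=domain domains=${NCORES} dynamic_load_balancing=off mp_mode=threads memory=\"${TOTAL_MEM_MB}\""

echo "Total LSF reservation: ${TOTAL_MEM_MB} MB"
echo "$CMD"
```

Changing NCORES or MEM_PER_CORE_MB in one place then updates the scheduler request and the Abaqus options together.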
Am I doing something badly wrong here?
Thank you in advance for your help,
Konstantinos