I encountered the following error while running a high-resolution ice sheet experiment:
call computational core:
iteration 1/10 time [yr]: 1.00 (time step: 1.00)
updating boundary conditions...
computing enthalpy
depth averaging WaterfractionDrainage
extruding WaterfractionDrainageIntegrated from base...
extruding BasalforcingsGroundediceMeltingRate from base...
computing smb
call positive degree day module
computing new velocity
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 1325685 RUNNING AT l04c27n3
= EXIT CODE: 9
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
[proxy:0:1@l06c47n4] HYD_pmcd_pmip_control_cmd_cb (pm/pmiserv/pmip_cb.c:878): assert (!closed) failed
slurmstepd: error: Detected 1 oom_kill event in StepId=1910233.0. Some of the step tasks have been OOM Killed.
srun: error: l04c27n3: task 0: Out Of Memory
srun: Terminating StepId=1910233.0
[proxy:0:1@l06c47n4] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:1@l06c47n4] main (pm/pmiserv/pmip.c:200): demux engine error waiting for event
srun: error: l06c47n4: task 1: Exited with exit code 7
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Killed (signal 9)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
loading results from cluster
[Warning:
============================================================
Binary file 440m_X06_Om.geometry.outbin not found!
This typically happens when the run crashed.
Please check for error messages above or in the outlog
============================================================
]
[> In loadresultsfromdisk (line 16)
In loadresultsfromcluster (line 50)
In solve (line 180)
In runme_45 (line 64)]
[Warning: Variable 'md' was not saved. For variables larger than 2GB use
MAT-file version 7.3 or later.]
[> In runme_45 (line 70)]
The log shows that the job ran out of memory: task 0 on node l04c27n3 was OOM-killed right after "computing new velocity". How should I solve this? Is the problem that my nodes do not have enough memory, or is it caused by inappropriate settings in the model? How can I optimize the run?