Hi all,
I recently ported ISSM to a new HPC cluster using Intel oneAPI 2023 and gcc 10. There were a few hiccups along the way, but it eventually compiled. However, ISSM crashed with a segmentation fault when I tried to run a previously saved model.
I don't have any leads at the moment and would appreciate any suggestions.
Thanks!
Wade
PETSc configuration:
./configure \
  --CFLAGS="-Ofast -diag-disable=10441 -I${MKLROOT}/include" \
  --CXXFLAGS="-Ofast -diag-disable=10441 -I${MKLROOT}/include" \
  --CPPFLAGS="-Ofast -diag-disable=10441 -I${MKLROOT}/include" \
  --FFLAGS="-Ofast -diag-disable=10006" \
  --prefix=$ISSM_DIR/externalpackages/petsc/install \
  --PETSC_DIR=$ISSM_DIR/externalpackages/petsc/src \
  --download-fblaslapack \
  --with-debugging=0 \
  --with-valgrind=0 \
  --with-x=0 \
  --with-ssl=0 \
  --with-shared-libraries=1 \
  --download-metis=1 \
  --download-parmetis=1 \
  --download-mumps=1 \
  --with-blas-lapack-dir="${MKLROOT}" \
  --with-scalapack-include="${MKLROOT}/include" \
  --with-scalapack-lib="-L${MKLROOT}/lib/intel64 -lmkl_scalapack_lp64 -lmkl_cdft_core -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lmkl_blacs_intelmpi_lp64 -liomp5 -lpthread -lm -ldl" \
  --with-make-np=32 \
  --with-shared-libraries=1 \
  --with-c=mpiicc \
  --with-fc=mpiifort \
  --with-cxx=mpiicpc \
  CC=mpiicc CXX=mpiicpc CPP="mpiicc -E" F77=mpiifort F90=mpiifort
Output:
uploading input file and queueing script
launching solution sequence on remote cluster
Ice Sheet System Model (ISSM) version 4.17
(website: http://issm.jpl.nasa.gov contact: issm@jpl.nasa.gov)
[2]PETSC ERROR: [6]PETSC ERROR: [10]PETSC ERROR: [14]PETSC ERROR: [18]PETSC ERROR: [22]PETSC ERROR: [26]PETSC ERROR: [30]PETSC ERROR: [34]PETSC ERROR: [38]PETSC ERROR: [42]PETSC ERROR: [46]PETSC ERROR: [50]PETSC ERROR: [54]PETSC ERROR: [58]PETSC ERROR: [62]PETSC ERROR: ------------------------------------------------------------------------
[34]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[34]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[34]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[34]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[34]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[34]PETSC ERROR: to get more information on the crash.
[34]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[34]PETSC ERROR: Signal received
[34]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[34]PETSC ERROR: Petsc Release Version 3.12.3, Jan, 03, 2020
[34]PETSC ERROR: /lustre/home/1501110242/ISSM_src/ISSM_src_5.5_30p_refrozen_correct/bin/issm.exe on a named l03c19n4 by 1501110242 Thu Mar 30 22:01:59 2023
[34]PETSC ERROR: Configure options --CFLAGS="-Ofast -diag-disable=10441 -I/lustre/software/oneapi/2023.0/mkl/2023.0.0/include" --CXXFLAGS="-Ofast -diag-disable=10441 -I/lustre/software/oneapi/2023.0/mkl/2023.0.0/include" --CPPFLAGS="-Ofast -diag-disable=10441 -I/lustre/software/oneapi/2023.0/mkl/2023.0.0/include" --FFLAGS="-Ofast -diag-disable=10006" --prefix=/lustre/home/1501110242/ISSM_src/ISSM_src_5.5_30p_refrozen_correct/externalpackages/petsc/install --PETSC_DIR=/lustre/home/1501110242/ISSM_src/ISSM_src_5.5_30p_refrozen_correct/externalpackages/petsc/src --download-fblaslapack --with-debugging=0 --with-valgrind=0 --with-x=0 --with-ssl=0 --with-shared-libraries=1 --download-metis=1 --download-parmetis=1 --download-mumps=1 --with-blaslapack-dir=/lustre/software/oneapi/2023.0/mkl/2023.0.0 --with-scalapack-include=/lustre/software/oneapi/2023.0/mkl/2023.0.0/include --with-scalapack-lib="-L/lustre/software/oneapi/2023.0/mkl/2023.0.0/lib/intel64 -lmkl_scalapack_lp64 -lmkl_cdft_core -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lmkl_blacs_intelmpi_lp64 -liomp5 -lpthread -lm -ldl" --with-make-np=32 --with-shared-libraries=1 --with-c=mpiicc --with-fc=mpiifort --with-cxx=mpiicpc CC=mpiicc CXX=mpiicpc CPP="mpiicc -E" FC=mpiifort FC=mpiifort
[34]PETSC ERROR: #1 User provided function() line 0 in unknown file
[42]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[42]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[42]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[42]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[42]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[42]PETSC ERROR: to get more information on the crash.