Getting an account
more to come...
ssh configuration
You can add the following lines to ~/.ssh/config on your local machine:
Host frontera.tacc.utexas.edu frontera
    HostName frontera.tacc.utexas.edu
    User YOURUSERNAME
and replace YOURUSERNAME with your TACC username.
Make sure to also include the following in your ~/.ssh/config:
Host *
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h-%p
    ControlPersist yes
This allows ssh to reuse an existing connection for multiple sessions. As long as you have an active connection, ssh will not need a password or token to create a new session. You may need to run mkdir ~/.ssh/sockets if this directory does not exist.
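The directory setup can be sketched as follows. This is a minimal example, assuming the socket directory matches the ControlPath shown above; the SOCKET_DIR variable name is only illustrative.

```shell
# Assumption: this path matches the ControlPath entry in ~/.ssh/config.
SOCKET_DIR="$HOME/.ssh/sockets"

# -p creates missing parent directories and succeeds if the directory already exists.
mkdir -p "$SOCKET_DIR"

# Restrict access to your own user, since the sockets give access to open sessions.
chmod 700 "$HOME/.ssh" "$SOCKET_DIR"
```

Once a first connection is open, ssh -O check frontera should report whether a master connection is running for that host.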
Once this is done, you can connect to frontera by simply running:
ssh frontera
Environment
On frontera, add the following lines to ~/.bashrc or ~/.bash_login:
export ISSM_DIR=PATHTOTRUNK
source $ISSM_DIR/etc/environment.sh
module load intel/23.1.0
module load impi/21.9.0
module load petsc/3.21
Log out and log back in to apply this change.
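After logging back in, a quick check along these lines can confirm that the environment was picked up. It is only a sketch: the check_var helper is hypothetical, and it assumes that the loaded modules define the TACC_* variables referenced by the configuration scripts on this page.

```shell
# Hypothetical helper: succeeds if the variable named in $1 is set and non-empty.
check_var() {
    # Indirect lookup of the variable whose name is in $1 (POSIX-compatible).
    eval "value=\${$1:-}"
    [ -n "$value" ]
}

# Variables used by the environment and configure steps on this page.
for name in ISSM_DIR TACC_IMPI_INC TACC_IMPI_LIB TACC_PETSC_DIR TACC_MKL_LIB; do
    if check_var "$name"; then
        echo "ok:      $name"
    else
        echo "missing: $name"
    fi
done
```

Any variable reported as missing suggests that the corresponding line in ~/.bashrc was not executed or that a module failed to load; module list shows which modules are currently loaded.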
Installing ISSM on frontera
frontera will only be used to run the code: you will use your local machine for pre- and post-processing and will never use frontera's Matlab. You can check out ISSM and install the following packages:
- m1qn3
Use the following configuration script (adapt to your needs):
export CC=mpicc
export CXX=mpicxx
export FC=mpifort
./configure \
    --prefix=$ISSM_DIR \
    --with-wrappers=no \
    --with-mpi-include="$TACC_IMPI_INC" \
    --with-mpi-libflags="-L$TACC_IMPI_LIB/release_mt -lmpi -lmpifort -lifcore" \
    --with-petsc-dir="$TACC_PETSC_DIR" \
    --with-petsc-arch=$ISSM_ARCH \
    --with-metis-dir="$TACC_PETSC_DIR" \
    --with-mkl-libflags="-L$TACC_MKL_LIB -qmkl=parallel" \
    --with-mumps-dir="$TACC_PETSC_DIR" \
    --with-scalapack-dir="$TACC_PETSC_DIR" \
    --with-m1qn3-dir="$ISSM_DIR/externalpackages/m1qn3/install" \
    --with-cxxoptflags="-g -O3 -std=c++11 -fp-model=precise" \
    --enable-debugging \
    --enable-development
Installing ISSM with Matlab on frontera
If you want to use frontera to process the model after running it and keep the data on frontera, you can install ISSM with the Matlab interface. Before following the steps below, check with TACC to make sure you have access to Matlab on frontera; see: https://docs.tacc.utexas.edu/software/matlab/
You can check out ISSM and install the following packages:
- triangle
- m1qn3
You will need to use interactive mode to compile ISSM with Matlab: https://docs.tacc.utexas.edu/software/idev/
Use the following configuration script (adapt to your needs):
./configure \
    --prefix=$ISSM_DIR \
    --with-matlab-dir="/home1/apps/matlab/2023a/" \
    --with-triangle-dir=$ISSM_DIR/externalpackages/triangle/install \
    --with-mpi-include="$TACC_IMPI_INC" \
    --with-mpi-libflags="-L$TACC_IMPI_LIB/release_mt -lmpi" \
    --with-petsc-dir="$TACC_PETSC_DIR" \
    --with-petsc-arch=$ISSM_ARCH \
    --with-metis-dir="$TACC_PETSC_DIR/$PETSC_ARCH" \
    --with-mkl-libflags="-L$TACC_MKL_LIB -mkl=parallel" \
    --with-mumps-dir="$TACC_PETSC_DIR/$PETSC_ARCH" \
    --with-scalapack-dir="$TACC_PETSC_DIR/$PETSC_ARCH" \
    --with-m1qn3-dir="$ISSM_DIR/externalpackages/m1qn3/install" \
    --enable-debugging \
    --enable-development
You need to compile ISSM serially with make install.
Remember that frontera is a remote cluster, so when running Matlab, use:
matlab -nodesktop -nosplash -r "addpath $ISSM_DIR/src/m/dev; devpath;"
Before downloading the .outbin file, you will need to set:
md.cluster.name = oshostname()
md.miscellaneous.name = the_file_name_of_outbin
Then, run:
md=loadresultsfromcluster(md, 'runtimename', the_folder_name_in_execution)
frontera_settings.m
You have to add a file in $ISSM_DIR/src/m entitled frontera_settings.m with your personal settings on your local ISSM install:
cluster.login='seroussi';
cluster.codepath='/home1/03729/seroussi/trunk-jpl/bin/';
cluster.executionpath='/work/03729/seroussi/trunk-jpl/execution/';
Use your username for the login and enter your code path and execution path. These settings will be picked up automatically by Matlab when you do md.cluster=frontera()
Note that temporary binary files are created in the executionpath and can be removed once the job is complete. For this reason, you can set the path to be somewhere on the $SCRATCH filesystem, which is unlimited temporary storage on frontera.
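As an illustration, an execution directory on $SCRATCH could be prepared on frontera as sketched below. The directory name issm-execution and the 30-day threshold are hypothetical choices, and the fallback to a temporary directory is there only so the sketch can be tried on a machine where $SCRATCH is not defined.

```shell
# Assumption: $SCRATCH is defined by the system on frontera; otherwise fall back
# to a temporary directory so the sketch still runs elsewhere.
SCRATCH_ROOT="${SCRATCH:-${TMPDIR:-/tmp}}"

# Hypothetical directory name; use the same path in cluster.executionpath.
EXEC_DIR="$SCRATCH_ROOT/issm-execution"
mkdir -p "$EXEC_DIR"

# Print the matching line for frontera_settings.m.
echo "cluster.executionpath='$EXEC_DIR/';"

# List run directories older than 30 days as cleanup candidates (nothing is deleted).
find "$EXEC_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +30
```

The find command only lists directories, so you can review them before removing anything by hand.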
Running jobs on frontera
On frontera, each node has 56 cores and you can use any multiple of 56 for the total number of processors. The more nodes and the longer the requested time, the more you will have to wait in the queue.
The most commonly used queues are normal, development, and flex.
Check more details here: https://docs.tacc.utexas.edu/hpc/frontera/#queues
Note: in order to use the normal queue, you will have to request at least 3 nodes.
So choose your settings wisely:
md.cluster=frontera('numnodes',3);
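The relation between nodes and processors can be checked with a little arithmetic. The sketch below uses the 56 cores per node stated above and 3 nodes as an example; the variable names are illustrative only.

```shell
CORES_PER_NODE=56   # cores per frontera node, as stated above
NUM_NODES=3         # example value: the minimum for the normal queue

# Total number of processors is always a multiple of 56.
TOTAL_CORES=$((NUM_NODES * CORES_PER_NODE))
echo "$NUM_NODES nodes x $CORES_PER_NODE cores = $TOTAL_CORES processors"
```

With these values the total is 168 processors.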
Before you run your job, make sure to have an active open ssh connection to frontera so that you don't need to enter your password.
To manually submit a job on frontera, do:
sbatch job.queue
To check the status of your job and the queue you are using, type the following in your shell session on frontera:
showq -u USERNAME
You can delete your job manually by typing:
scancel JOBID
where JOBID is the ID of your job (indicated in the Matlab session). Matlab also indicates the directory of your job, where you can find the files JOBNAME.outlog and JOBNAME.errlog. The outlog file contains the information that would appear if you were running your job on your local machine, and the errlog file contains the error information in case the job encounters an error.
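A small helper along these lines can make it quicker to look at both log files after a job finishes. It is a sketch only: the function name show_job_logs is hypothetical, and it assumes the two files sit directly in the job directory reported by Matlab.

```shell
# Hypothetical helper: $1 is the job directory, $2 is the job name.
show_job_logs() {
    job_dir=$1
    job_name=$2

    # -s is true only if the file exists and is non-empty.
    if [ -s "$job_dir/$job_name.errlog" ]; then
        echo "Content found in $job_name.errlog:"
        tail -n 20 "$job_dir/$job_name.errlog"
    else
        echo "No errors reported in $job_name.errlog"
    fi

    # Show the end of the regular output if the file exists.
    if [ -f "$job_dir/$job_name.outlog" ]; then
        echo "Last lines of $job_name.outlog:"
        tail -n 20 "$job_dir/$job_name.outlog"
    fi
}
```

For example, show_job_logs /path/to/job/directory JOBNAME prints the last 20 lines of each file; a non-empty errlog does not necessarily mean the run failed, so read its content before drawing conclusions.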
Use DeepXDE
Frontera supports containers through software called apptainer (https://apptainer.org). A precompiled DeepXDE image with the TensorFlow v2 backend is available at docker://chenggongdartmouth/deepxde:v1.2 or at docker://mkrish234/deepxde:v0.3.
You might need to build an apptainer image from the Docker image on Frontera.
First, run pwd on the Frontera login node to find the path where the image will be stored (used as <PATH_TO_LOGIN_NODE> below). Then, allocate a compute node with the following:
idev -m 60 -p rtx -N 1 -n 8
You will need to load the apptainer module as follows:
module load tacc-apptainer
Build the apptainer image from the Docker image as follows
apptainer build <PATH_TO_LOGIN_NODE>/deepxde docker://<DOCKER_IMAGE>
The following is an example of using DeepXDE to run a script ./test.py on Frontera (with GPU node):
#!/bin/bash
#SBATCH -J job_name                 # job name
#SBATCH -o output.%j                # output file named, output.jobID
#SBATCH -e error.%j                 # error file named, error.jobID
#SBATCH -p rtx                      # queue name
#SBATCH -N 1                        # number of nodes requested
#SBATCH --ntasks-per-node 4         # tasks per node
#SBATCH -t 10:00:00                 # time, hh:mm:ss
#SBATCH --mail-user=<EMAIL_ADDRESS>
#SBATCH --mail-type=all
module load tacc-apptainer
stdbuf -i0 -o0 -e0 apptainer exec --nv --bind ~/:/mnt ~/deepxde python -u /mnt/test.py