Changes between Version 17 and Version 18 of lonestar

Timestamp:: 07/10/24 10:51:13 (11 months ago)
Author:: Cheng Gong
Comment:: --

lonestar

-              v17
+              v18
 where JOBID is the ID of your job (shown in the Matlab session). Matlab also reports the directory of your job, where you can find the files `JOBNAME.outlog` and `JOBNAME.errlog`. The outlog file contains the output that would appear if you were running the job on your local machine, and the errlog file contains the error messages if the job encounters an error.
+== Running PINNICLE on Lonestar6  ==
+Lonestar6 supports containers through a tool called `apptainer` [https://apptainer.org]. A prebuilt Docker image with the TensorFlow v2 backend is available at `docker://chenggongdartmouth/pinnicle_ls6:v0.1`
+You need to build an apptainer image from this Docker image on Lonestar6.
+First, create an interactive session in LS6's `gpu-a100-dev` or `gpu-a100` queue:
+{{{
+idev -t 1:00:00 -N 1 -n 4 -p gpu-a100-dev
+}}}
+Then load the CUDA-related modules and the `apptainer` module as follows:
+{{{
+module load cuda/11.4 cudnn/8.2.4 nccl/2.11.4
+module load tacc-apptainer
+}}}
+Move to your `<YOUR_WORKING_PATH>` directory on Lonestar6, which has the form `/work/xxxxx/yourname/ls6`.
+Build the apptainer image from the Docker image **with** the `--nv` flag:
+{{{
+apptainer build --nv <YOUR_WORKING_PATH>/<YOUR_IMAGE_NAME> docker://chenggongdartmouth/pinnicle_ls6:v0.1
+}}}
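Building the image takes a while, so it can be convenient to skip the build when the image file is already present. The following is a minimal sketch of that idea, not part of the original instructions: the function name `build_image_if_missing` and its two arguments are hypothetical, and the `apptainer build` command is the one shown above.

```shell
#!/bin/bash
# Hypothetical helper (illustrative name): build the image only if it is missing.
build_image_if_missing() {
    local workdir="$1"   # your <YOUR_WORKING_PATH>
    local image="$2"     # your <YOUR_IMAGE_NAME>
    local target="${workdir}/${image}"

    # Skip the build when the image file already exists.
    if [ -f "$target" ]; then
        echo "exists: $target"
        return 0
    fi

    # Same build command as in the instructions above.
    apptainer build --nv "$target" docker://chenggongdartmouth/pinnicle_ls6:v0.1
}
```

Call it as `build_image_if_missing <YOUR_WORKING_PATH> <YOUR_IMAGE_NAME>` from the interactive session, after loading the modules.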
+After building the image, you can open a shell inside the container with:
+{{{
+apptainer shell --nv <YOUR_WORKING_PATH>/<YOUR_IMAGE_NAME>
+}}}
+You can also submit a job to the queue with the following script:
+{{{
+#!/bin/bash
+#SBATCH -J job_name           # job name
+#SBATCH -o output.%j          # output file named, output.jobID
+#SBATCH -e error.%j           # error file named, error.jobID
+#SBATCH -p gpu-a100           # queue name
+#SBATCH -N 1                  # number of nodes requested
+#SBATCH --ntasks-per-node 4   # tasks per node
+#SBATCH -t 10:00:00           # time, hh:mm:ss
+#SBATCH --mail-user=<EMAIL_ADDRESS>
+#SBATCH --mail-type=all
+module load cuda/11.4 cudnn/8.2.4 nccl/2.11.4
+module load tacc-apptainer
+apptainer exec --nv <YOUR_WORKING_PATH>/<YOUR_IMAGE_NAME> python test.py
+}}}
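If you run several scripts with the same image, one option is to generate the batch script from a small shell function instead of editing it by hand. This is only a sketch under assumptions: `write_job_script`, its arguments, the job name `pinnicle_job`, and the output file name are hypothetical, and the mail options are left out. The `#SBATCH` settings and `module load` lines are copied from the script above.

```shell
#!/bin/bash
# Hypothetical helper (illustrative name): write a Slurm batch script for one run.
write_job_script() {
    local image="$1"    # full path to the apptainer image
    local script="$2"   # Python script to run inside the container
    local out="$3"      # file to write the batch script to

    # Unquoted heredoc: ${image} and ${script} are expanded, %j is left for Slurm.
    cat > "$out" <<EOF
#!/bin/bash
#SBATCH -J pinnicle_job
#SBATCH -o output.%j
#SBATCH -e error.%j
#SBATCH -p gpu-a100
#SBATCH -N 1
#SBATCH --ntasks-per-node 4
#SBATCH -t 10:00:00
module load cuda/11.4 cudnn/8.2.4 nccl/2.11.4
module load tacc-apptainer
apptainer exec --nv ${image} python ${script}
EOF
}
```

For example, `write_job_script <YOUR_WORKING_PATH>/<YOUR_IMAGE_NAME> test.py job.slurm` writes the file, which can then be submitted with `sbatch job.slurm`.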