Changes between Version 2 and Version 3 of andes


Timestamp:
05/01/24 16:15:07 (12 months ago)
Author:
Mathieu Morlighem
Comment:

--

  • andes

    v2 v3  
    23 23 == Password-less ssh ==
    24 24 
    25    Discovery **officially** suggests using `GSSAPI` for passwordless access, see [[https://services.dartmouth.edu/TDClient/1806/Portal/KB/ArticleDet?ID=89203|here]].
       25 Andes **officially** suggests using `GSSAPI` for passwordless access, see [[https://services.dartmouth.edu/TDClient/1806/Portal/KB/ArticleDet?ID=89203|here]].
    26 26 
    27 27 On your local machine, you will need to enter:
    28 28 {{{
    29 29 kinit -f -l 7d username@KIEWIT.DARTMOUTH.EDU
    30 30 }}}
    31    with your NetID in place of `username` and your NetID password to request a ticket for 7 days (or any time period you need); you can then use {{{ssh discovery}}} without entering a password.
       31 with your NetID in place of `username` and your NetID password to request a ticket for 7 days (or any time period you need); you can then use {{{ssh andes}}} without entering a password.
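For example, assuming the standard Kerberos and OpenSSH client tools on your local machine, you can check that a ticket was issued and point ssh at GSSAPI with something like the following (the `andes` host name and address below are placeholders; use the actual login node):
{{{
#!sh
# list current Kerberos tickets and their expiration times
klist
# optional: enable GSSAPI for this host (appended to ~/.ssh/config)
cat >> ~/.ssh/config <<'EOF'
Host andes
    HostName andes.dartmouth.edu
    GSSAPIAuthentication yes
    GSSAPIDelegateCredentials yes
EOF
}}}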
    32 32 
    33 33 == Environment ==
    34 34 
    35    On Discovery, add the following lines to `~/.bashrc`:
       35 On Andes, add the following lines to `~/.bashrc`:
    36 36 {{{
    37 37 #!sh
     
    88 88 srun --nodes=1 --ntasks-per-node=16 --pty /bin/bash
    89 89 }}}
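Once the interactive session starts, a quick way to confirm where it is running is with the standard SLURM and shell commands, for example:
{{{
#!sh
# show which compute node the interactive shell landed on
hostname
# list your own jobs and their states (replace netid with your NetID)
squeue -u netid
}}}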
    90    == Installing ISSM with CoDiPack (AD) on Discovery ==
       90 == Installing ISSM with CoDiPack (AD) on Andes ==
    91 91 
    92 92 You will need to install the following additional packages:
     
    123 123 }}}
    124 124 
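Once these packages are built, ISSM has to be reconfigured against them. The exact configure flags depend on your ISSM version (check `./configure --help`); assuming CoDiPack and MeDiPack were installed under `$ISSM_DIR/externalpackages`, a sketch could look like:
{{{
#!sh
# reconfigure ISSM with automatic differentiation support
# (paths and flag names are assumptions; verify against ./configure --help)
cd $ISSM_DIR
./configure \
   --prefix=$ISSM_DIR \
   --with-codipack-dir="$ISSM_DIR/externalpackages/codipack/install" \
   --with-medipack-dir="$ISSM_DIR/externalpackages/medipack/install"
}}}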
    125     == discovery_settings.m ==
    126     
    127     You have to add a file in `$ISSM_DIR/src/m` entitled `discovery_settings.m` with your personal settings on your local ISSM install:
        125 == andes_settings.m ==
        126 
        127 You have to add a file in `$ISSM_DIR/src/m` entitled `andes_settings.m` with your personal settings on your local ISSM install:
    128 128 
    129 129 {{{
     
    134 134 }}}
    135 135 
    136     Use your NetID for the `login` and enter your code path and execution path. These settings will be picked up automatically by MATLAB when you do `md.cluster = discovery()`.
    137     
    138     The file system on Discovery is called DartFS (or DartFS-hpc). Your home directory on DartFS is only 50GB; it is better to use the lab folder, which has 1TB:
        136 Use your NetID for the `login` and enter your code path and execution path. These settings will be picked up automatically by MATLAB when you do `md.cluster = andes()`.
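The exact contents depend on your account and lab space; a minimal sketch, assuming the usual ISSM `*_settings.m` fields and placeholder paths, could look like:
{{{
#!m
% andes_settings.m - personal settings picked up by md.cluster=andes()
% (field names follow the usual ISSM *_settings.m pattern; values are placeholders)
cluster.login='netid';
cluster.codepath='/path/to/your/ISSM/bin';
cluster.executionpath='/path/to/your/ISSM/execution';
}}}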
        137 
        138 The file system on Andes is called DartFS (or DartFS-hpc). Your home directory on DartFS is only 50GB; it is better to use the lab folder, which has 1TB:
    139 139 {{{
    140 140 #!sh
     
    149 149 {{{
    150 150 #!m
    151     md.cluster = discovery('numnodes',1,'cpuspernode',8);
        151 md.cluster = andes('numnodes',1,'cpuspernode',8);
    152 152 }}}
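For instance, once the cluster object is set on the model, a run is submitted the usual way (the analysis name here is only an example):
{{{
#!m
% request 1 node with 8 cores, then submit the solve to Andes
md.cluster = andes('numnodes',1,'cpuspernode',8);
md = solve(md,'Stressbalance');
}}}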
    153 153 
     
    156 156 
    157 157 Each node has its own time limit for jobs run from the queue, but the limits tend to be 10 or 30 days.
    158     You can find the time limit of each node by entering on Discovery:
        158 You can find the time limit of each node by entering on Andes:
    159 159 {{{
    160 160 #!sh
    161 161 sinfo
    162 162 }}}
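`sinfo` prints one line per partition and node state; if you only want the time limits, the standard SLURM format options can narrow the output, for example:
{{{
#!sh
# show partition name, time limit, node count and node list only
sinfo -o "%P %l %D %N"
}}}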
    163     If you are running something interactively on Discovery, note that there may be a 10-hour credential limit for the DartFS file system.
        163 If you are running something interactively on Andes, note that there may be a 10-hour credential limit for the DartFS file system.
    164 164 Read more here: [[https://services.dartmouth.edu/TDClient/1806/Portal/KB/ArticleDet?ID=76691]]
    165 165 
    166     Now if you want to check the status of your job and the node it is using, type the following in your bash session on Discovery:
        166 Now if you want to check the status of your job and the node it is using, type the following in your bash session on Andes:
    167 167 {{{
    168 168 #!sh
     
    193 193 If you want to use more than one node (not recommended), the current (temporary) solution is to:\\
    194 194 1) start the job\\
    195     2) log on to Discovery and see which nodes your job is using (see `squeue` usage below)\\
        195 2) log on to Andes and see which nodes your job is using (see `squeue` usage below)\\
    196 196 3) cancel the job (see `scancel` usage below)\\
    197 197 4) find the .queue script for your run and manually edit the start of the mpirun command to look like: