Project Update: Network mount separation

Executive Summary (Request For Action)

We're working to separate workstation mounts from HPC mounts so that heavy usage under LSF won't affect the responsiveness of your workstation. We pushed out changes today that require every workstation's network mounts to be remounted, so please log out tonight.

We ask users of Linux workstations to log out tonight so that network mounts can be refreshed.

Project details can be found in JIRA ticket ITDEV-1917.

Details for those who want to know

There is a new DNS alias, "ces-workstation", with two new IP addresses behind it.

If you want to make these changes effective right away, and you are a member of the "info" LDAP group, read on and follow these instructions. Otherwise, log out of your workstation at the end of the day today and we'll take care of this for you.

Today we made configuration changes to our "Cluster Export Services" (CES) nodes that made a new pair of IP addresses available. We then deployed configuration changes to workstations to make use of these new IP addresses.
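For those curious what the workstation-side change looks like, a plausible sketch is an automount map entry that points a /gscmnt key at the new alias instead of the old one. This is purely illustrative; the actual maps we deploy may differ:

# hypothetical indirect autofs map entry (e.g. in an auto.gscmnt map) - illustration only
sata130    -rw,intr,tcp,nfsvers=3    ces-workstation:/vol/aggr3/sata130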

The DNS alias "ces-workstation" maps to two new IPs:

-> host ces-workstation
ces-workstation.gsc.wustl.edu has address 10.100.3.207
ces-workstation.gsc.wustl.edu has address 10.100.3.206

These two IPs are served by the host named "ces3".

The rest of the CES cluster, hosts ces1 and ces2, serves the other six IPs:

-> host ces
ces.gsc.wustl.edu has address 10.100.3.200
ces.gsc.wustl.edu has address 10.100.3.201
ces.gsc.wustl.edu has address 10.100.3.202
ces.gsc.wustl.edu has address 10.100.3.203
ces.gsc.wustl.edu has address 10.100.3.204
ces.gsc.wustl.edu has address 10.100.3.205

Check which server your workstation is using for network mounts

A typical workstation might have several NFS mounts present. Until today, they would have used any of the six IPs served by the CES cluster. Here's an example where two data mounts are served by the IP "10.100.3.201":

-> mount -t nfs | egrep "10.100.3.20[0-5]"
gpfs-aggr3.gsc.wustl.edu:/vol/aggr3 on /vol/aggr3 type nfs (rw,tcp,nfsvers=3,addr=10.100.3.201)
ces201:/vol/aggr3/sata130 on /gscmnt/sata130 type nfs (rw,intr,tcp,nfsvers=3,mountproto=tcp,sloppy,addr=10.100.3.201)
ces201:/vol/aggr50/gc5002 on /gscmnt/gc5002 type nfs (rw,intr,tcp,nfsvers=3,mountproto=tcp,sloppy,addr=10.100.3.201)

Note that /gsc and /gscuser go to the home-app cluster, not the CES cluster.
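If you want to double-check that on your own workstation, a filter along these lines will list just those two mounts and show which servers they come from (the server names in the output will be home-app hosts, not CES hosts):

-> mount -t nfs | egrep ' on /(gsc|gscuser) '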

Unmount existing NFS mounts so that they can be remounted with the new IPs

Members of the "info" LDAP group have permission to use "sudo" to run the umount command.

-> groups | grep -q info && echo "YES! I'm in 'info'" || echo "No, I am not in 'info'"
YES! I'm in 'info'

Use sudo to unmount NFS mounts:

-> sudo umount -t nfs -a
umount.nfs: /gscmnt/gc5002: device is busy
umount.nfs: /gscuser: device is busy
umount.nfs: /gsc: device is busy

Note that /gscuser will report "busy" because your active login session (you) has files open in /gscuser, so it will not be unmounted. That's OK; we are not concerned about /gscuser.

If you have programs running with files open on /gsc, it will report "busy" as well, but /gsc is not a concern for this change either.

If any /gscmnt mount point reports "busy", you need to find the program that is using files on that mount point and stop it so the mount can be released.
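If you're not sure which program is holding a mount point busy, and assuming fuser or lsof is installed on your workstation, either will list the processes with open files there (using /gscmnt/gc5002 from the example above):

-> fuser -vm /gscmnt/gc5002
-> lsof /gscmnt/gc5002

Both show the user, PID, and command name of each process using the mount point; exit (or kill) those processes and re-run the umount command.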

Once this command returns nothing, you have properly cleaned the mounts:

-> mount -t nfs | grep ces

Automount will remount a desired mount point from the new location

Now take a look in some directory you want to use:

-> ls -l /gscmnt/sata130
total 0
-rw-r--r-- 1 root root 0 2007-01-23 12:54 DISK_TECHD
drwxrwsr-x 2 root techd 512 2017-02-09 21:29 techd

That mount wasn't there before; automount mounted it for you. Now look to see which server IP it is using:

-> mount -t nfs | grep ces
ces-workstation:/vol/aggr3/sata130 on /gscmnt/sata130 type nfs (rw,intr,tcp,nfsvers=3,mountproto=tcp,sloppy,addr=10.100.3.207)

Here we see the desired new name and IP, ces-workstation on 10.100.3.207.

Now this workstation is using the ces3 server, and load produced by HPC jobs will have less impact on its performance.
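As a final sanity check, you can reuse a variant of the filter from earlier to confirm that no mounts on this workstation are still using the old CES IPs; if this prints nothing, everything has moved to the new addresses:

-> mount -t nfs | egrep "addr=10.100.3.20[0-5]"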