Parasol job control system: Difference between revisions
Revision as of 21:04, 23 March 2018
Introduction
The parasol job control system manages a compute cluster made up of multiple CPUs/cores. It can also be used on a single computer that has multiple CPUs/cores.
It is the cluster control system expected by the U.C. Santa Cruz Genomics Institute toolsets and processing pipelines, such as those for building new genome browsers, computing lastz/chain/net pair-wise alignments, and producing multiple alignments of genome assemblies, among many other bioinformatics tasks.
Computer organization
One computer (or one CPU of one computer) is allocated to the task of managing the compute jobs. This is referred to as the parasol hub machine.
The other computers in the cluster are allocated to the task of running the compute jobs. These computers are referred to as the parasol node machines.
Compute jobs are submitted to the system and managed with parasol commands on the hub machine. The parasol processes running on the hub machine assign the compute jobs to the node machines.
A single machine can be used as both the hub controller and as a node task machine as long as two CPU cores are reserved, one for the operating system and one for the hub processes, with the extra CPU cores allocated to the node task resource pool.
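The core reservation described above comes down to simple arithmetic. The following is a minimal sketch, not part of parasol itself: node_cores is a hypothetical helper name, and the use of getconf to count online cores is an assumption about the host system.

```shell
#!/bin/bash
# Hypothetical helper (not a parasol command): given the total core count,
# print how many cores remain for node jobs on a combined hub/node machine
# after reserving one core for the operating system and one for the hub.
node_cores() {
  local total="$1"
  local reserved=2
  if [ "${total}" -le "${reserved}" ]; then
    # nothing left over for the node task resource pool
    printf "0\n"
  else
    printf "%d\n" "$(( total - reserved ))"
  fi
}

# Assumption: getconf reports the number of online cores on this system.
printf "# cores available for node jobs: %s\n" "$(node_cores "$(getconf _NPROCESSORS_ONLN)")"
```

For example, an 8-core machine used as both hub and node would leave 6 cores for the node task resource pool.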
SSH keys
The hub machine needs to communicate with the node machines via the UDP network protocol, and via ssh commands for setup tasks.
Assuming this is a new machine (such as a cloud machine instance) with no previous ssh operations, the ssh keys can be generated:
echo | ssh-keygen -N "" -t rsa -q
The -N "" argument sets an empty passphrase, and the echo provides an empty answer to the remaining prompt from the ssh-keygen command, which asks where to save the key, so the default location is used. This creates two files in $HOME/.ssh/:
-rw-r--r--. 1 397 Mar 22 17:43 id_rsa.pub
-rw-------. 1 1675 Mar 22 17:43 id_rsa
The existing $HOME/.ssh/authorized_keys probably already has keys added from the cloud machine management system to allow login to the machine instance. To add this newly generated key to the authorized_keys file without disturbing existing contents:
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
These id_rsa key files will be copied to the other machines in this cluster so that this hub machine can access them via ssh. The append just performed lets this hub machine access itself via ssh, which is needed when it is also used as a node machine.
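If the setup is run more than once, the plain append would add the same key again each time. As a sketch only, the hypothetical helper add_key_once below appends the public key only when that exact line is not already present; the function name and its arguments are illustrative and not part of the original procedure.

```shell
#!/bin/bash
# Hypothetical helper: append a public key to an authorized_keys file
# only if that exact line is not already there.
#   add_key_once PUBKEY_FILE AUTHORIZED_KEYS_FILE
add_key_once() {
  local pub="$1"
  local auth="$2"
  # make sure the target exists and is private to the user
  touch "${auth}"
  chmod 600 "${auth}"
  # -f: read the pattern from the key file, -F: fixed string,
  # -x: match the whole line, -q: no output, exit status only
  if ! grep -q -x -F -f "${pub}" "${auth}"; then
    cat "${pub}" >> "${auth}"
  fi
}

# Example call (commented out so that sourcing this file changes nothing):
# add_key_once "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"
```

Existing entries in the file are left untouched, as with the plain append.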
Parasol installation
This example installs everything in a constructed directory hierarchy: /data/...
This directory will be exported as an NFS filesystem for access by the node machines in this cluster.
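The page does not show the export itself at this point. As a rough sketch, an entry for /etc/exports could be generated as below. The helper name exports_line, the client network 10.0.0.0/24 in the example, and the chosen option set (rw,sync,no_subtree_check) are all assumptions for illustration; adjust them to the actual cluster network and policy.

```shell
#!/bin/bash
# Hypothetical helper: print one /etc/exports entry for a directory.
#   exports_line DIRECTORY CLIENTS
# CLIENTS is a host name, address, or network in CIDR form.
# The option list is an assumed, commonly used read-write setup.
exports_line() {
  local dir="$1"
  local clients="$2"
  printf "%s %s(rw,sync,no_subtree_check)\n" "${dir}" "${clients}"
}

# Example, to be run as root on the hub (commented out on purpose;
# the network below is a placeholder):
# exports_line /data 10.0.0.0/24 >> /etc/exports
# exportfs -ra
```

The exportfs -ra call asks the NFS server to re-read /etc/exports after the entry is added.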
These commands need to be run as the root user as verified by the small bash script File:ParasolInstall.sh.txt:
#!/bin/bash
export self=`id -n -u`
if [ "${self}" = "root" ]; then
  printf "# creating /data/ directory hierarchy and installing binaries\n" 1>&2
  mkdir -p /data/bin /data/genomes /data/parasol/nodeInfo
  chmod 777 /data /data/genomes /data/parasol /data/parasol/nodeInfo
  chmod 755 /data/bin
  rsync -a rsync://hgdownload.soe.ucsc.edu/genome/admin/exe/linux.x86_64/ /data/bin/
else
  printf "ERROR: need to run this script as sudo ./parasolInstall.sh\n" 1>&2
fi
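After the install script has run, it may be useful to confirm that the directory hierarchy is in place. The sketch below assumes only the directory names created by the script above; check_layout is a hypothetical helper, not part of the parasol distribution, and it does not verify permissions or the downloaded binaries.

```shell
#!/bin/bash
# Hypothetical helper: verify the directory hierarchy created by the
# install script. Prints one line per missing directory and returns
# non-zero if anything is missing.
#   check_layout ROOT     (ROOT is /data in this example)
check_layout() {
  local root="$1"
  local status=0
  local d
  for d in bin genomes parasol parasol/nodeInfo; do
    if [ ! -d "${root}/${d}" ]; then
      printf "MISSING: %s\n" "${root}/${d}"
      status=1
    fi
  done
  return "${status}"
}

# Example call (commented out so that sourcing this file has no effect):
# check_layout /data && printf "# /data hierarchy looks complete\n"
```

Taking the root directory as an argument allows the same check to be run on a node machine against whatever path the NFS filesystem is mounted at.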