Scientific Cluster Deployment & Recovery
Using puppet to simplify cluster management

V. Hendrix1, D. Benjamin2, Y. Yao1

1Lawrence Berkeley National Laboratory, Berkeley, CA, USA 2Duke University, Durham, NC, USA

New Tier3g Site Setup

Before You Start

Connect Network Cables

  • Connect your network as shown in the diagram below:
    (network diagram: Screen_shot_2010-07-02_at_1.55.17_PM.png)
  • For Head, Interactive, NFS
    • connect eth0 to your outside network
    • connect eth1 to the internal network just for tier3
  • For Workers:
    • connect eth0 to your internal network just for tier3
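
After cabling, a quick link check on each node can save time later. This is only a sketch, assuming the standard Scientific Linux networking tools and the interface assignments above:

# On HEAD, INTERACTIVE and NFS nodes: both interfaces should report a link
ethtool eth0 | grep "Link detected"   # outside network
ethtool eth1 | grep "Link detected"   # tier3 internal network

# On WORKER nodes: only eth0 (tier3 internal network) is cabled
ethtool eth0 | grep "Link detected"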

Create USB Key

Prepare Bootable USB Key

Creating a bootable USB Key
  1. Get the disc one ISO image of the Scientific Linux installation CD or DVD.
  2. Create a USB Key
    • mount iso image as a loop device
      mkdir /tmp/loop
      mount -o loop SL.55.051810.DVD.x86_64.disc1.iso /tmp/loop
      
    • Copy the diskboot image file from the mounted ISO to the USB drive (double-check the device name first; see the sketch below):
      dd if=/tmp/loop/images/diskboot.img of=/dev/sdx # where 'x' is the device representing the USB stick
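      
      Before running dd, double-check which device name the USB key was assigned, since writing to the wrong disk is destructive. A minimal check (only a sketch; standard tools):
      dmesg | tail   # the kernel log shows the device name assigned when the key was plugged in
      fdisk -l       # list all disks and their sizes to confirm which one is the USB key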
      

Generate Kickstart Files in the USB Key

Check out the kickstart package:

svn export http://svnweb.cern.ch/guest/atustier3/ks/tag/ks-0.1 ks  

Create the configuration files for your cluster

cd ks
./generateScripts

  1. Basic Configuration
    Customize the parameters.py file mentioned in the output of the previous command. This is where you set your hostnames, IP addresses, etc. The parameters.py file should be self-explanatory, but let me know if it isn't.

    Please note that this process assumes that the INTERACTIVE and WORKER nodes have a starting IP address in the private subnet which increments for each successive node. If this is an issue, you can make the changes in parameters.py.

    vi ./mytier3/src/parameters.py
  2. Generate all kickstart files and other necessary files
    ./generateScripts 
    ls ./mytier3 # you should see all the generated files
    
    cd ./mytier3
  3. Copy kickstart and other necessary files to your USB key
    cp -R /path/to/ks /path/to/usb/mount/ks  
    

Install Physical HEAD Node

The head node is the gateway and it will host the virtual machines for the PUPPET, PROXY and LDAP nodes. The HEAD node also runs an HTTP server used for network installations of the other nodes in the cluster.
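
Once the HEAD node is installed and minimally configured, a quick way to confirm that this HTTP server is answering from the private network is sketched below. The address, port and file name are the ones used later during the puppet VM kickstart (192.168.100.1:8080), so treat them as assumptions and adjust for your site:

curl -sI http://192.168.100.1:8080/nodes.pp   # an HTTP 200 response means the generated config files are being served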

Kickstart Install

Boot into the USB stick and type:

linux ks=hd:sdx/ks/mytier3/kickstart-head.cfg

Note: replace head with the actual name used in your generated kickstart file, normally the short hostname of the head node.

Replace sdx with the device name of your USB disk. Normally it is the one after all of your SATA hard disks; e.g. if you have 4 hard disks, your USB key will be sde.

Click Ignore Drive when prompted, so the installer does not format the USB drive.

Configure HEAD

  1. Copy configuration files from USB key
    mkdir /mnt/usb
    mount /dev/sdx /mnt/usb # where 'x' is the letter of your usb key
    export AT3_CONFIG_DIR=/root/atustier3 # or /root/working if you prefer
    mkdir -p $AT3_CONFIG_DIR
    cp -r /mnt/usb/ks $AT3_CONFIG_DIR
    cd $AT3_CONFIG_DIR
  2. Configure HEAD and install PUPPET

Install VM LDAP

  • Open a terminal on HEAD as root. It is important to use -X, which enables X11 forwarding, so that the "vm_ldap Virt Viewer" can display in your local environment.
    ssh -X head ## where 'head' is the name of your head node.
    cd $AT3_CONFIG_DIR/mytier3
    ./crvm-vmldap.sh  
    
    Once the installation of the LDAP VM finishes, close the X window session or press Ctrl-C on the head node, then register the VM for autostart (a quick check that the VM is running is sketched at the end of this section):
    
    virsh autostart vm_vmldap # puts symlink to the xml file for the VM so that 
                              # the VM can be restarted if the head node is rebooted
  • If you would like to reattach to VM LDAP after the viewing session has been cancelled:
    ssh -X head 
    virt-viewer --connect qemu:///system vm_vmldap
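    
    A quick check that the VM was created and registered for autostart (a sketch; the VM name follows the crvm-vmldap.sh convention above):
    virsh list --all          # vm_vmldap should be listed as running
    virsh dominfo vm_vmldap   # "Autostart: enable" confirms the autostart flag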
    

Install VM PROXY

  • Open a terminal on HEAD as root. It is important to use -X, which enables X11 forwarding, so that the "vm_proxy Virt Viewer" can display in your local environment.
    ssh -X head ## where 'head' is the name of your head node. 
    cd $AT3_CONFIG_DIR/mytier3 
    ./crvm-vmproxy.sh
    
    Once the installation of the PROXY VM finishes, close the X window session or press Ctrl-C on the head node, then register the VM for autostart:
    
    virsh autostart vm_vmproxy # puts symlink to the xml file for the VM so that 
                              # the VM can be restarted if the head node is rebooted
  • If you would like to reattach to VM PROXY after the viewing session has been cancelled:
    ssh -X head
    virt-viewer --connect qemu:///system vm_vmproxy
    

Install and configure the rest of the cluster

  1. NFS node
    The other nodes are clients of the NFS service so this node should be up and configured with puppet before continuing with the other nodes.
  2. Configure HEAD node with puppet by following these instructions: Create certificates for puppet clients
  3. Install INTERACTIVE nodes
  4. Install WORKER nodes

Existing Tier3g Site Setup

Follow these instructions to configure a Tier 3 puppet server and run the puppet clients on the nodes.

On the HEAD Node

ssh -X root@head
yum -y --enablerepo=epel-testing install puppet
yum install python-setuptools
easy_install simplejson
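
A quick sanity check that the packages installed cleanly (a sketch; the names match the install commands above):

rpm -q puppet                                                  # prints the installed puppet version
python -c "import simplejson; print simplejson.__version__"   # confirms the simplejson module is importable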

Generate kickstart files for nodes

The script here creates kickstart files, which are of course unnecessary when retrofitting an existing Tier 3 site with puppet, but it also generates the other configuration files needed for the puppet setup.

Check out the kickstart package in a working directory.

export AT3_CONFIG_DIR=/root/atustier3 # or /root/working if you prefer
mkdir -p $AT3_CONFIG_DIR
cd $AT3_CONFIG_DIR
svn export http://svnweb.cern.ch/guest/atustier3/ks/trunk ks

Create the configuration files for your cluster

cd ks
./generateScripts

  1. Basic Configuration
    Customize the parameters.py file mentioned in the output of the previous command. This is where you set your hostnames, IP addresses, etc. The parameters.py file should be self-explanatory, but let me know if it isn't.

    Please note that this process assumes that the INTERACTIVE and WORKER nodes have a starting IP address in the private subnet which increments for each successive node. If this is an issue, you can make the changes in parameters.py.

    vi ./mytier3/src/parameters.py
  2. Generate all kickstart files and other necessary files
    ./generateScripts 
    ls ./mytier3 # you should see all the generated files
    
    cd ./mytier3
  3. Configure HEAD and install PUPPET

On Any WORKER Node

Install puppet

yum -y --enablerepo=epel-testing install puppet

Now you can run puppet on the WORKER nodes by following Create certificates for puppet clients.

Supplementary Installation Notes

Configure HEAD and install PUPPET

  1. Minimally configure the HEAD node before installing the puppet server
    ./apply-puppet.sh head-init.pp # replace 'head' with the name of your head node
  2. Kickstart installation of puppet VM on HEAD node
    The following script makes these assumptions (a quick pre-flight check of free disk and memory is sketched at the end of this section):
    • There are 60GB of free disk space
    • There is 2GB of free RAM
      ./crvm-puppet.sh
      
      Once the installation of the puppet server finishes, close the X window session or press Ctrl-C on the head node, then register the VM for autostart:
      
      virsh autostart vm_puppet # puts symlink to the xml file for the VM so that 
                                # the VM can be restarted if the head node is rebooted
  3. On the PUPPET Node
    • Start puppetmaster
      service puppetmaster start
    • Check that startup was successful by looking at the logs
      tail -f /var/log/messages
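
The pre-flight check referenced in step 2 might look like the following. It is only a sketch: the image path is an assumption (libvirt's default storage location), so adjust it if your VM images are stored elsewhere:

free -m                        # at least 2GB of RAM should be free for the puppet VM
df -h /var/lib/libvirt/images  # at least 60GB of disk should be free (assumed default libvirt image path)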

Create certificates for puppet clients

  1. On the PUPPET client
    First run puppet on the puppet client. This will create a certificate request with the puppet CA and wait, checking every 30 seconds for the signed certificate.
    puppetd --no-daemonize --test --debug --waitforcert 30
  2. On the PUPPET server
    Now sign the request
    puppetca --list  # this lists the waiting certificate requests
    puppetca --sign puppetclient # where 'puppetclient' is the hostname of the client
  3. On the PUPPET client
    You should see the puppet agent startup after 30 seconds and run successfully. After you have confirmed that the puppet client runs successfully, do the following:
    chkconfig puppet on
    service puppet start
  4. On the PUPPET server
    Turn on the puppetmaster service so that it starts at boot
    chkconfig puppetmaster on

Puppet Server Setup during Kickstart Installation

The following is how the puppet server is set up during the kickstart installation of the puppet VM:

#########################
# Puppet Configuration
cd /etc/puppet
mkdir modules

# Checkout puppet definitions for the whole cluster
svn export http://svnweb.cern.ch/guest/atustier3/puppet/at3moduledef/trunk at3moduledef
svn export http://svnweb.cern.ch/guest/atustier3/puppet/puppetrepo/trunk puppetrepo

# Checkout all modules for use in the WORKER nodes only
#       AUTOMATIC CHECKOUT WITH puppetrepo.py
python puppetrepo/puppetrepo.py --action export --moduledef=/etc/puppet/at3moduledef/modules.def --moduledir=/etc/puppet/modules/ --modulesppfile=/etc/puppet/manifests/modules.pp --loglevel=info

cd /etc/puppet
cp at3moduledef/auth.conf at3moduledef/fileserver.conf ./
cp at3moduledef/site.pp manifests/

## Copy config files over to puppet server from HEAD node
wget -O /etc/puppet/manifests/nodes.pp http://192.168.100.1:8080/nodes.pp
wget -O /etc/puppet/modules/at3_pxe/templates/default.erb http://192.168.100.1:8080/pxelinux.cfg.default

chown -R puppet:puppet /etc/puppet
chmod  -R g+rw /etc/puppet

Updating Puppet Modules

Run these commands to update the puppet modules from the SVN repository:

cd /etc/puppet
svn export --force http://svnweb.cern.ch/guest/atustier3/at3moduledef/trunk at3moduledef

python puppetrepo/puppetrepo.py --action export --moduledef=/etc/puppet/at3moduledef/modules.def --moduledir=/etc/puppet/modules/ --modulesppfile=/etc/puppet/manifests/modules.pp --loglevel=info --svnopts=--force

Checking your configuration files into BNL usatlas-cfg

  1. Get access to the SVN Repository
    • Send the Distinguished Name (DN) of your Grid Certificate to Doug Benjamin <benjamin@phy.duke.edu>
  2. Create a .p12 file for subversion credentialed access
    • Convert cacert.pem, usercert.pem and userkey.pem from your Grid Certificate into a PKCS#12 file
      openssl pkcs12 -export -in usercert.pem -inkey userkey.pem -certfile cacert.pem -name "[Friendly Name]" -out user-cert.p12
  3. These steps are to make sure that a password is never stored in the clear.
    emacs -nw ~/.subversion/servers
    Add the following sections:
    
    [global]
    store-passwords = yes
    store-plaintext-passwords = no
    store-ssl-client-cert-pp-plaintext = no
    
    [groups]
    usatlas = svn.usatlas.bnl.gov
    
    [usatlas]
    ssl-client-cert-file = /path/to/your/user-cert.p12
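
Once the certificate is registered and the servers file is in place, you can verify credentialed access to the repository. This is only a sketch: the path after the host name is a placeholder, so use the actual usatlas-cfg repository URL you are given:

svn list https://svn.usatlas.bnl.gov/path/to/usatlas-cfg   # should prompt for the .p12 passphrase and list the repository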

Set Up First LDAP User "atlasadmin"

#LDAP Configuration atlas tier3

#Login to any node (with a public IP)

cat > tt  <<EOF

dn: dc=mytier3,dc=com
ObjectClass: dcObject
ObjectClass: organization
dc: mytier3
o: mytier3

dn: ou=People,dc=mytier3,dc=com
ou: People
objectClass: top
objectClass: organizationalUnit

dn: ou=Group,dc=mytier3,dc=com
ou: Group
objectClass: top
objectClass: organizationalUnit

dn: cn=ldapusers,ou=Group,dc=mytier3,dc=com
objectClass: posixGroup
objectClass: top
cn: ldapusers
userPassword: {crypt}x
gidNumber: 9000

dn: cn=atlasadmin,ou=People,dc=mytier3,dc=com
cn: atlasadmin
objectClass: posixAccount
objectClass: shadowAccount
objectClass: inetOrgPerson
sn: User
uid: atlasadmin
uidNumber: 1025
gidNumber: 9000
homeDirectory: /export/home/atlasadmin
userPassword: {SSHA}MQstDGq3bTK1Fle+iAa+p4jYgeyl1RIG

EOF

ldapadd -x -D "cn=root,dc=mytier3,dc=com" -c -w abcdefg -f tt -H ldap://ldap/

ldapsearch -x -b 'dc=mytier3,dc=com' '(objectclass=*)' -H ldap://ldap/

mkdir /export/home/atlasadmin;chown atlasadmin:ldapusers /export/home/atlasadmin
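
A quick check that the new account resolves on the client nodes (a sketch, assuming the nodes are already configured against the LDAP server):

getent passwd atlasadmin   # should return the posixAccount entry from LDAP
id atlasadmin              # should report gid 9000 (ldapusers)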

Test if condor works:

Log in as atlasadmin on int1:

cat > simple.c <<EOF
#include <stdio.h>
#include <stdlib.h>   /* atoi */
#include <unistd.h>   /* sleep */

int main(int argc, char **argv)
{
    int sleep_time;
    int input;
    int failure;

    if (argc != 3) {
        printf("Usage: simple <sleep-time> <integer>\n");
        failure = 1;
    } else {
        sleep_time = atoi(argv[1]);
        input = atoi(argv[2]);

        printf("Thinking really hard for %d seconds...\n", sleep_time);
        sleep(sleep_time);
        printf("We calculated: %d\n", input * 2);
        failure = 0;
    }
    return failure;
}
EOF

gcc -o simple simple.c


cat > submit <<EOF
Universe = vanilla
Executable = simple
Arguments = 4 10
Log = simple.log
Output = simple.out
Error = simple.error
Queue
EOF

condor_submit submit
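
After submitting, you can watch the job and check its output once it completes (a sketch; the file names are those defined in the submit file above):

condor_q         # the job should appear here, then leave the queue when done
cat simple.log   # condor's record of the job lifecycle
cat simple.out   # should contain "We calculated: 20"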

Kickstart Install a Node

There are several ways to perform a kickstart installation once you have the HEAD node up and minimally configured. Choose the one that fits your machines.
  1. USB Key
    You may duplicate the USB key to install the rest of the nodes in parallel. Using the previously created USB key, boot the machine you are installing and use the command below, replacing xxx with the short hostname of the node:
    linux ks=hd:sdx/ks/mytier3/kickstart-xxx.cfg
    After reboot, Create certificates for puppet clients

  2. PXE Install NFS, WORKER and INTERACTIVE Nodes
    If your machines are PXE capable, you may enable PXE boot in the BIOS. Make sure that the ethernet cable used for the PXE boot is connected to the private network; otherwise you will not be able to reach the HEAD node, which is the PXE server.
    • Enable and boot via PXE
    • Choose from the menu which node it is; it will then automatically kickstart-install the node.
    • After reboot, Create certificates for puppet clients
    • Check /var/log/messages for error messages


Major updates:
-- ValHendrix - 16-Sep-2011
