Get confirmation from the user

Get the user's confirmation by mail that the machine can actually be reinstalled.

Put machine into maintenance

  • Log in to lxadm
  • sms set maintenance 'os upgrade' lxbuild023

The difference between standby and maintenance in sms is that standby machines are still monitored and their alarms are still sent to the operators.

Update machine's CDB profile

  • Log in to lxadm

$ cdbop
cdbop> get profiles/profile_lxbuild023.tpl
cdbop> !vi profiles/profile_lxbuild023.tpl

  • Replace pro_type_lxbuild_slc3 with pro_type_lxbuild_x86_64_slc4
  • And commit:

cdbop> update profiles/profile_lxbuild023.tpl
cdbop> commit

Note: the commit can take up to 30 minutes.
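The profile type replacement described above can also be done without an interactive editor. The following is only a sketch: swap_profile_type is a hypothetical helper (not part of cdbop), and it assumes the template has already been fetched into a local file with get.

```shell
# Sketch: swap the profile type in a local copy of the CDB profile template.
# swap_profile_type is a hypothetical helper, not a cdbop command.
swap_profile_type() {
    file=$1
    old=pro_type_lxbuild_slc3
    new=pro_type_lxbuild_x86_64_slc4
    # Stop if the old profile type is not present, so nothing is changed by mistake.
    if ! grep -qF "$old" "$file"; then
        echo "no $old found in $file" >&2
        return 1
    fi
    # Write to a temporary file first so a failed sed cannot truncate the profile.
    sed "s/$old/$new/g" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

Since cdbop runs vi through the ! prefix, the same prefix can presumably be used to run a shell command such as sed; this has not been checked here. The update and commit steps remain the same.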

PrepareInstall on machine

On lxadm, run PrepareInstall lxbuild023

You might get the following message:

[lxadm01] ~ > PrepareInstall lxbuild023
Conflicting values for OS found in CDB ("slc4") and CDBSQL ("slc3")
Note that it can take up to 30 minutes to synchronize CDBSQL with CDB.
[ERR] Not all necessary information could be found, exiting...

In this case, the only option is to wait for CDBSQL to synchronize with CDB, then run PrepareInstall again.
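Waiting and retrying by hand can be replaced by a small retry loop. This is a minimal sketch: retry_until_ok is a hypothetical helper, and it assumes that PrepareInstall returns a non-zero exit status when it prints the [ERR] message, which should be verified before relying on it.

```shell
# retry_until_ok MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: run COMMAND until it succeeds or MAX_TRIES is reached.
retry_until_ok() {
    max_tries=$1
    delay=$2
    shift 2
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # Success: stop retrying.
        if "$@"; then
            return 0
        fi
        echo "attempt $try/$max_tries failed" >&2
        # Do not sleep after the last attempt.
        if [ "$try" -lt "$max_tries" ]; then
            sleep "$delay"
        fi
        try=$((try + 1))
    done
    return 1
}
```

For example, retry_until_ok 7 300 PrepareInstall lxbuild023 would try every 5 minutes, which covers the 30-minute synchronization window.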

Reboot the machine

On the machine itself: shutdown -r now

The reinstallation takes about 1 hour (two reboots are actually performed).

Check the ongoing reinstallation

There are two ways to do this:

  • Try to connect to the machine with ssh

$ ping lxbuild023
$ ssh lxbuild023
$ tail -f ksxxx.log

  • From the console

On lxadm:

$ /afs/cern.ch/group/c3/bin/connect2console.sh lxbuild023
$ tail -f /var/log/ncm/ncd.log

To exit the console, type Ctrl-e c . (the final dot is part of the key sequence). To reboot a machine, type Ctrl-e c l 1 b

Check that the reinstallation went ok

On the machine:

$ lemon-host-check
$ df -k

The output of "df -k" should show a /build partition on the build machines.
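The /build check can be scripted rather than read by eye. The sketch below uses a hypothetical helper, has_mount, which reads df output on standard input. It assumes the POSIX output format (df -kP), where each filesystem stays on one line and the mount point is the last field.

```shell
# has_mount MOUNTPOINT
# Hypothetical helper: read "df -kP" output on stdin and succeed
# if MOUNTPOINT appears as the last field of a line (header skipped).
has_mount() {
    awk -v mp="$1" 'NR > 1 && $NF == mp { found = 1 } END { exit !found }'
}
```

On the machine, this could be used as: df -kP | has_mount /build && echo "/build is mounted"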

Clear the machine from maintenance state

Warning: do not set the machine to the 'production' state; clear it from the 'maintenance' state instead.

That way, the machine:

  • goes back to the state it was in previously (e.g. standby),
  • or goes back to its default state (e.g. production)

$ sms clear maintenance "os upgrade" lxbuild023

Troubleshooting

If the machine does not get reinstalled after the reboot, a possible cause is:

  • the MAC address reported by ifconfig differs from the one registered in LANDB.

To solve this:

  • Add the MAC address reported by ifconfig to LANDB (but keep the other one registered).
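When comparing the two MAC addresses, differences in case or separator (':' versus '-') are easy to mistake for a real mismatch. The sketch below normalizes both before comparing. normalize_mac and same_mac are hypothetical helpers, and the LANDB value is assumed to be copied by hand from its web interface.

```shell
# normalize_mac MAC
# Hypothetical helper: lower-case a MAC address and use ':' as the separator.
normalize_mac() {
    printf '%s\n' "$1" | tr 'A-F' 'a-f' | sed 's/-/:/g'
}

# same_mac MAC1 MAC2
# Succeed if both addresses are identical after normalization.
same_mac() {
    [ "$(normalize_mac "$1")" = "$(normalize_mac "$2")" ]
}
```

On the machine, the local address can be extracted with something like ifconfig eth0 | grep -oiE '([0-9a-f]{2}:){5}[0-9a-f]{2}' | head -n 1 (the interface name eth0 is an assumption), then compared with same_mac against the value registered in LANDB.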

If this does not help, create a Remedy ticket in IT-CM: File > Open > Object List > ITCM:CallManagement

Double-click on Search (in the upper banner); it should change to New. Enter the Category, Host, etc., then save the ticket. Saving is what actually creates it.

In this case, the BIOS settings eventually had to be corrected so that the machine would boot from the network.


-- SophieLemaitre - 13 Dec 2007

Topic revision: r4 - 2007-12-14 - SophieLemaitre