SWAT TODO list, due by 15 April

SWAT Client

  • NEW PATCH: push the memory file bug fix to production. It is already fixed and committed to SVN
  • FIX: stop using the HTTP protocol for sending AMQ messages
  • use YAIM for configuring swat-client. This will be important in future for the transition to the regional model
    • SWAT_STATUS="on" # or "off". Default is "on"
    • SWAT_MESSAGE_BROKER
    • SWAT_MESSAGE_DESTINATION
    • SWAT_PUBKEY_LOCATION
    • ...?
  • DOCUMENTATION: instructions on how to turn on swat for non-lcg batch systems (maybe from here)
  • TESTS: carefully recheck all tests, document them, and send mail to someone for discussion
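
A minimal sketch of how swat-client could read the proposed YAIM variables, assuming they arrive as shell-style KEY=value lines. The function name, the REQUIRED tuple, and the rule that the three other variables are mandatory when SWAT is on are assumptions for illustration; only the variable names and the "on" default come from the list above.

```python
import shlex

# Only SWAT_STATUS has a default stated in the list above.
DEFAULTS = {"SWAT_STATUS": "on"}
# Assumption: these are mandatory whenever SWAT is switched on.
REQUIRED = ("SWAT_MESSAGE_BROKER", "SWAT_MESSAGE_DESTINATION", "SWAT_PUBKEY_LOCATION")


def parse_swat_settings(text):
    """Read SWAT_* KEY=value lines from shell-style configuration text."""
    settings = dict(DEFAULTS)
    for line in text.splitlines():
        # shlex strips quotes and drops trailing "# ..." comments.
        tokens = shlex.split(line, comments=True)
        if len(tokens) != 1 or "=" not in tokens[0]:
            continue
        key, value = tokens[0].split("=", 1)
        if key.startswith("SWAT_"):
            settings[key] = value
    if settings["SWAT_STATUS"] not in ("on", "off"):
        raise ValueError("SWAT_STATUS must be 'on' or 'off'")
    missing = [key for key in REQUIRED if key not in settings]
    if settings["SWAT_STATUS"] == "on" and missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    return settings
```

With SWAT_STATUS="off" the other variables are not checked, so a site can disable the client with a single line.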

SWAT Server

  • prepare RPM for v0.1.0 version
    • New DB Schema:
      • Create SQL scripts for schema upgrade from version 0.0.18 to 0.1.0
      • Changes:
        • New tables for latest results IDs per test table
        • new column for all tests: executionExitCode
        • added WN-CE Many to Many relation
        • new columns for WN: Updated and Active
        • schedule a DB job (or a cron job in Django): go over all WNs and, if now - Updated > 30, set Active = 0, else set Active = 1
    • swat-nagios-publiser is added. It is to be run by a cron job
      • comment script and add proper logging
    • add swat-sync-db to cron (to be run every 5-10 minutes?)
    • DatabaseInserter (test consumer) modified to update "Updated" columns in WNs every time a test for that WN is received, and update the latest ID in WN-test_*_latest table
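
The "Updated"/"Active" bookkeeping above can be sketched with an in-memory SQLite database. This is only an illustration: the table and column names (wn, latest) are made up, the real schema has one latest-ID table per test, and the unit of the threshold 30 is not stated in the list, so it is left abstract here.

```python
import sqlite3

# Same unit as the "updated" timestamps; the unit is an open question.
STALE_AFTER = 30


def make_db():
    """Create a toy in-memory schema (names are illustrative only)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE wn (id INTEGER PRIMARY KEY, updated REAL, active INTEGER)")
    db.execute("CREATE TABLE latest (wn_id INTEGER PRIMARY KEY, test_id INTEGER)")
    return db


def record_test(db, wn_id, test_id, now):
    """What the consumer would do each time a test result for a WN arrives."""
    db.execute("UPDATE wn SET updated = ? WHERE id = ?", (now, wn_id))
    db.execute(
        "INSERT OR REPLACE INTO latest (wn_id, test_id) VALUES (?, ?)",
        (wn_id, test_id),
    )


def refresh_active(db, now):
    """Periodic job: mark a WN inactive when its last update is too old.

    Assumption: a WN that was never updated counts as inactive.
    """
    db.execute(
        "UPDATE wn SET active = CASE "
        "WHEN updated IS NULL OR ? - updated > ? THEN 0 ELSE 1 END",
        (now, STALE_AFTER),
    )
```

Running refresh_active from cron and record_test from the consumer keeps the two concerns separate, so a stalled consumer shows up as WNs going inactive.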

  • prepare code for v0.2.0 to include the following changes:
    • new views for latest results in a treemap
    • upgrade schema:
      • add new tables for tracking history changes. These will contain the IDs of tests for each WN and each test when relevant "chronic" attributes change for the first time. Possible implementations are either one table (test_name, wn_id, test_id) or one table per test. We need to think of an ingenious way to define which attributes are considered chronic
      • maybe instead of using multiple tables for latest id we should create only one with column test_name
      • add "Updated" column to WN-CE relation, so we would know which of these relations are active
      • add new tables for summaries and statistics on test tables for non-chronic attributes
      • add script to create all relevant indexes that will optimize our queries
    • optimize Thomas's queries in order to use the new tables (latestID, history, summary, statistics)
  • prepare Documentation: a comprehensive guide on how to install and maintain the server
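
One possible reading of the history-table idea, sketched for the single-table variant (test_name, wn_id, test_id): a row is recorded only when the chronic attributes of a result differ from those of the previous result. The function name and the way chronic attributes are passed in are assumptions; how to choose them is still the open question noted above.

```python
def history_rows(test_name, wn_id, results, chronic):
    """Return (test_name, wn_id, test_id) for results whose chronic attributes changed.

    `results` is a time-ordered iterable of dicts, each with an "id" key.
    `chronic` is the collection of attribute names treated as chronic.
    The first result is always recorded, since there is nothing to compare it to.
    """
    names = sorted(chronic)
    previous = None
    rows = []
    for result in results:
        current = tuple(result.get(name) for name in names)
        if current != previous:
            rows.append((test_name, wn_id, result["id"]))
            previous = current
    return rows
```

Non-chronic attributes are ignored here on purpose; they would go to the summary and statistics tables instead.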

  • BUG: from time to time (2-3 times per hour), DatabaseInserter tries to insert a test that has already been inserted. We still don't know whether the broker is passing the same message twice, or the server is not dequeuing the message properly after consuming it
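
Until the root cause is known, one way to make the duplicate harmless and visible is an idempotent insert that logs every repeated message. This is a sketch only, using SQLite and a hypothetical test_result table with a unique message_id column; it assumes each message carries a stable identifier, which is not confirmed above.

```python
import logging
import sqlite3

log = logging.getLogger("swat.consumer")  # hypothetical logger name


def make_db():
    """Toy schema: the UNIQUE constraint is what rejects duplicates."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE test_result (message_id TEXT UNIQUE, payload TEXT)")
    return db


def insert_once(db, message_id, payload):
    """Insert a test result unless its message ID is already stored.

    Returns True when a row was inserted, False for a duplicate.
    """
    cursor = db.execute(
        "INSERT OR IGNORE INTO test_result (message_id, payload) VALUES (?, ?)",
        (message_id, payload),
    )
    if cursor.rowcount == 0:
        # The log timestamps help tell broker redelivery from a dequeue problem.
        log.warning("duplicate delivery of message %s", message_id)
        return False
    return True
```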

Monitoring SWAT with Nagios

  • MONITORING: send email on errors and critical errors, and when there has been no input to the database in the last 5-10 minutes. Nagios could test this, or a cron job every 10 minutes could check the time of the latest input
  • MONITORING: idea for nagios probe:
    • install glite-swat-client on nagios node
    • periodically run swat-job-wrapper with --force option (every hour)
    • get the page of that node with the latest results, parse it, and check how old the results are. Raise an alarm if the node's results are more than 3h old, a warning if more than 1h old
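
The age check at the end of the probe could look roughly like this. The 1h and 3h thresholds come from the list above and the return codes are the standard Nagios plugin exit codes; fetching and parsing the results page is left out, and the function name is hypothetical.

```python
from datetime import datetime, timedelta

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_result_age(latest, now, warn=timedelta(hours=1), crit=timedelta(hours=3)):
    """Map the age of the newest result to a Nagios status and message.

    `latest` is the timestamp of the newest result, or None if none was found.
    """
    if latest is None:
        return UNKNOWN, "UNKNOWN: no results found"
    age = now - latest
    if age > crit:
        return CRITICAL, "CRITICAL: latest results are %s old" % age
    if age > warn:
        return WARNING, "WARNING: latest results are %s old" % age
    return OK, "OK: latest results are %s old" % age
```

A probe script would print the message and pass the code to sys.exit, which is how Nagios plugins report their state.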

Miscellanea

  • DOCUMENTATION: better document the tests, the name change, and how it works
  • BUGFIX server: logging on consumer does not work unless server is running
  • NEW FEATURE: allow tests to be marked as obsoleted; raise a warning rather than an error when a test is obsoleted
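
A small sketch of the obsoleted-tests behaviour, assuming the server keeps one set of known test names and one set of obsoleted ones. The function name, the return values, and the logger name are hypothetical.

```python
import logging

log = logging.getLogger("swat.consumer")  # hypothetical logger name


def classify_test(name, known, obsoleted):
    """Return "ok" or "obsoleted"; raise only for tests that are in neither set."""
    if name in known:
        return "ok"
    if name in obsoleted:
        # Obsoleted tests are reported as a warning instead of an error.
        log.warning("received result for obsoleted test %s", name)
        return "obsoleted"
    raise ValueError("unknown test: %s" % name)
```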

DONE

  • BUGFIX client: the SWAT memory file should be world readable, and each user will have their own file. The test needs to be modified to search for the newest memory file, and all of them should be stored as /tmp/glite-swat/user_name_swat.memory
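
For reference, finding the newest memory file can be sketched as below. It assumes "newest" means most recently modified and that every per-user file matches the pattern *_swat.memory; the function name is hypothetical.

```python
import glob
import os


def newest_memory_file(directory="/tmp/glite-swat"):
    """Return the most recently modified *_swat.memory file, or None if there is none."""
    candidates = glob.glob(os.path.join(directory, "*_swat.memory"))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)
```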

-- DanicaStojiljkovic - 14-Jan-2010

Topic attachments

  • Next_steps_in_GCM.docx (r1, 14.9 K, 2010-03-05 15:32, UnknownUser): Original handover from Thomas Low
  • SWAT.pptx (r1, 221.7 K, 2010-03-05 15:42, UnknownUser): SWAT - first handover from Danica
  • swat-portal.pptm (r1, 162.1 K, 2010-03-05 15:36, UnknownUser): Ideas on how swat portal should look like
Topic revision: r8 - 2010-03-11 - unknown
 