ECDF Analysis Readiness

Worker nodes

  • Complete middleware migration to SL5 (done)
  • Troubleshoot package dependency issues with ATLAS software after the SL5 migration (done)

Jobmanager

Primary CE:

  • Restrict site usage to LHC VOs (done)
  • Prioritise the ATLAS production and pilot groups and nominated ATLAS test users in SGE fairshare (done)
  • Temporary pilot mapping to ATLAS pool account users (done)
  • Pilot mapping and home directory setup for dedicated analysis pilot users (in progress)

Secondary CE for load balancing:

  • Set up and configure additional hardware and open access through external routers (done)
  • Configure lcgsge jobmanager for ECDF access (done)
  • Jobmanager validation tests: internal (done), external (in progress)
  • Pilot / Ganga job submission (done)
  • Verify correct information publishing / verify all services and configuration (in progress)
  • Monitor service balancing

Software

  • Install all required ATLAS releases (done)
  • Troubleshoot previous ATLAS software issues at site (done - to be verified)
  • Clean up legacy software tags (done)
  • Resolve libglobus issues for ATLAS analysis jobs (done)
  • Migrate ATLAS software to dedicated shared filesystem space
  • Configure site to point to Glasgow SQUID cache (in progress; waiting for the Glasgow network team to open the firewall)

Storage

Current (DPM):

  • Reallocate additional storage from other VOs to ATLAS (done)
  • Allocate additional storage to space tokens ATLASMCDISK/DATADISK; set up SCRATCHDISK (done)
  • Subscribe MC to ECDF site (done)
  • Reorder data on pool node disks to balance resources (done)

High Performance (StoRM/GPFS):

  • Install StoRM node (done)
  • Enable access to test GPFS space through existing pool nodes (done)
  • Test in production environment (as second SRM) (done)
  • Test with sample analysis jobs (done)
  • Test under load
  • Expand to larger GPFS area
  • Performance testing against DPM

Analysis Load Testing

  • Submit Athena analysis jobs to site via Ganga (done)
  • Diagnose and repair any initial misconfiguration (done)
  • HammerCloud test 1 (WMS only) (done)
  • Identify principal bottlenecks from first HammerCloud iteration (done)
  • Complete test 1 evaluation task list
  • HammerCloud test 2 (WMS + PanDA @ nominal working rate)
  • Evaluate performance and close further bottlenecks
  • HammerCloud test 3 (WMS + PanDA @ high rate)
  • Certify site for maximum fairshare analysis load
Topic revision: r8 - 2010-01-21 - WahidBhimji