Hello TWikiGuest and welcome to the Oxford ATLAS "useful stuff" page.
The concept of this page is that many PhD students and ATLAS staff have spent many days finding solutions to what seem like minor issues. These problems detract from the actual time spent doing physics and making the world a better place, so we have decided to gather the solutions to common problems together in one place where you can search for an existing solution. This means that if your problem is not listed below and you find a solution, PLEASE add it to this page so that when someone else comes across the same issue, your work can benefit them.
If you can't find the solution to a problem here, just ask one of the current students or staff as someone has probably seen the problem before and most of us are nice people.
For most problems the ATLAS Twiki Portal is a good starting point, and the ATLAS software tutorial Twiki has up-to-date 'getting started' information.
We have created this page without any expandable areas so that you can search for any problem or solution you are concerned with. It may also be useful to check out the pdf version of this page (button at top-right of page) and search there.
Many new PhD students are well versed in particle physics theory (and there are courses in first year to help with this) but have not had much experience with programming before starting the PhD. If this is the case, don't worry. You are in the same boat as most students I know. I strongly recommend taking some time as early as possible to learn a bit about bash, python and C++ as these are the main ATLAS languages. Remember that whatever language you use, good coding practice has been designed for a reason - from my own experience you should design every piece of code thoughtfully, as many programs that start as "just a quick script" can quickly explode into very complex code frameworks and you will wish you had followed the rules from the beginning.
Bash is a shell language which is what you are using when you open a terminal and type "ls" or "echo ...". It is great for navigation and file organisation but is fairly low level so most complex tasks are best avoided in bash. There are many tutorials available online and as always with computing the best idea is to dive in and try things - you have to really know what you are doing before you can break the Oxford system so don't be afraid to try things but be sensible and especially be careful with removing files ("rm" command).
(Regarding the 'rm' command: One suggestion is to put an alias in your .bashrc file. This file is run every single time you login, straight away. If you add this line at the bottom:
alias rm="rm -i"
Then from now on, when you type "rm" you will instead be using "rm -i", which means Linux will ask you if you really want to delete the given file. Any entry other than 'y' or 'yes' will leave the file untouched. My own graduate career was set back by a week due to a misapplication of "rm *.*" prior to my adding this alias. It would have been a lot longer than a week but thankfully there were automatic weekly backups.... )
Python
Python is also a scripting language like bash and is not compiled, so it is great for performing complex file management or string manipulation. It is much higher level than bash or even C++ so commands are much more intuitive. There are many modules that can be loaded into python but the most important for us is probably PyROOT, which allows use of the C++ ROOT classes in python. Be warned that whilst python is easy to use and learn, it has limits such as a lack of type-safety, and loops are significantly slower than in C++.
That said, it is my recommended way to learn to use ROOT and just a few lines can make histograms. Set up the relevant environment:
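On the Oxford machines this usually means sourcing the ATLAS cvmfs setup script and then asking for ROOT - the exact path below is an assumption, so check locally:

```shell
# Assumed standard cvmfs location of ATLASLocalRootBase - verify on your system
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh
localSetupROOT
```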
A nice intro to using TTrees can be found here.
The example script below requires an input ROOT file with a TTree called "physics". If you can download your own, go for it but, if you have not learned about "dq2" yet, you can borrow my example ROOT file at "/home/henderson/scripts/OxfordTwiki/MyRootfile.root".
And make a pyroot script called MyScript.py:
#! /usr/bin/env python
import ROOT
ROOT.gROOT.SetBatch(1)  # don't pop up graphics windows
myRootFile = ROOT.TFile("/home/henderson/scripts/OxfordTwiki/MyRootFile.root", "READ")
myRootTree = myRootFile.Get("physics")
myHistogram = ROOT.TH1D("MyHisto", "Histogram of jet pT with lepton trigger", 100, 0, 500)
for event in myRootTree:
    # only keep events passing the single-electron trigger
    if not event.EF_e24vhi_medium1: continue
    for jet in xrange(len(event.jet_AntiKt4LCTopo_pt)):
        myHistogram.Fill(event.jet_AntiKt4LCTopo_pt[jet] * 0.001)  # MeV -> GeV
        pass
    pass
c1 = ROOT.TCanvas()
c1.SetLogy()
myHistogram.Draw("hist")
c1.Print("MyHist.pdf")
Modify the permissions and make executable:
chmod +x MyScript.py
./MyScript.py
Or simply run in an interactive python shell by typing "python" and copying/pasting the above into the terminal (one line at a time is required for the loop). You can exit python by pressing 'control' and 'd' together.
C++
C++ is the real work-horse language of ATLAS, with the vast majority of code written in it. It is fast, efficient and well documented, although not always as intuitive as python. If you have never worked with object-oriented code before then do some reading on the subject and try to design your code around classes, as it dramatically increases readability for others.
ROOT is the standard ATLAS/CERN tool for histogramming and storing data. It is a collection of C++ classes, such as TTrees and TFiles, that allow the user to manipulate data and make plots. There are many posts on the ROOT forum where you can ask the developers questions or check where others may have had similar issues. Beware of memory leaks, especially in PyROOT - I have found that the memory management systems of python and ROOT fight, and sometimes objects get deleted when you don't expect it. Try the following line if this happens:
ROOT.SetOwnership( <OBJECT NAME>, False )
xAOD Analysis
Jon: ATLAS will be using a new data storage method in Run 2 called xAODs. This comes with its own EDM which is integrated into RootCore's EventLoop package. I've been having a look at how to set up an analysis to run over an xAOD and I'll put anything I've discovered here - any questions or comments please let me know. I don't propose to summarise the xAOD tutorial here as it is fairly clear, but if there is demand for that I can.
Useful xAOD links
Here are the main links that you'll need for reference when setting up an xAOD analysis:
xAOD Software Tutorial,
EventLoop twiki page,
Offline software tutorial indico page
If you come across issues with EDM code doing something strange or if you need to check what you can do with a particular xAOD object then you can find their code in svn.
Common source of memory leaks
ROOT is notoriously bad for memory leaks and EventLoop is no exception. This can produce some very annoying problems as the effects aren't necessarily noticeable on smaller test samples but can break your job on a larger data set. The worst of these that I've come across is connected to making shallow copies of xAOD containers. As described in the tutorials, when you create a shallow copy of a container you retain ownership of this object, so it must be deleted. The best way to do this is to record the created containers into a TStore object. Most of the time you will want to create, process and delete these objects in each execute() loop - in this case there is an appropriate object that EventLoop creates for you, accessible via
xAOD::TStore* store = wk()->xaodStore();
You can then record a container to the store using (the container name here is illustrative; jets_shallowCopy is the pair returned by xAOD::shallowCopyContainer):
store->record( jets_shallowCopy.first, "CalibJets" );
store->record( jets_shallowCopy.second, "CalibJetsAux." );
This store is cleared automatically for you after each iteration of the execute() loop. Not deleting a shallow copy you've made is quite a simple mistake to spot - the danger is that a lot of xAOD tools (e.g. SUSYTools, JetCalibTools, etc) will create shallow copies without telling you. It is still your responsibility to delete these. A good policy to follow seems to be that any pointer you declare should be recorded to either an output stream or this TStore (unless you have good reason not to...). A further comment is that unless you have a very good reason xAOD objects should be collected into the relevant containers (e.g. xAOD::Jet objects should be stored in an xAOD::JetContainer object). This helps ensure that they're correctly removed from memory.
NB: Using the TStore will probably require you to include
#include "xAODRootAccess/TStore.h"
somewhere in your code.
Missing ET
As the MET term has to be rebuilt it isn't as simple to retrieve as many of the other objects in the xAOD. Typically it will require the use of a specialised tool. There is a set of slides in the offline tutorial. Many analysis tools will contain a MET rebuilder tool (for example the SUSYTools package contains such a tool).
Setting up your work environment
Oxford has a great IT website of helpful advice and links (printing, email etc.) that can be found here.
There are two real options for computing where you can work:
1 - The Oxford system has interactive machines (called pplxint8 and pplxint9) which run Scientific Linux 6 (SLC6). There are two older machines (pplxint5/6) which run SLC5, but you should not need these. These machines have many processors and a large amount of RAM to test all your memory leaks on. You can log in from any linux terminal using
ssh -X OXFORD_USERNAME@pplxint8.physics.ox.ac.uk
The "-X" flag sets up X11 forwarding, which means you can view graphics if required rather than simply the text on the terminal. There is also a batch system which is explained below.
2 - LXPLUS is the CERN central SLC6 computing service. All CERN users are able to access this service so it can be a great way to run code that someone else has written in exactly the same environment as them or exchange files with people. Again, you can log in from any terminal with
ssh -X CERN_USERNAME@lxplus.cern.ch
I find it to be quite slow so I would recommend the Oxford system over lxplus.
The centrally maintained "ATLAS Software Tutorial" is a fantastic place to learn about the tools needed for analysis Link and I strongly recommend attending the tutorial week as soon as possible as it becomes hard to change your code after you have started to develop a framework.
NEW June 2014 The standard ATLAS tool for Run 2 analysis is planned to be the RootCore package AnalysisBase. The plan is that this will apply calibrations to all the physics objects using the combined performance (CP) tools, so that the user does not have to worry about vastly over-complicated tools (looking at you, JetETMiss). Check out the tutorial here.
If you are working on lxplus, you can increase your storage space by going here and clicking "increase quota".
Registering with CERN and ATLAS
If you arrived from a different institution and/or a different research group, you'll need to register with either CERN or ATLAS. The latter is especially useful, since many of the reading materials and software packages simply won't be available unless you have registered. The link below provides useful information for various registration scenarios: http://atlassec.web.cern.ch/atlassec/Registration.htm. The general procedure for two common scenarios follows the two steps in 'I plan to come to CERN and work for ATLAS!':
For most 1st year PhD students (who aren't physically working at CERN yet), you only need to email (or fax) ATLAS secretariat with your ATLAS registration form filled and signed by either Tony Weidberg or Ian Shipsey. If you are registering for the first time, also attach a copy of your id/passport in the email. (Step 2)
If you are starting your work on site, talk to Tony Weidberg to start your PRT. Shortly after, you'll receive an email with a link to your registration form. Fill it in, and ask Sue Geddes for a signed copy of the home institute declaration form. (Step 1)
Useful CVMFS addresses
It can be useful to install code locally but if you are in a rush or if you like to know everything was compiled by an expert, you can use the cvmfs installations of most programs. This allows you to use lots of code versions which are pre-compiled on your architecture so please choose wisely. The addresses below will probably be out of date but should give you an idea of where to look.
Oxford does have versions of most software installed locally, but working on the same code as analysers from many institutions means it is beneficial to have everyone working from the same setup and using the same versions. There are some easy commands to set up the necessary tools.
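The standard 'source' line (the path is an assumption - check locally) is:

```shell
# Assumed standard cvmfs path - verify on your system
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh
```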
This loads the commands that are needed to set up the tools you may wish to use from a central CERN-based script. Cvmfs (CERN Virtual Machine File System) Link is available at Oxford but links to servers at CERN, just as the "/afs/cern.ch/" file system does. For example, if you wanted to set up ROOT, just type
localSetupROOT
after the 'source' line above. You can specify a specific version of ROOT or gcc (the C/C++ compiler) if you wish (check out all the options by typing "localSetupROOT --help"), but the command will try to automatically pick one that works for your current environment. ROOT will now be set up for your terminal.
There are many more tools you can load in the same manner, type "localSetup" and hit "tab" twice to see all the options.
ATHENA
Athena is the official ATLAS environment - although I have always found it overly complicated and avoid it as much as possible. You need to use it to generate Monte Carlo events in the official ATLAS manner or to use certain ATLAS tools. Check out this link for some top tips (more for developing athena than running it). It can be set up using:
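Given the ",slc5" and "--64" options described below, the command was presumably of the form (the release number here is illustrative):

```shell
asetup 17.2.7,slc5 --64
```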
The long number "17.2..." defines the Athena release you wish to set up. There is a list of all the available releases, with change-log documentation, here. For pplxint8 or 9 we have to add the ",slc5" as Athena does not like the SLC6 environment just yet, and the "--64" forces the release to be in 64-bit mode.
You can now run commands like "Generate_trf.py" or "Reco_trf.py" interactively. This is the procedure for testing the generation of events before sending jobs to the grid.
Grid Certificate/ATLAS VO Membership
This can be a nightmare if you don't have a guide. Please read through all of the below before starting, and if you have any problems just email/ask someone, as we have all had to go through this rite of passage at some point.
The grid allows massively parallel jobs to be run on many sites around the world and can be an extremely powerful tool if you learn to use it. To submit jobs or download datasets you must have a valid grid certificate.
Certificate
To apply/renew your certificate, use the UK e-Science CertWizard.
This is a relatively simple java wizard to guide you through the application. You are requested to enter at least 2 different passwords throughout this process - personally (despite it being bad password practice) I recommend using the same password at all points here because it is often not clear which password is being requested.
Once your certificate has been approved (it usually takes a day or so), you can export it from the CertWizard (as a .p12 file) and import it into your internet browser (Firefox: Edit -> Preferences -> Advanced -> Certificates -> Import) or move it to pplxint/lxplus. The pplxint procedure is outlined below:
ssh USERNAME@pplxint8.physics.ox.ac.uk
mkdir ~/.globus
cd ~/.globus
< Copy the .p12 file into the globus directory >
openssl pkcs12 -in <YOUR CERT>.p12 -clcerts -nokeys -out $HOME/.globus/usercert.pem
openssl pkcs12 -in <YOUR CERT>.p12 -nocerts -out $HOME/.globus/userkey.pem
chmod 400 userkey.pem
chmod 444 usercert.pem
You should now be able to log into the grid setup and submit jobs / download datasets. Test with:
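The standard check is to create a grid proxy (you may first need to set up the grid middleware, e.g. with localSetupEmi or localSetupDQ2Client):

```shell
voms-proxy-init -voms atlas
```

This should prompt for your grid certificate password and then report a valid proxy.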
You will also need to be a member of the ATLAS Virtual Organisation (VO), click here for details. During the application you must select a representative - nobody knows why or who these people are - just try a few until one is accepted.
I have not found a centralised walkthrough for the UK for this step but use the below links to understand the steps in ATLAS VO registration.
There are 3 registration phases
Phase 1 -> Becoming a candidate - simple, just enter name and email
Phase 2 -> Becoming an applicant - choose which Roles you wish to have, just choose:
/atlas
/atlas/uk
/atlas/lcg1
The other options are for software managers only
Phase 3 -> Becoming a member - Just wait for an email and tick a box
Helpful VO Website
Link
ATLAS France walkthrough (don't sign up for /atlas/fr)
The whole procedure can take a week to setup as you first have to apply for the certificate and then apply for the VO membership with the certificate before you can do anything.
Checking out software
Many software packages in ATLAS are available for download using either 'svn' or 'cmt'. Svn is a general-purpose version control tool that allows users to store code remotely and download specific versions or tags. Cmt Link (Configuration Management Tool) is a similar utility but is more focused on package-based code. On the interactive pplxint Oxford systems your user name may not match your CERN user name, and this can cause issues when trying to download using both of these tools. This can be overcome using the commands below. Your user name on lxplus should always match your CERN user name, so you should not need any changes if working on lxplus.
You can check your current user name with
echo $USER
If this matches your CERN user name then you should not have a problem, otherwise follow the steps below.
Say you wish to check out (download) the awesome ATLAS VBF W framework, simply use:
svn co svn+ssh://CERN_USERNAME@svn.cern.ch/reps/atlasphys/Physics/StandardModel/ElectroWeak/Development/VBFW/Software/Framework/trunk
And replace CERN_USERNAME with your CERN user name. If your current user name matches your CERN user name then the "CERN_USERNAME@" can be omitted.
USEFUL To avoid having to enter your password a great many times when using svn, check out the advice at this link. The main point is that you can simply enter your password once per terminal session using:
kinit CERN_USERNAME@CERN.CH
where the capitalisation of "CERN.CH" matters.
CMT user name change
Thanks to Todd Huffman for battling through this procedure and finding out this information.
In this case we must alter a config file, but this only has to be done once. Open (or create) a text file called "~/.ssh/config" and make sure it looks like this:
host *.cern.ch
user <cern user name> (in my case this would be thuffman)
ForwardX11 yes
GSSAPITrustDns yes
CheckHostIP no
GSSAPIDelegateCredentials yes
I think that it is the first two lines that are key, the rest is just to make Kerberos sharing and ssh work more transparently between Oxford and CERN.
kinit <cernusername>@CERN.CH
aklog
Now go to your working directory on pplxint and set up Athena as above. We must now change the environment variable "SVNUSR"; on a bash shell (echo $SHELL to check) type
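For a bash shell, setting the variable looks like this (CERN_USERNAME is a placeholder for your own CERN user name):

```shell
# Replace CERN_USERNAME with your actual CERN account name
export SVNUSR=CERN_USERNAME
echo $SVNUSR
```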
Now you can download your ATHENA packages to pplxint, this next will get the ATHENA helloworld basic user analysis package
cmt co -r UserAnalysis-00-15-12 PhysicsAnalysis/AnalysisCommon/UserAnalysis
At this point you can drill down into the "PhysicsAnalysis/AnalysisCommon/UserAnalysis/cmt" directory and type "gmake" and it should actually compile!
Once compiled you do
You can set up the system to not require your password to be entered every time you wish to check out/in a package. See this Link for details.
Data Disk Organisation
UPDATE In July/August 2014 all data disk files were transferred to a new faster server but I believe the user interface was maintained so the below is still correct.
For historical reasons there are 3 Oxford ATLAS data disks located at:
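Most likely the following (atlasdata and atlasdata3 are both mentioned on this page; atlasdata2 is inferred from the numbering):

```
/data/atlas/atlasdata/
/data/atlas/atlasdata2/
/data/atlas/atlasdata3/
```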
But they are all identical to the user, so you can simply pick your favourite number and make a directory for yourself there (I chose /data/atlas/atlasdata/henderson/).
We have a central directory where we try to store all the large datasets that are downloaded in order to try to avoid duplication of datasets across different analysis groups. Please download your datasets to
/data/atlas/atlasdata3/D3PD/
And make them readable by everyone ("chmod -R a+r MyNewD3PDFolder/").
ATLAS Formats
Due to the large number of formats ATLAS data or MC goes through, it can be confusing exactly what sort of data you are looking at. An incredibly useful but little-known twiki is the Explanation of SMWZ branches twiki - it is by no means complete but is the best reference I have found documenting what the branch names actually mean. An overview of the event data model for ATLAS can be found here.
Monte Carlo
ATLAS Monte Carlo is generated by event generators like Sherpa or Pythia as single events and then merged with minimum bias events to simulate pile-up. Minimum bias refers to the boring, low-pT events that are the most common events seen in ATLAS, so no bias is put on the selection (no trigger). The table below shows the progression of the MC files before we can access them. The main transformation (the ATLAS term for converting one data format to another) is known as "Reco_trf" or "RecoTf"; details can be found here.
MC Format Name: Notes
Evgen or EVNT: Output of "Generate_trf.py", named "X.pool.root"; pure Monte Carlo, detector independent; can access all of unshowered, parton and hadron (/particle) level
Hits or simul: GEANT4 used to simulate the ATLAS detector; voltages output
Digi: Digitisation of the Hits data
RECO: Reconstruction of the voltages into loose physics objects
ESD: Event Summary Data; not usually permanently stored
AOD: Analysis Object Data; Hits reconstructed into objects and collections (jets, electrons etc.); simplified calibrations applied
D3PD: Derived Physics Data/N-Tuple; slimmed and re-organised AODs; used for most physics analyses
The various steps in making a D3PD are done in as parallel a way as possible, so you will find many datasets with "merge" in the name where multiple files have been added together. You can use a few websites to check on the status of MC production:
You can request a D3PD to be produced from an AOD via the relevant email lists, this page shows the status of D3PD production, change the p1328 tag as needed
You may wish to identify exactly which version of an MC generator created a sample you are using (e.g. Sherpa 1.4.1 or 1.4.3?).
NEW Check out this great page by Jose Garcia to find which generator versions are available in which athena versions.
This is usually done by searching AMI (see above) for the EVNT sample that your D3PD/AOD derives from. E.g. if I wanted to find out which Sherpa version was used for mc12_8TeV.129915.Sherpa_CT10_Wenu2JetsEW1JetQCD15GeV_min_n_tchannels.NTUP_SMWZ... I would search AMI for "mc12_8TeV 129915 EVNT" and click on "details".
This being ATLAS, sometimes the above method does not work. In this case, follow the above to find the Athena tag that the EVNT file was created with, and then follow the advice below from MC guru Claire Gwenlan:
Set up the release:
asetup <ATHENA VERSION FOUND IN EVNT AMI PAGE>
and do:
cmt show versions External/<MC GENERATOR>
which will give you the generator tag. E.g. for the 129915 sample above:
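Based on the svn tag in the link below, the output would have looked something like:

```
External/Sherpa Sherpa-01-04-01-00
```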
in this case, the version is 1.4.1
We can tell this just from the name, since Sherpa tags are nicely numbered according to the version. Some of our External packages are not numbered so nicely, and in those cases you'd have to look in svn at the requirements file associated with that tag to be sure which version:
https://svnweb.cern.ch/trac/atlasoff/browser/External/Sherpa/tags/Sherpa-01-04-01-00/cmt/requirements
Data formats
Data goes through similar transformation steps as the above.
The Oxford batch system allows users to send high-intensity jobs to dedicated cores, where they can run without interruption and consume resources without annoying other users. There are 3 main queues which differ in the allowed time for a job to complete; these are called "normal", "short" (12 hours) and "veryshort" (2 hours), although the names differ between the SLC5 (pplxint5 and 6) and SLC6 (pplxint8 and 9) machines. Submitting to a queue with a shorter time limit will increase the priority of your job. To submit to a particular queue, simply add the time limit you wish to the qsub command with the "-l cput=