This TWiki is obsolete. Please use https://phat-srimanobhas.gitbook.io/cu-e-science/.

CU eScience cluster

About cluster

The CU eScience cluster is part of the National eScience Infrastructure Consortium. Our cluster serves both CU and external users. Currently, we use CentOS 7 as the operating system on both the frontend and the worker nodes, and we use Slurm as the job scheduler. Jobs can be run in interactive and batch modes. Examples of how to use the cluster are given in the section Instructions.

Become a user

To become a user, please fill in the registration form.

Login to a cluster

To log in to a frontend (see details below), use an ssh client and type Command Line Interface (CLI) commands to tell the cluster what to do. An ssh client is available on Linux, macOS (Terminal) and MS Windows 10 (PowerShell). For older Microsoft Windows machines, the PuTTY ssh client is recommended.

ssh your_user_name@escience0.sc.chula.ac.th
Note that escience0 is a load-balanced address that distributes logins across the frontend machines; it is fine to use for any job submission. However, if you would like to compile your code against specific hardware, e.g. a GPU, you should log in to a specific machine. We do not recommend running your code on the login machines unless necessary; jobs that consume a lot of resources on the frontend nodes will be killed. The available frontends are listed below, followed by a short login example.
escience1.sc.chula.ac.th: small frontend machine, for job submission and monitoring only.
escience2.sc.chula.ac.th: small frontend machine, for job submission and monitoring only.
escience3.sc.chula.ac.th: frontend with a Tesla T4 GPU.
escience4.sc.chula.ac.th: high-CPU and high-memory machine.
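
For example, to build and test GPU code you can log in to the GPU frontend directly. This is only a minimal sketch: replace your_user_name with your own account, and the nvidia-smi check assumes the NVIDIA driver tools are on the path of that machine.

ssh your_user_name@escience3.sc.chula.ac.th
nvidia-smi    # confirm the Tesla T4 is visible before compiling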

Our resources

CPU and Memory of nodes

Note. Hyperthreading (HT) is off by default.

Machine | CPUs/node | Memory (GB)/node | No. of nodes | Note
Frontend
Lenovo System X 3550 M5 | 20 (Intel Xeon CPU E5-2640 v4 2.40GHz) with HT on (40 threads) | 32 | 1 | escience1.sc.chula.ac.th
Lenovo System X 3550 M5 | 16 (Intel Xeon CPU E5-2620 v4 2.10GHz) | 64 | 1 | escience2.sc.chula.ac.th
Lenovo SR630 with 1x Tesla T4 GPU | 32 (Intel Xeon Gold 5218 2.3GHz) | 256 (8 x 32GB TruDDR4 2933MHz) | 1 | escience3.sc.chula.ac.th
Lenovo SR850 | 88 (Intel Xeon Gold 6152 2.10GHz) | 324 | 1 | escience4.sc.chula.ac.th
Worker: Slurm
Lenovo SR630 with 1x Tesla T4 GPU | 32 (Intel Xeon Gold 5218 2.3GHz) | 256 (8 x 32GB TruDDR4 2933MHz) | 7 | HPC, HTC
Lenovo x3850 X6 | 80 (Intel Xeon E7-8870 v4 2.1GHz) | 512 | 1 | HPC, HTC
IBM Blade H | 16 (Intel Xeon 2.0GHz) | 32 | 5 | HTC
IBM iDataPlex DX360M4 | 16 | 128 | 2 | Mathematica
Worker: Kubernetes
Dell PowerEdge R740 | - | - | 3 | Department of Computer Engineering, CU
Lenovo SR630 with 1x Tesla T4 GPU | 32 (2 x Intel Xeon Gold 5218 16C 2.3GHz) | 256 (8 x 32GB TruDDR4 2933MHz) | 2 | -
Total | 604 CPUs | - | - | -
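
Once logged in, you can cross-check these numbers with standard Slurm commands (the exact output format depends on the local configuration):

sinfo -N -l           # one line per node, including CPUs and memory
scontrol show nodes   # detailed state of every node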

Storage

Currently, we use an IBM Storwize 3700 with a total capacity of 160 TiB after RAID6 + spare. Note that backing up your files is your own responsibility.
File system | Disk space limit | Note
$HOME | 100 GB | -
/work/project/quantum | 20 TB | for the quantum group
/work/project/cms | 50 TB | for the CMS/CERN group
/work/project/physics | 20 TB | for Physics CU staff and students
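
You can check how much space you are using with standard tools, for example (the project path is taken from the table above; adjust it to your own project):

du -sh $HOME                   # total size of your home directory
df -h /work/project/physics    # capacity and free space on a project file system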


SLURM

Running Jobs by SLURM Script

See examples and SLURM commands at https://thaisc.io/en/running-jobs-by-slurm-script/. A minimal batch script sketch follows.
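
The script below is a minimal sketch only, not an official template: the job name, resources and file names are placeholders, and the QoS is taken from the table in the next section. Adjust everything to your own job.

#!/bin/bash
#SBATCH --job-name=hello          # job name shown in squeue
#SBATCH --qos=cu_student          # pick a QoS you are a member of (see table below)
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=2G
#SBATCH --time=01:00:00           # walltime (hh:mm:ss)
#SBATCH --output=slurm-%j.out     # %j is replaced by the job ID

echo "Running on $(hostname)"

Save it as, for example, hello.slurm, submit it with 'sbatch hello.slurm', and monitor it with 'squeue -u $USER'.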

QoS information

The QoS for each user is assigned when the account is created and should match the type of jobs you want to run. The limits below apply per user.

QoS name | Max nodes | Max jobs | Max CPU | Max memory (GB) | Max walltime limit (day-hh:mm:ss) | Member | Note
cu_hpc | 8 | 4 | 128 | 512 | 14-00:00:00 | g_cu_hpc | -
cu_htc | 1 | 16 | 16 | 32 | 3-00:00:00 | g_cu_htc | -
cu_long | 4 | 2 | 128 | 512 | 30-00:00:00 | g_cu_hpc, g_cu_htc | -
cu_student | 2 | 4 | 16 | 64 | 7-00:00:00 | g_cu_student | -
escience | 2 | 4 | 16 | 64 | 7-00:00:00 | g_escience | -
cu_math | 1 | 2 | 16 | 120 | 30-00:00:00 | g_cu_math | For Mathematica
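
To check which QoS your account is associated with, and to submit against a specific QoS, you can use standard Slurm commands (my_job.slurm is a placeholder script name):

sacctmgr show assoc user=$USER format=User,QOS    # list the QoS names attached to your account
sbatch --qos=cu_htc my_job.slurm                  # submit a job under the cu_htc QoS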

Compiler/Software

We currently provide the basic compilers and software that come with CentOS 7. Additional software is installed in the /work/app/ directory. You can also install your own software under your home or project directory.

Name | Version | OS | Note
Basic compiler, software
GCC | 4.8.5 | CentOS 7 | -
Python | 2.7.5 | CentOS 7 | -
Python | 3.6.8 | CentOS 7 | python3
python-matplotlib | 1.2.0 | CentOS 7 | -
R | 3.6.0 | CentOS 7 | -
Specific software
CMSSW | 10_2, 10_6, 11_3, 12_0 | SLC7 (CentOS 7) | source /work/app/cms/cmsset_default.(c)sh
Geant4 | - | - | In preparation.
Delphes | 3.4.2 | - | In preparation.
Mathematica | 10.4.1 | CentOS 7 | In preparation. Special permission is needed; please contact the CU site admin (srimanob@mail.cern.ch).
ROOT | 6.22 | CentOS 7 | source /work/app/root/recent/bin/thisroot.(c)sh
No plan to install on the new cluster, please request
Abinit | 7 | CentOS 6 | -
AutoDock | 4.2 | CentOS 6 | -
bioperl | - | CentOS 6 | -
Egglib | 3.7 | CentOS 6 | Plan to install; needs a Python 2.7 environment (scl enable python27 bash)
Egglib | 3.7 | CentOS 7 | Plan to install on the new pilot worker node
GROMACS | 5 | CentOS 6 | -
 | 4.9.3 | CentOS 6 | scl enable devtoolset-3 bash (or tcsh)
 | 4.9.3 | CentOS 7 | Plan to install on the new pilot worker node
Elmer | - | CentOS 6 | Plan to install
Madgraph | 5 | - | -
Quantum Espresso | 5 | CentOS 6 | -
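
To use the pre-installed software, source the setup scripts listed in the table above from your shell. For example, in a bash login (the .csh variants are for tcsh users; root-config is the standard ROOT helper for checking which version is picked up):

source /work/app/root/recent/bin/thisroot.sh    # set up ROOT 6.22
root-config --version                           # verify the ROOT version in the environment
source /work/app/cms/cmsset_default.sh          # set up the CMSSW environment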

Instructions

2021-04-03: Obsolete information, to be updated.


Kubernetes


-- PhatSrimanobhas - 2016-09-03 (last updated 2021-05-08)
