SCITAS Cluster Processing
We followed the intro SCITAS Course to connect to the Clusters
Documentation is here:
https://scitas-data.epfl.ch/public/training/using_the_clusters.pdf
https://scitas-data.epfl.ch/kb/File+systems for information on the file systems
Useful UNIX commands
- ls list content of directory
- id returns all groups you belong to
- pwd displays current directory
- mkdir makes a folder
- wget downloads a file or a web page from a URL
- cat displays the content of a file in the terminal
- curl (curl www.perdu.com) prints the content of a web page in the terminal
- man displays help on a Linux command, man wget for instance
- cd ~ goes to home directory
- nano filename reasonably easy to use text editor
- echo $SCRATCH print env variable
To execute a file:
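- Add ./ before the file name, for instance ./myscript.sh. The shell only looks for commands in the directories listed in $PATH, which usually does not include the current directory; an absolute path already says where the file is, so no prefix is required.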
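As a small illustration of the rule above, the following sketch creates a throwaway script and runs it both ways. The file name hello.sh and the temporary folder are assumptions made for the example, not files that exist on the cluster.

```shell
#!/bin/bash
# Work in a throwaway directory so nothing in your home is touched.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# Create a tiny script (hello.sh is a hypothetical name).
printf '#!/bin/bash\necho "hello from a script"\n' > hello.sh

# Make it executable, otherwise the shell refuses to run it.
chmod +x hello.sh

# Relative form: ./ tells the shell to look in the current directory.
./hello.sh

# Absolute form: the full path already locates the file, no ./ needed.
"$workdir/hello.sh"
```

Typing hello.sh alone would typically fail with "command not found", because the current directory is usually not part of $PATH.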
UNIX commands you can forget
- vi not so modern torture method based on a text editor
Cluster Commands
- sacct retrieve info on recent job launches
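- squeue displays cluster queue (all users' jobs)
- Squeue displays cluster queue (your jobs)
- module purge unloads all currently loaded modules
- module load intel loads the intel module
- module spider lists all modules available on the system
- module list displays loaded modules (module show modulename displays what a given module changes in the environment)
- scontrol -dd show job displays detailed information about jobs (append a job id to select one)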
Structure of cluster
/home/username your personal area, shared among all 3 clusters
/work/ptbiop where biop software is installed, shared among all 3 clusters
/scratch/username is a high performance file system where we put the data we need to work on, it is not shared and depends on the cluster you are connected to
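To see which of these areas are visible from the machine you are logged in to, a short check like the one below may help. It is only a sketch: the function name describe_area is made up for the example, and it assumes that $SCRATCH is set by the cluster environment, falling back to /scratch/username otherwise.

```shell
#!/bin/bash
# Report whether a storage area exists on the current machine.
describe_area() {
    local label="$1" path="$2"
    if [ -d "$path" ]; then
        echo "$label: $path (available)"
    else
        echo "$label: $path (not found on this machine)"
    fi
}

describe_area "home" "$HOME"
describe_area "work" "/work/ptbiop"
# Assumption: $SCRATCH is defined on the clusters; otherwise guess the path.
describe_area "scratch" "${SCRATCH:-/scratch/${USER:-$(id -un)}}"
```

Run on a laptop, the work and scratch lines are expected to report "not found", which is a quick way to confirm that you are really on a cluster.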
How to connect
Scitas documentation
There are 3 clusters at EPFL named deneb1, deneb2, fidis.
- Linux: Type ssh gasparlogin@clustername.epfl.ch and give your password. For instance if Nico wants to connect to the deneb1 cluster (there's also deneb2 and fidis) : ssh chiarutt@deneb1.epfl.ch.
- Windows:
- On Windows 10, there is a built-in ssh client: execute cmd.exe and just type ssh gasparlogin@deneb1.epfl.ch in your Windows terminal
- Alternative method: install and use PuTTY
How to transfer data into the server
Terminal command line
It can be useful to transfer directly from the internet :
- From the internet: wget http://linkToMyFile.what. For instance, to download ilastik: wget http://files.ilastik.org/ilastik-1.3.0-Linux.tar.bz2
- From a windows shared server (svraw1): smbclient '//svraw1.epfl.ch/ptbiop/c$' -c 'lcd /home/chiarutt/examples; cd public; get test.txt' -U intranet/chiarutt. This rather unintuitive command copies the file //svraw1.epfl.ch/biop/public/test.txt to the folder /home/chiarutt/examples. You'll need to enter your gaspar password. Type cat test.txt in the correct folder to see that you've indeed downloaded this file.
From your local computer with a Graphical User Interface
- Windows users: This can be done via FileZilla, but take care, its installer now tries to install unwanted software during the installation.
- Linux users: in nautilus, go to ssh://gasparlogin@deneb1.epfl.ch
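A command-line alternative from your local computer is scp, which uses the same ssh connection as the login. The helper below is a sketch only: the function name upload_to_scratch is invented for the example, and it assumes that your scratch folder is /scratch/gasparlogin on the target cluster.

```shell
#!/bin/bash
# Copy a local file or folder to your scratch area on a cluster.
# Arguments: gaspar login, cluster name, local path.
upload_to_scratch() {
    local login="$1" cluster="$2" source_path="$3"
    # -r copies folders recursively; it is harmless for single files.
    scp -r "$source_path" "${login}@${cluster}.epfl.ch:/scratch/${login}/"
}
```

For instance, upload_to_scratch chiarutt deneb1 ./images would ask for the gaspar password and copy the local images folder to /scratch/chiarutt/ on deneb1.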
Where
For temporary jobs, put your data in /scratch/gasparlogin. For instance Nico can type cd /scratch/chiarutt then ls to list the contents of this folder within a terminal.
Software available
Software is installed in the work folder (/work/ptbiop). Fiji is available. As several Fiji installations may be required depending on the configuration, the first one, which does not have any plugin installed, is the default one.
How to install new software
You can install software (Unix executables / Java applications) either into your own folder or into the shared /work/ptbiop/ folder. If you cannot execute it, make sure that the file is executable. For instance for Fiji, type chmod +x ImageJ-linux64.
DEFAULT FIJI
No update site installed. Just a raw updated fiji.
- Location folder: /work/ptbiop/DefaultFiji/Fiji.app/
- Fiji can be updated using command lines:
- Update FIJI using this command: /work/ptbiop/DefaultFiji/Fiji.app/ImageJ-linux64 --update update
Some sparse documentation can be found on the ImageJ.net website:
FIJI0
Fiji0 is the fiji where update sites can be installed.
- Location folder: /work/ptbiop/Fiji0/Fiji.app/
- Fiji can be updated using command lines:
- Update FIJI using this command: /work/ptbiop/Fiji0/Fiji.app/ImageJ-linux64 --update update
- To add an update site, see documentation
Ilastik
Ilastik is installed.
- Location folder: /work/ptbiop/Ilastik/ilastik-1.3.0-Linux/
CellProfiler
TODO, currently not installed.
Testing simple tasks on the cluster
By default you are connected to the so-called login node, which is not a processing node. You can browse, copy files, do your stuff, even launch ImageJ / Ilastik / whatever, but this is not where you are supposed to do heavy processing!
Described below:
- how to launch process in the login node (in order to understand a bit about unix commands)
- how to launch a job, i.e. a request to run a process on a processing node. Look at the Scitas documentation for detailed information
Example simple Job on the CLUSTER
We will launch a job on the cluster. To do this, we need to encapsulate the command into a file which gives information about how the job(s) should be launched and with which hardware. A simple file like this is given here:
cat /work/ptbiop/jobs/samples/simplejob.sh
cat is used to display the file in the terminal. This job simply displays the name of the node it will be executed on.
- To launch this job, type:
- sbatch /work/ptbiop/jobs/samples/simplejob.sh
- To know the current status of the job:
- Squeue
- To retrieve info on recent job launches:
- sacct
- When the job is done, a file will appear in your current folder or in the working dir if it has been specified in the batch file by adding a line similar to #SBATCH --workdir /scratch/chiarutt. This file is of the form slurm-jobid.out and you can type cat slurm-jobid.out, which will display something similar to hello from r06-node26.
Congrats! You've launched a job on the cluster!
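For readers without access to /work/ptbiop, a batch file of this kind could look like the sketch below. This is not the actual content of simplejob.sh, which is not reproduced on this page; the resource values (one task, 1G of memory, five minutes) are assumptions chosen to keep the request small.

```shell
#!/bin/bash
#SBATCH --nodes 1
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 1
#SBATCH --mem 1G
#SBATCH --time 00:05:00

# Slurm reads the #SBATCH lines above; for bash they are plain comments,
# so this file also runs as an ordinary script outside the cluster.
echo "hello from $(hostname)"
```

Saved under a name of your choice and submitted with sbatch, it should produce a slurm-jobid.out file containing the name of the node that ran it.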
Example Ilastik Launch - Login NODE
Headless ilastik operation documentation can be found here: http://ilastik.org/documentation/basics/headless.html
Launching ilastik, just for useless fun:
/work/ptbiop/Ilastik/ilastik-1.3.0-Linux/run_ilastik.sh --headless
Launching ilastik for a classification task, on the login node. There is an example project file on /work/ptbiop/sampledata/ilastik/ilps/MyProject.ilp which contains a classifier, and a sample image on /work/ptbiop/sampledata/ilastik/image/Vesicles.tif. To classify this image with the classifier contained in the project, type:
/work/ptbiop/Ilastik/ilastik-1.3.0-Linux/run_ilastik.sh --headless --project=/work/ptbiop/sampledata/ilastik/ilps/MyProject.ilp /work/ptbiop/sampledata/ilastik/image/Vesicles.tif
A file named Vesicles_probabilities.h5 should have appeared in the folder /work/ptbiop/sampledata/ilastik/image/. You can delete this probabilities file and rerun the command to be sure it was produced by your run.
Example Ilastik Job on the CLUSTER
sbatch /work/ptbiop/jobs/samples/ilastik.sh
Check with Squeue and sacct.
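The content of ilastik.sh is not shown here. As an illustration only, the snippet below writes a hypothetical batch file named ilastik_job.sh that wraps the headless command used on the login node; the resource values (4 CPUs, 8G, 30 minutes) are guesses, not recommendations.

```shell
#!/bin/bash
# Write a hypothetical batch file (ilastik_job.sh is an assumed name).
cat > ilastik_job.sh <<'EOF'
#!/bin/bash
#SBATCH --nodes 1
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 4
#SBATCH --mem 8G
#SBATCH --time 00:30:00

ILASTIK=/work/ptbiop/Ilastik/ilastik-1.3.0-Linux/run_ilastik.sh
PROJECT=/work/ptbiop/sampledata/ilastik/ilps/MyProject.ilp
IMAGE=/work/ptbiop/sampledata/ilastik/image/Vesicles.tif

# Same headless call as on the login node, now executed on a compute node.
"$ILASTIK" --headless --project="$PROJECT" "$IMAGE"
EOF

# Check the syntax without running anything. On the cluster, submit with:
#   sbatch ilastik_job.sh
bash -n ilastik_job.sh && echo "ilastik_job.sh looks syntactically valid"
```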
Example Fiji Launch - Login NODE
To launch Fiji, just type:
/work/ptbiop/DefaultFiji/Fiji.app/ImageJ-linux64 --ij2 --headless
This launches a Fiji instance, which is doing nothing special.
It is also possible to run a simple script. For instance go to /work/ptbiop/samplescripts/ij/ and type nano hello.py. This script has one parameter and simply greets the user by name. Type 'Ctrl+X' to exit nano, and then type:
/work/ptbiop/DefaultFiji/Fiji.app/ImageJ-linux64 --ij2 --headless --run /work/ptbiop/samplescripts/ij/hello.py 'name="Patrick"'
You've said Hello to Patrick with Fiji on the cluster. That's quite an achievement. Congrats!
However, this was done on a login node.
Example ImageJ Job on the CLUSTER
sbatch /work/ptbiop/jobs/samples/fiji.sh
Check with Squeue and sacct.
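The content of fiji.sh is not shown here either. The sketch below writes a hypothetical fiji_job.sh that runs the hello.py script on a compute node and takes the name to greet as an optional argument, relying on the fact that sbatch passes any arguments placed after the script name on to the script. The file name and resource values are assumptions.

```shell
#!/bin/bash
# Write a hypothetical batch file (fiji_job.sh is an assumed name).
cat > fiji_job.sh <<'EOF'
#!/bin/bash
#SBATCH --nodes 1
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 1
#SBATCH --mem 4G
#SBATCH --time 00:10:00

FIJI=/work/ptbiop/DefaultFiji/Fiji.app/ImageJ-linux64
SCRIPT=/work/ptbiop/samplescripts/ij/hello.py

# First argument is the name to greet; Patrick is the default.
NAME="${1:-Patrick}"

# Same headless call as on the login node, with the name filled in.
"$FIJI" --ij2 --headless --run "$SCRIPT" "name=\"${NAME}\""
EOF

# Check the syntax without running anything. On the cluster, submit with:
#   sbatch fiji_job.sh Nico
bash -n fiji_job.sh && echo "fiji_job.sh looks syntactically valid"
```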
Example CellProfiler Launch
TODO
- Last Author
- oburri
- Last Edited
- Oct 26 2021, 10:35