Cluster 101

What is Cluster101?

Cluster101 is a simple tool that allows you to run applications on many hosts. You can think of it as a front end for ssh ("ssh on steroids").

Requirements

You just need ssh access to the hosts where you want to run applications. No need to install any software on them (i.e. no need to bother the administrators).

WARNING: You should be able to access all the hosts without typing your password (typing your password hundreds of times is not fun). The usual way to do this is ssh public key authentication.
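
As a rough sketch of how key-based login is usually set up with the standard OpenSSH tools (ssh-keygen, ssh-copy-id, ssh): the function names, the key file name and the user / host names below are only illustrative assumptions, they are not part of Cluster101. Note that a key protected by a passphrase still needs ssh-agent to avoid prompts.

```shell
#!/bin/sh
# Sketch: key-based ssh login. User and host names are placeholders.

# Create a key pair once, unless one already exists.
# (Interactive: ssh-keygen asks for a passphrase.)
make_key() {
    [ -f "$HOME/.ssh/id_ed25519" ] || ssh-keygen -t ed25519 -f "$HOME/.ssh/id_ed25519"
}

# Install the public key on USER@HOST; asks for the password one last time.
install_key() {
    ssh-copy-id -i "$HOME/.ssh/id_ed25519.pub" "$1@$2"
}

# Succeed only if USER@HOST accepts a login without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password.
# SSH may be overridden for testing; it defaults to the real ssh client.
check_host() {
    ${SSH:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$1@$2" true
}

# Usage (run by hand, once per host):
#   make_key
#   install_key myUserName host-1.university.edu
#   check_host  myUserName host-1.university.edu && echo "passwordless login OK"
```

Running check_host against every host before loading a configuration is a cheap way to find the hosts that would still ask for a password.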

What is the typical usage of Cluster101?

You need to run hundreds of instances of a scientific application (e.g. a complex statistical algorithm with different sets of data / parameters). You have access to a lot of computers around your campus / company, but don't have administrator privileges to install clustering software.
You can use Cluster101 to run one application per host (or one per core) on those computers. Cluster101 will take care of distributing the load until the whole batch is finished.

What Cluster101 is NOT

This is NOT a fully featured clustering platform or development framework or process communication API. This is just a tool to run tasks on many hosts.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Features

  • Run a list of tasks on all hosts
  • Keep load balanced
  • Selectable number of instances per host (e.g. depending on number of cores)
  • Run a command once per host
  • Save stdout / stderr for each task
  • Monitor hosts for alarm conditions
  • Avoid running tasks on hosts with alarm conditions

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Download CVS

  • cvs -z3 -d:pserver:anonymous@cluster101.cvs.sourceforge.net:/cvsroot/cluster101 checkout -P Cluster101

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Configuration

Cluster101 will load a configuration file named "default.cluster101" on startup. You can load an alternate configuration by clicking on the "File -> Load Configuration" menu.

The configuration file should look like this (for each host, you have to set the user name and the maximum number of processes you want to run):
Host : host-1.university.edu
    User : myUserName
    Max number of processes : 2
    Enabled : true
Host : host-2.university.edu
    User : myUserName
    Max number of processes : 2
    Enabled : false
...
...
...
Host : host-1499.university.edu
    User : myUserName
    Max number of processes : 8
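
With hundreds of hosts, typing these entries by hand gets tedious. A minimal sketch of a generator follows, assuming the file format is exactly what the example above shows (nothing more is known about it here). The function name gen_config and the input file hosts.txt are hypothetical, and the same user and process limit are used for every host.

```shell
#!/bin/sh
# Sketch: write one configuration entry per host.
# Reads host names from stdin, one per line; blank lines are skipped.

# gen_config USER MAX_PROCS
gen_config() {
    user=$1
    max=$2
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        printf 'Host : %s\n' "$host"
        printf '    User : %s\n' "$user"
        printf '    Max number of processes : %s\n' "$max"
        printf '    Enabled : true\n'
    done
}

# Usage:
#   gen_config myUserName 2 < hosts.txt > default.cluster101
```

Hosts with a different number of cores can be handled by running the function once per group of hosts and concatenating the results.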

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Task queue

The task queue is a list of the tasks you want to run. You just type them in a file, one task per line.
E.g. you want to run your algorithm with different parameters:

/home//mySuperSecretAlgotihm 0.1 
/home//mySuperSecretAlgotihm 0.2 
/home//mySuperSecretAlgotihm 0.3 
/home//mySuperSecretAlgotihm 0.6 
...
...
...
/home//mySuperSecretAlgotihm 9.9

Each line must be a valid shell command or a semicolon-separated list of commands (e.g. "date; sleep 1; date" is a valid line in a task list).
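
For a large parameter sweep, the task list can be generated instead of typed. The following is only a sketch: the function name gen_tasks, the command path and the output file name tasks.txt are made up for illustration.

```shell
#!/bin/sh
# Sketch: print one task line per parameter value.

# gen_tasks COMMAND VALUE...
gen_tasks() {
    cmd=$1
    shift
    for value in "$@"; do
        printf '%s %s\n' "$cmd" "$value"
    done
}

# Usage (the command path is a placeholder):
#   gen_tasks /path/to/myAlgorithm 0.1 0.2 0.3 > tasks.txt
```

Since every line is handed to a shell, the command argument can itself be a semicolon-separated list, as long as it is quoted when calling gen_tasks.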

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Running stuff

Load your task list by selecting the "Task queue -> Load queue" menu. Then run it by clicking on the "Task queue -> Run" menu. You can see the progress by clicking on the "Task queue" tab.

If you need to run one task on all the hosts (e.g. a script to install your software), you can do it by clicking on the "Task queue -> Run 1 per host" menu. It will show you a dialog box where you type the command you want to run.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Screen Snapshot

This is a screen capture of the program running on a small cluster

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Author: Pablo Cingolani (pcingola@users.sourceforge.net)
Contributors: Felipe Barriga Richards (spam at felipebarriga.cl)