RGANG


Abstract
RGANG is a tool for executing commands on, or distributing files to, many nodes (computers). It incorporates an algorithm that builds a tree-like (or "worm") structure so that distribution time scales well to 1000 or more nodes.

Because the original "RGANG" executed commands on the specified nodes serially, its execution time was proportional to the number of nodes. A parallel version of "RGANG" has been implemented in Python. This version forks separate rsh/ssh children, which execute in parallel. After waiting for each child to return, or after timing out, this latest version of RGANG displays the node responses in the same fashion as the original "shell" version of RGANG. In addition, the latest RGANG returns the OR of the exit status values of the commands executed on each of the nodes. Simple commands run via this RGANG on an 80-node cluster complete in about 3 seconds.
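The fork, wait, and OR steps described above can be illustrated with a minimal sketch. This is not RGANG's actual code: the names `run_parallel`, `ssh_argv`, and `make_argv` are hypothetical, the OR is assumed to be a bitwise OR, and treating a timed-out child as exit status 1 is also an assumption.

```python
import subprocess
import time
from functools import reduce
from operator import or_


def ssh_argv(node, command):
    # Hypothetical launcher: the argv that would run `command` on `node`.
    return ["ssh", node, command]


def run_parallel(nodes, command, timeout=10.0, make_argv=ssh_argv):
    """Start one child per node, wait for all, return (results, or_status)."""
    # Start every child before waiting on any, so they run in parallel.
    children = {
        node: subprocess.Popen(
            make_argv(node, command),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        for node in nodes
    }
    deadline = time.monotonic() + timeout  # one shared deadline for all children
    results = {}
    for node, child in children.items():
        remaining = max(0.0, deadline - time.monotonic())
        try:
            output, _ = child.communicate(timeout=remaining)
            status = child.returncode
            if status < 0:
                # Child was killed by a signal; use the shell's 128+N convention.
                status = 128 - status
        except subprocess.TimeoutExpired:
            child.kill()
            output, _ = child.communicate()
            status = 1  # assumption: a timeout counts as a failure
        results[node] = (status, output)
    # Combine the per-node statuses with a bitwise OR (assumed semantics).
    or_status = reduce(or_, (status for status, _ in results.values()), 0)
    return results, or_status
```

Under these assumptions, `run_parallel(["node01", "node02"], "uptime")` would return a dictionary of per-node `(status, output)` pairs plus the combined status. A combined status of 0 means every node reported success.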

To allow scaling to kiloclusters, the new RGANG can use a tree structure, selected with the "nway" switch. When so invoked, RGANG uses rsh/ssh to spawn copies of itself on multiple nodes. These copies in turn spawn additional copies.
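One way to picture the tree is the sketch below, which splits a node list into at most `nway` groups and makes the first node of each group responsible for the rest of its group. The partitioning scheme and the names `split_nway`, `build_tree`, and `depth` are illustrative assumptions; the text above does not specify how RGANG itself divides the nodes.

```python
def split_nway(nodes, nway):
    """Split nodes into at most nway contiguous groups of near-equal size."""
    nway = max(1, min(nway, len(nodes)))
    base, extra = divmod(len(nodes), nway)
    groups, start = [], 0
    for i in range(nway):
        size = base + (1 if i < extra else 0)  # first `extra` groups get one more
        groups.append(nodes[start:start + size])
        start += size
    return groups


def build_tree(nodes, nway):
    """Return {head: subtree}; each head would spawn a copy for its subtree."""
    if not nodes:
        return {}
    tree = {}
    for group in split_nway(nodes, nway):
        head, rest = group[0], group[1:]
        tree[head] = build_tree(rest, nway)  # the head handles the rest of its group
    return tree


def depth(tree):
    """Number of spawn levels between the invoking host and the deepest node."""
    return 0 if not tree else 1 + max(depth(sub) for sub in tree.values())
```

With this particular scheme, 1000 nodes and an `nway` value of 10 give a tree three levels deep, so no single host has to start more than 10 children.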

last modified 07/01/2001   rgang-support@fnal.gov
Fermi National Accelerator Laboratory