Archive for June, 2007

Preliminary Javadoc completed

Monday, June 25th, 2007

Between driving from San Francisco and selling furniture on Craigslist, this weekend I wrote preliminary Javadoc for all of last summer’s GridSweeper work. A very valuable exercise before diving into coding: it made me look through every single method I wrote and say something about it. It also brought a number of design flaws to my attention, duly noted in TODO comments.

High-level GridSweeper execution overview

Thursday, June 21st, 2007

The purpose of GridSweeper is to take a simple user-provided description of what parameter settings to run a model with, run the model on a grid, and return results to the user.

The user will be able to manipulate the parameter-sweep description in three ways: (1) using an XML specification file, (2) with command-line arguments, and (3) with a graphical user interface. These three mechanisms can be mixed: command-line arguments can augment or override XML as well as be saved back out to XML, and the GUI tool will serve to edit and save XML files as well.

Ultimately, user action will result in running the GridSweeper program, which turns parameter sweep specifications into job specifications for the grid system via DRMAA. Specifically, the program does the following:

  1. Parses the XML specification and command-line arguments to generate an Experiment object.
  2. Generates a list of ExperimentCase objects (parameter value settings) from the Experiment object.
  3. Sets up an output directory for the files generated by this experiment. If a shared filesystem is not present, this can be done via FTP or other file-transfer mechanism supported by a plugin implementing a subclass of FileTransferSystem.
  4. Starts a DRMAA session and submits a job for each experiment case, using an archived RunSetup object for each job’s standard input.
  5. Still unimplemented: monitors the results of jobs and reports status changes to the user.
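
Step 2 can be sketched roughly as follows. This is a minimal illustration, not the actual Experiment/ExperimentCase code: the class name CaseGenerator and the method generateCases are hypothetical, and it assumes that a sweep means running every combination of the listed values for each parameter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not the real GridSweeper Experiment/ExperimentCase classes.
class CaseGenerator {
    /**
     * Expands a map of parameter name -> candidate values into every
     * combination of settings (assumption: a sweep is a full Cartesian product).
     */
    static List<Map<String, String>> generateCases(Map<String, List<String>> sweeps) {
        List<Map<String, String>> cases = new ArrayList<>();
        cases.add(new LinkedHashMap<>()); // start with a single empty case
        for (Map.Entry<String, List<String>> sweep : sweeps.entrySet()) {
            List<Map<String, String>> expanded = new ArrayList<>();
            for (Map<String, String> partial : cases) {
                for (String value : sweep.getValue()) {
                    // Copy the partial case and add one value of this parameter.
                    Map<String, String> next = new LinkedHashMap<>(partial);
                    next.put(sweep.getKey(), value);
                    expanded.add(next);
                }
            }
            cases = expanded;
        }
        return cases;
    }

    public static void main(String[] args) {
        Map<String, List<String>> sweeps = new LinkedHashMap<>();
        sweeps.put("beta", Arrays.asList("0.1", "0.2"));
        sweeps.put("gamma", Arrays.asList("1", "2", "3"));
        for (Map<String, String> settings : generateCases(sweeps)) {
            System.out.println(settings);
        }
    }
}
```

With two values for one parameter and three for another, this yields six cases, each of which would become one submitted job in step 4.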

The way things are set up now, GridSweeper requires support on both the submission end and the execution end of the grid. The DRMAA job specification tells the execution host to run not the model itself but the GridSweeperRunner program, which takes the input data and uses it to actually run the model. Specifically, it does the following:

  1. Unarchives the RunSetup object from standard input.
  2. If necessary, downloads input files via the file transfer mechanism.
  3. Actually runs the model using an instance of the Adapter class specified by the user (explicitly, or implicitly by using, e.g., gdrone for the Drone compatibility adapter). The Adapter object knows how to take a set of parameters and send it to a particular type of model executable.
  4. If necessary, uploads output files via the file transfer mechanism.
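
To make the adapter idea in step 3 concrete, here is a rough sketch. Everything in it is hypothetical: the interface ModelAdapter, the class FlagStyleAdapter, and the --name=value argument convention are assumptions for illustration, and the real Adapter class may look quite different.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the adapter idea; not the real GridSweeper Adapter class.
class AdapterSketch {
    /** Knows how to hand a set of parameters to one kind of model executable. */
    interface ModelAdapter {
        List<String> buildCommand(String modelPath, Map<String, String> parameters);
    }

    /** Illustrative adapter for a model that accepts --name=value arguments. */
    static class FlagStyleAdapter implements ModelAdapter {
        @Override
        public List<String> buildCommand(String modelPath, Map<String, String> parameters) {
            List<String> command = new ArrayList<>();
            command.add(modelPath);
            for (Map.Entry<String, String> p : parameters.entrySet()) {
                command.add("--" + p.getKey() + "=" + p.getValue());
            }
            return command;
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("beta", "0.1");
        params.put("seed", "42");
        ModelAdapter adapter = new FlagStyleAdapter();
        // The resulting list could be handed to java.lang.ProcessBuilder to launch the model.
        System.out.println(adapter.buildCommand("./mymodel", params));
    }
}
```

Running this prints [./mymodel, --beta=0.1, --seed=42]. A different model type would get its own adapter that formats the same parameters differently, for example by writing them to an input file.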

One problem with this mechanism is that it submits a separate job for every experiment case, bypassing DRMAA’s notion of batch jobs. DRMAA batch jobs let you submit a whole bunch of jobs at the same time by specifying that each job is the same except for an integer specifier, and that specifier can be used as a variable in command-line arguments. Because some systems may be faster at accepting batch jobs than a pile of individual jobs, it might be worth using the batch job mechanism.

One way to do this would be to defer the calculation of parameter assignments and random seeds to the execution host, but that makes it impossible to generate a file for reproducing the experiment as soon as it is submitted. A better way is to generate a series of input files in the experiment directory, named with the batch run index, and have the GridSweeperRunner tool read those files at runtime rather than reading an object from standard input.
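
A minimal sketch of the input-file approach, under stated assumptions: the class BatchInputFiles, the case-N.properties naming scheme, the 1-based index, and the seed-as-offset rule are all made up for illustration and are not GridSweeper's actual format.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch; file names and layout are assumptions, not GridSweeper's format.
class BatchInputFiles {
    /** Submission side: writes one input file per case, named by 1-based batch index. */
    static void writeCaseFiles(Path experimentDir, List<Map<String, String>> cases, long baseSeed)
            throws IOException {
        Files.createDirectories(experimentDir);
        for (int i = 0; i < cases.size(); i++) {
            Properties props = new Properties();
            props.putAll(cases.get(i));
            // Seeds are fixed at submission time so the experiment can be reproduced later.
            props.setProperty("seed", Long.toString(baseSeed + i));
            try (OutputStream out = Files.newOutputStream(caseFile(experimentDir, i + 1))) {
                props.store(out, "case " + (i + 1));
            }
        }
    }

    /** Execution side: loads the settings for the batch index assigned to this job. */
    static Properties readCaseFile(Path experimentDir, int batchIndex) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(caseFile(experimentDir, batchIndex))) {
            props.load(in);
        }
        return props;
    }

    static Path caseFile(Path experimentDir, int batchIndex) {
        return experimentDir.resolve("case-" + batchIndex + ".properties");
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("experiment");
        List<Map<String, String>> cases = Arrays.asList(
                Collections.singletonMap("beta", "0.1"),
                Collections.singletonMap("beta", "0.2"));
        writeCaseFiles(dir, cases, 1000L);
        System.out.println(readCaseFile(dir, 2)); // settings for the second batch job
    }
}
```

Because every parameter assignment and seed is on disk before submission, the experiment directory itself serves as the reproduction record, and the runner only needs the batch index passed to it as a command-line argument.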

GridSweeper installation hierarchy

Thursday, June 21st, 2007

As currently conceived, GridSweeper will consist of a set of Java classes in JAR files, additional Java classes as plugins (plugin format to be determined, but will include a JAR file), and shell scripts to simplify this:

java -cp ${GRIDSWEEPER_ROOT}/classes/GridSweeper.jar \
    com.edbaskerville.gridsweeper.GridSweeper [args]

into this:

gsweep [args]

The top level of the hierarchy will be designated by the environment variable $GRIDSWEEPER_ROOT, within which the following tree will exist:

$GRIDSWEEPER_ROOT/
    bin/
        gsweep
            (main GridSweeper submission executable)
        gdrone
            (shortcut to gsweep -a com.edbaskerville.gridsweeper.DroneAdapter)
        grunner
            (wrapper to actually execute jobs on the agent machine)
        ...
            (other scripts to shortcut, e.g., the Repast adapter)
    classes/
        classes.jar
            (all classes except those with main methods)
        GridSweeper.jar
            (app/tool class)
        GridSweeperRunner.jar
            (class to actually run simulations on agents)
    plugins/
        (contains add-on adapters and file-transfer systems)

Setting up the GridSweeper build environment

Thursday, June 21st, 2007

First things first: this post covers how to get the GridSweeper build environment set up on your machine. I’ve developed GridSweeper entirely with Eclipse, but the build process uses Ant, so it can be run from the command line as well (or, theoretically, any other Ant-compatible IDE).

To get GridSweeper building on your machine, you’ll need to get four things:

  1. The code distribution (trunk), checked out into your Eclipse workspace. Soon to be hosted at CSCS.
  2. An implementation of the Java Distributed Resource Management Application API (DRMAA). For CSCS/Linux, you should use the one provided by the Sun Grid Engine (in /appl/sge/drmaa.jar on CSCS machines). For building on my Mac, I’m using my XgridDRMAA implementation.
  3. Jakarta Commons Net (download page). This is for FTP file transfer, which won’t actually be relevant for CSCS—maybe I can modify the build system to make this optional.
  4. Jakarta ORO (download page), also for FTP. You won’t even realize you’re missing this until you get an obscure class not found error at runtime when using any of the FTP directory methods.

If you’re using Eclipse (recommended), open up the project in your workspace. Add the DRMAA, Jakarta Commons Net, and Jakarta ORO jar files (Project > Properties > Java Build Path > Add External JARs…), and, in theory, the project should build.

Next: evaluating the code.

GridSweeper code review

Thursday, June 21st, 2007

I’m beginning this year’s GridSweeper development by picking up all the pieces from last summer, documenting them, and considering design changes, then getting down to the business of finishing up the implementation. The next few posts will cover what I find.

Once there’s a wiki hosted by CSCS, I’ll organize this information there.

Report on git

Tuesday, June 19th, 2007

After seeing this video of Linus Torvalds waxing arrogant about git, the source code management system he wrote for Linux kernel development, I decided I should try it out.

The benefits of git over the CVS/Subversion model amount to this: (1) git is distributed, and (2) git is fast. (2) is partly a result of (1), partly a result of Linus being a wizard programmer.

For git, “distributed” means that every working directory is a full-fledged repository in its own right. That is, there is no single repository that all developers check out of and commit into. Commits are done locally, into your own copy of the repository. Sharing is done only explicitly, so nobody can screw up your personal copy of the code with their code, and vice versa. Distribution also gets you a bunch of speed, because commits are simply written into your local filesystem. Beyond that, Linus used a lot of impressive algorithms to make things generally fast.

I used Fink to install git on my OS X machines (it’s in the unstable tree), and found that it works pretty much as advertised. The tutorial at kernel.org is quite nice; below is a quick summary of the most important things I gleaned from it.

Installation

On Mac OS X, you need to install Fink if you don’t have it. Then add unstable/main and unstable/crypto to /sw/etc/fink.conf, and then do

fink selfupdate; fink index; fink scanpackages

to activate the unstable trees.

Then do

fink install git

and type your admin password when it’s asked for.

On Linux, it’s probably just a matter of installing it from whatever package manager your distro has.

User Configuration

For the sake of the logs, you should first configure git to know your name and email address:

git config --global user.name "Ed Baskerville"
git config --global user.email notmyemailaddress@edbaskerville.com

Creating a Repository

To make a new repository from scratch and start working on it, you create the working directory and cd into it, create the repository inside, and then check out the empty main branch:

mkdir myproj
cd myproj
git init
git checkout

Notice I didn’t specify any arguments to init or checkout. Init assumes you want to create a repository called .git inside the current directory, and checkout assumes you want to check out code from .git into the current directory to begin working. So the standard directory structure is this:

myproj/ [the working directory]
    .git/ [the repository directory]
        ...
        [the git repository's internal state]
        ...
    ...
    [working source files]
    ...

Cloning a Repository

If there’s an existing repository that you want a copy of, you don’t check out a working copy as with CVS/Subversion. Rather, you clone it, making a copy of the complete repository for yourself. This works from a local file, via ssh, via http, or through git-daemon, which needs to be installed on the “server” machine. Examples:

git clone /local/path
git clone ssh://server.name/remote/path
git clone http://www.kernel.org/pub/scm/git/git.git
git clone git://git.sv.gnu.org/lilypond.git

Making Changes

Making and committing changes is pretty much just like CVS or Subversion: you edit or add files in your working directory, and then commit them, only your commits go into the local .git repository directory.

So you could create a file called myprogram.c:

echo "I bet this won't compile" > myprogram.c
git add myprogram.c
git commit

One catch is that you need to do a git add before a commit for all files that have changed, not just files that are new. You can get around this with the -a switch:

git commit -a

will first add any files that have changed to the commit (but not new files that have been created—they must be added explicitly) and then commit.

Pulling and Pushing

Source code management isn’t much use unless you can share your work. If somebody else has changes you want to merge into your copy of the repository, use git pull:

git pull /path/to/repository

If you’re operating on a CVS/Subversion-like model with a central master repository, push can be used to put your changes into that repository:

git push ssh://server.name/path/to/repository

Branches

I won’t spend time on branching, but one of the key reasons to use git is that branches are cheap and easy, so you can easily split off experimental branches to do things that might fail spectacularly. The tutorial has good information about all this.