Archive for July, 2006

From control files to experiment runs…

Thursday, July 20th, 2006

Here’s how a set of parameter sweeps will get translated into an actual experiment run…

  1. Generate a tree of Sweep objects from a control file and/or command-line arguments. (In the case of the GUI, the Sweep objects will be generated live as the model backing the view.)
  2. Get a list of parameter maps by calling generateMaps() on the top-level sweep.
  3. Convert that list of parameter maps into a list of jobs for the grid system, querying the plugin for the model system (Repast, Drone, etc.) along the way to generate the job submission data (and do things like stage files if there isn’t a shared filesystem).
  4. Submit the jobs to the grid.
  5. On the client/submit end, monitor the progress of jobs using output from the CLI tool or via the GUI.
  6. On the agent end, run the jobs by passing the information to the plug-in. If required, stage files back to the FTP (or other) server when the job is done.
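Steps 2 and 3 above might look roughly like the following sketch. Everything here is an assumption made for illustration, not actual GridSweeper code: the ModelPluginSketch interface, the buildJobs method, the idea of repeating each parameter map a fixed number of times, and the made-up "java -jar" command line produced by the demo plug-in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical plug-in interface: turns one parameter map into one job's command line.
interface ModelPluginSketch {
    List<String> commandLine(String model, Map<String, String> parameters, int run);
}

class PipelineSketch {
    // Step 3: one job per parameter map per repeated run.
    static List<List<String>> buildJobs(ModelPluginSketch plugin, String model,
                                        List<Map<String, String>> maps, int runs) {
        List<List<String>> jobs = new ArrayList<>();
        for (Map<String, String> map : maps) {
            for (int run = 0; run < runs; run++) {
                jobs.add(plugin.commandLine(model, map, run));
            }
        }
        return jobs;
    }

    public static void main(String[] args) {
        // Made-up plug-in: passes each setting as name=value after the jar name.
        ModelPluginSketch plugin = (model, parameters, run) -> {
            List<String> command = new ArrayList<>(List.of("java", "-jar", model + ".jar"));
            new TreeMap<>(parameters).forEach((name, value) -> command.add(name + "=" + value));
            command.add("run=" + run);
            return command;
        };
        // Stand-in for the output of generateMaps() in step 2.
        List<Map<String, String>> maps = List.of(Map.of("beta", "0.1"), Map.of("beta", "0.2"));
        for (List<String> job : buildJobs(plugin, "mymodel", maps, 2)) {
            System.out.println(String.join(" ", job)); // step 4 would submit this to the grid
        }
    }
}
```

Keeping the model-specific part behind a small interface means the code that expands sweeps and the code that submits jobs never need to know what kind of model is being run.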

Sweeps

Thursday, July 20th, 2006

The basic model code for parameter sweeps is done. There’s a standard interface (Sweep) for all sweep types that contains a single method:

public List generateMaps()

The returned List is simply a sequence of parameter settings. Each item in the list is a ParameterMap object, which is just a subclass of HashMap with some convenience constructors.
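A minimal sketch of how the interface and ParameterMap could fit together, written with generics. The two convenience constructors and the SingleValueSweepSketch class are guesses for illustration; the real signatures may differ.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of ParameterMap: a HashMap subclass with convenience constructors.
class ParameterMap extends HashMap<String, Object> {
    ParameterMap() {
        super();
    }

    // Assumed convenience constructor: a map holding a single name/value pair.
    ParameterMap(String name, Object value) {
        super();
        put(name, value);
    }

    // Assumed convenience constructor: a copy of an existing map.
    ParameterMap(Map<String, ?> other) {
        super(other);
    }
}

// The single-method interface described above.
interface Sweep {
    List<ParameterMap> generateMaps();
}

// Simplest possible implementation: one parameter, one value, one map.
class SingleValueSweepSketch implements Sweep {
    private final String name;
    private final Object value;

    SingleValueSweepSketch(String name, Object value) {
        this.name = name;
        this.value = value;
    }

    @Override
    public List<ParameterMap> generateMaps() {
        return List.of(new ParameterMap(name, value));
    }
}
```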

Currently, there are six different concrete subclasses of Sweep, plus a few abstract subclasses defining common elements.

SingleValueSweep. Pretty simple: assigns a single value to a single parameter.

ListSweep. The first nontrivial type: assigns a list of values to a single parameter.

RangeListSweep. Probably the most useful type: lets you assign a range of values, defined with a start, end, and increment, to a single parameter. The values are represented using the arbitrary-precision BigDecimal class, so repeatedly adding the increment never accumulates binary floating-point rounding error.
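To illustrate why BigDecimal matters here, this is a small sketch of range expansion. The RangeSketch class and its range method are hypothetical names; treating the end value as inclusive and requiring a positive increment are assumptions.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

class RangeSketch {
    // Values from start to end (inclusive) in steps of increment, in exact decimal arithmetic.
    static List<BigDecimal> range(String start, String increment, String end) {
        BigDecimal step = new BigDecimal(increment);
        if (step.signum() <= 0) {
            throw new IllegalArgumentException("increment must be positive");
        }
        BigDecimal last = new BigDecimal(end);
        List<BigDecimal> values = new ArrayList<>();
        for (BigDecimal v = new BigDecimal(start); v.compareTo(last) <= 0; v = v.add(step)) {
            values.add(v);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(range("0.3", "0.1", "0.6")); // prints [0.3, 0.4, 0.5, 0.6]
        System.out.println(0.1 + 0.2);                  // doubles: prints 0.30000000000000004
    }
}
```

Constructing the BigDecimal values from strings rather than doubles is what keeps them exact; new BigDecimal(0.1) would carry the binary approximation along with it.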

LinearCombinationSweep. Combines two other sweeps “linearly” (that is, in parallel), so the first parameter map in sweep 1’s list gets combined with the first parameter map in sweep 2’s list, and so on. For example, combining beta=0.1,0.2,0.3 with gamma=0.4,0.5,0.6 would result in a length-3 LinearCombinationSweep with (beta,gamma)=(0.1,0.4), (0.2,0.5), (0.3,0.6).
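The parallel combination amounts to a zip over the two lists. In this sketch, plain maps of strings stand in for ParameterMap, the class and method names are made up, and rejecting lists of unequal length is an assumption (the behavior in that case isn't specified above).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LinearCombinationSketch {
    // Merge the i-th map of the first list with the i-th map of the second list.
    static List<Map<String, String>> combine(List<Map<String, String>> first,
                                             List<Map<String, String>> second) {
        if (first.size() != second.size()) {
            // Assumption: mismatched lengths are treated as an error.
            throw new IllegalArgumentException("sweeps must have the same length");
        }
        List<Map<String, String>> result = new ArrayList<>();
        for (int i = 0; i < first.size(); i++) {
            Map<String, String> merged = new LinkedHashMap<>(first.get(i));
            merged.putAll(second.get(i));
            result.add(merged);
        }
        return result;
    }

    // Helper: one single-entry map per value of a parameter.
    static List<Map<String, String>> values(String name, String... vals) {
        List<Map<String, String>> maps = new ArrayList<>();
        for (String v : vals) {
            maps.add(Map.of(name, v));
        }
        return maps;
    }

    public static void main(String[] args) {
        System.out.println(combine(values("beta", "0.1", "0.2", "0.3"),
                                   values("gamma", "0.4", "0.5", "0.6")));
    }
}
```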

MultiplicativeCombinationSweep. This is what most people want when varying multiple parameters: generate every combination of each parameter/value pair. So, to reuse the last example, combining beta=0.1,0.2,0.3 with gamma=0.4,0.5,0.6 results in a length-9 MultiplicativeCombinationSweep with (beta,gamma) = (0.1,0.4), (0.1,0.5), (0.1,0.6), (0.2,0.4), (0.2,0.5), (0.2,0.6), (0.3,0.4), (0.3,0.5), (0.3,0.6).
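Generating every combination is a Cartesian product, which a nested loop expresses directly. As before, this is only a sketch: the names are hypothetical and maps of strings stand in for ParameterMap. The loop order is chosen so the output matches the ordering in the example above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MultiplicativeCombinationSketch {
    // Merge every map of the first list with every map of the second list.
    static List<Map<String, String>> combine(List<Map<String, String>> first,
                                             List<Map<String, String>> second) {
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> a : first) {
            for (Map<String, String> b : second) {
                Map<String, String> merged = new LinkedHashMap<>(a);
                merged.putAll(b);
                result.add(merged);
            }
        }
        return result;
    }

    // Helper: one single-entry map per value of a parameter.
    static List<Map<String, String>> values(String name, String... vals) {
        List<Map<String, String>> maps = new ArrayList<>();
        for (String v : vals) {
            maps.add(Map.of(name, v));
        }
        return maps;
    }

    public static void main(String[] args) {
        List<Map<String, String>> all = combine(values("beta", "0.1", "0.2", "0.3"),
                                                values("gamma", "0.4", "0.5", "0.6"));
        System.out.println(all.size() + " combinations: " + all);
    }
}
```

Because combine takes and returns the same type, more than two parameters can be handled by feeding one result into the next call.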

UniformDoubleSweep. The first in a series of stochastic sweeps (more to be written), this sweep generates a specified number of values uniformly distributed within a range, so a parameter space can be explored stochastically. If you’re exploring your parameter space from 0 to 1 in increments of 0.1, and it just so happens that interesting spikes happen at 0.15, 0.25, and 0.35, a regular grid will step right over them; sampling the space stochastically at least gives you a chance of landing near them.
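A sketch of the sampling step using java.util.Random. The explicit seed parameter is an assumption added here so that a sweep can be reproduced; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class UniformDoubleSketch {
    // Draw count values uniformly from [min, max); the same seed gives the same values.
    static List<Double> sample(double min, double max, int count, long seed) {
        Random random = new Random(seed);
        List<Double> values = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            values.add(min + (max - min) * random.nextDouble());
        }
        return values;
    }

    public static void main(String[] args) {
        // Ten values somewhere between 0 and 1, rather than exactly 0.0, 0.1, 0.2, ...
        System.out.println(sample(0.0, 1.0, 10, 2006L));
    }
}
```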

CLI usage scenarios

Wednesday, July 12th, 2006

The most important piece of unfinished business in the GridSweeper design is what exactly the command-line interface will look like. It’s funny—I grew up on an old-school Mac, scoffing at my primitive DOS-using fourth-grade schoolmates. What an awful way to interact with a computer: remember arcane commands and type them in! But as soon as you start doing software development, or system administration, or anything that needs to be automated, the command line is often more efficient.

I have the same goal for the GridSweeper command-line tools as for the graphical interface: make it easy to run parameter sweeps of models. More accurately, make the most common types of parameter sweeps very easy to do; and make other types of sweeps possible, and as easy as possible.

Scenario 1: Multiple Parameters, Ranges, All Combinations

The most common usage scenario is to vary one or more parameters, and run the model one or more times for each combination of parameters. So if there are 3 parameters being varied, each with 4 different values, and the model is being run 10 times with different random seeds, there will be a total of 4 x 4 x 4 x 10 = 640 runs.

Let’s say a model has three parameters, beta, gamma, and nu. Beta will go from 0.3 to 0.6 in increments of 0.1; gamma from 1.0 to 1.3; and nu from 0.1 to 0.4. The model will be run 10 times with different random seeds.

The syntax will go something like this:

grepast mymodel -n10 beta=0.3:0.1:0.6 gamma=1.0:0.1:1.3 nu=0.1:0.1:0.4

A breakdown of the pieces:

  • grepast will be a tool that just calls “gridsweeper repast”, telling the gridsweeper tool that this is a Repast model, so that the parts of the process that need to be handled by the Repast plug-in are handled by it.
  • mymodel says to use mymodel.jar in the current directory. If there’s a shared filesystem (this will be settable in a configuration file or in the GUI), nothing will be transferred over the network except the complete path to the file; if FTP is being used, this file will be staged to the FTP server before running the job, and downloaded by the job on the execution machines.
  • -n10 says to run the model 10 times, with different random seeds, for each combination of parameter values.
  • beta=0.3:0.1:0.6 etc. are the key: you can specify ranges of values with super-simple syntax: [start]:[increment]:[end].
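Parsing the [start]:[increment]:[end] syntax could look something like the sketch below, which also checks the 640-run arithmetic from Scenario 1. The RangeArgSketch class and its methods are hypothetical, not the actual GridSweeper parser, and the error handling is a guess.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RangeArgSketch {
    // Expand "start:increment:end" into the list of values it denotes (end inclusive).
    static List<BigDecimal> expand(String spec) {
        String[] parts = spec.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected start:increment:end, got " + spec);
        }
        BigDecimal start = new BigDecimal(parts[0]);
        BigDecimal step = new BigDecimal(parts[1]);
        BigDecimal end = new BigDecimal(parts[2]);
        if (step.signum() <= 0) {
            throw new IllegalArgumentException("increment must be positive");
        }
        List<BigDecimal> values = new ArrayList<>();
        for (BigDecimal v = start; v.compareTo(end) <= 0; v = v.add(step)) {
            values.add(v);
        }
        return values;
    }

    // Parse arguments of the form name=start:increment:end.
    static Map<String, List<BigDecimal>> parseArgs(String... args) {
        Map<String, List<BigDecimal>> sweeps = new LinkedHashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq < 1) {
                throw new IllegalArgumentException("expected name=spec, got " + arg);
            }
            sweeps.put(arg.substring(0, eq), expand(arg.substring(eq + 1)));
        }
        return sweeps;
    }

    // Total runs = product of the value counts, times the runs per combination.
    static long totalRuns(Map<String, List<BigDecimal>> sweeps, int runsPerCombination) {
        long total = runsPerCombination;
        for (List<BigDecimal> values : sweeps.values()) {
            total *= values.size();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, List<BigDecimal>> sweeps =
                parseArgs("beta=0.3:0.1:0.6", "gamma=1.0:0.1:1.3", "nu=0.1:0.1:0.4");
        System.out.println(sweeps);
        System.out.println(totalRuns(sweeps, 10)); // prints 640
    }
}
```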

Scenario 2: Multiple Parameters, Specified Values, All Combinations

Sometimes you don’t want to specify ranges & increments, but simply particular combinations of values. You’ll be able to specify a vector of values using commas:

grepast mymodel -n10 beta=0.1,0.4,0.7 gamma=1.0,1.4,1.9

Or, if you want, you can mix range/increment lists with specific values:

grepast mymodel -n10 beta=0.1:0.1:0.5,0.7,1.3
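Mixing ranges and single values falls out naturally if each comma-separated item is expanded on its own. A sketch under that assumption (ValueListSketch and expand are made-up names, and the real parser may treat edge cases differently):

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

class ValueListSketch {
    // Expand a comma-separated list whose items are either a single value
    // or a start:increment:end range (end inclusive).
    static List<BigDecimal> expand(String spec) {
        List<BigDecimal> values = new ArrayList<>();
        for (String item : spec.split(",")) {
            String[] parts = item.trim().split(":");
            if (parts.length == 1) {
                values.add(new BigDecimal(parts[0]));
            } else if (parts.length == 3) {
                BigDecimal step = new BigDecimal(parts[1]);
                BigDecimal end = new BigDecimal(parts[2]);
                if (step.signum() <= 0) {
                    throw new IllegalArgumentException("increment must be positive");
                }
                for (BigDecimal v = new BigDecimal(parts[0]); v.compareTo(end) <= 0; v = v.add(step)) {
                    values.add(v);
                }
            } else {
                throw new IllegalArgumentException("cannot parse item: " + item);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(expand("0.1:0.1:0.5,0.7,1.3")); // prints [0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 1.3]
    }
}
```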

Scenario 3: Multiple Parameters, Specific Combinations

Another common need is to run certain combinations of parameters, but not others. For example, beta=0.3/gamma=0.5 and beta=0.4/gamma=0.6, but not beta=0.3/gamma=0.6. This is accomplished by separating parameter names with semicolons (quotes inserted so the shell sees the whole thing as one argument):

grepast mymodel -n10 "beta;gamma = 0.3;0.5, 0.4;0.6"

If you’d rather specify lists of values with all beta values together and all gamma values together, that’s fine too—just remember that commas separate parameter values for a particular parameter; semicolons separate values for different parameters:

grepast mymodel -n10 "beta;gamma = 0.3,0.4; 0.5,0.6"

Extending this a step further, you’ll be able to combine range/increment lists with this syntax:

grepast mymodel -n10 "beta;gamma = 0.3:0.1:0.6; 0.6:0.1:0.9"

is equivalent to

grepast mymodel -n10 "beta;gamma = 0.3;0.6, 0.4;0.7, 0.5;0.8, 0.6;0.9"

and to

grepast mymodel -n10 "beta;gamma = 0.3,0.4,0.5,0.6; 0.6,0.7,0.8,0.9"

This is as much complexity as command-line syntax will support, though. Beyond this, it’s probably time to use a control file anyway (to be covered in a later post).
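To make the equivalence in Scenario 3 concrete, here is a sketch that handles only the form with one value list per parameter (lists separated by semicolons) and pairs the values up in parallel. The form that writes out each combination explicitly ("0.3;0.5, 0.4;0.6") is deliberately not handled and is rejected. All names are hypothetical, and requiring equal-length lists is an assumption.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ParallelArgSketch {
    // Expand a comma-separated list of single values and start:increment:end ranges.
    static List<BigDecimal> expand(String spec) {
        List<BigDecimal> values = new ArrayList<>();
        for (String item : spec.split(",")) {
            String[] parts = item.trim().split(":");
            if (parts.length == 1) {
                values.add(new BigDecimal(parts[0]));
            } else if (parts.length == 3 && new BigDecimal(parts[1]).signum() > 0) {
                BigDecimal step = new BigDecimal(parts[1]);
                BigDecimal end = new BigDecimal(parts[2]);
                for (BigDecimal v = new BigDecimal(parts[0]); v.compareTo(end) <= 0; v = v.add(step)) {
                    values.add(v);
                }
            } else {
                throw new IllegalArgumentException("cannot parse item: " + item);
            }
        }
        return values;
    }

    // Parse "name1;name2 = list1; list2" and pair the i-th values of each list.
    static List<Map<String, BigDecimal>> parse(String arg) {
        int eq = arg.indexOf('=');
        if (eq < 1) {
            throw new IllegalArgumentException("expected names = values, got " + arg);
        }
        String[] names = arg.substring(0, eq).trim().split("\\s*;\\s*");
        String[] specs = arg.substring(eq + 1).trim().split("\\s*;\\s*");
        if (names.length != specs.length) {
            throw new IllegalArgumentException("need exactly one value list per parameter name");
        }
        List<List<BigDecimal>> columns = new ArrayList<>();
        for (String spec : specs) {
            columns.add(expand(spec));
        }
        int length = columns.get(0).size();
        for (List<BigDecimal> column : columns) {
            if (column.size() != length) {
                throw new IllegalArgumentException("value lists must have the same length");
            }
        }
        List<Map<String, BigDecimal>> rows = new ArrayList<>();
        for (int i = 0; i < length; i++) {
            Map<String, BigDecimal> row = new LinkedHashMap<>();
            for (int p = 0; p < names.length; p++) {
                row.put(names[p], columns.get(p).get(i));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(parse("beta;gamma = 0.3:0.1:0.6; 0.6:0.1:0.9"));
        System.out.println(parse("beta;gamma = 0.3,0.4,0.5,0.6; 0.6,0.7,0.8,0.9"));
    }
}
```

Both calls in main produce the same four beta/gamma pairs, which is the equivalence the examples above describe.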

Java bindings working

Tuesday, July 11th, 2006

Thanks to Dan Templeton’s Java bindings in the Sun Grid Engine, and his porting instructions on his blog, I have XgridDRMAA basically working in Java. I’ll still have to look through for some minor implementation differences (such as supported attributes), but basic things, including the DrmaaExample.java code included with SGE, are working.

Besides fixing bugs I found along the way, I had to do a couple additional things to make things work right:

  • Change the class names. This meant changing the package name in the source files, but also meant fixing a couple of lines in the actual code where classes are looked up by name.
  • Find DRMAA Java implementation. The org.ggf.drmaa SessionFactory class uses a couple of methods to try to find a DRMAA implementation: first, it tries System.getProperty to see if a class name has been set; if not, it looks for a setting in the classpath’s META-INF/services/org.ggf.drmaa.SessionFactory file and, if that file is there, uses the class it names. I just added this file to the XgridDRMAA jar file.
  • Find DRMAA library (JNI). The DRMAA Java implementation basically just maps onto the JNI, which is compiled into the XgridDRMAA framework. On Mac OS X, JNI libraries are just Mach-O dylibs containing the right C code. Mac OS X frameworks are also simply Mach-O dylibs wrapped in a nice directory structure. So it’s just a matter of having the magic line of code (System.loadLibrary("drmaa")) find the right library. As it turns out, you have to symlink the XgridDRMAA executable to a file called libdrmaa.jnilib, add the enclosing directory to DYLD_LIBRARY_PATH, and everything works.
  • Fix exit-status analysis. Apparently Darwin does exit-status values differently than whatever the SGE DRMAA code was written for: at first, the example code kept telling me that jobs were finishing “with unclear conditions.” I fixed JobInfoImpl.java to use the same semantics as Darwin’s wait.h file.
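For reference, decoding a wait()-style status word in Java might look like the sketch below. It assumes the traditional BSD layout behind the wait.h macros (low seven bits hold the terminating signal, 0x7f marks a stopped process, bit 0x80 flags a core dump, and the next byte holds the exit code). It is not the actual change made to JobInfoImpl.java, and WaitStatusSketch is a made-up name.

```java
class WaitStatusSketch {
    private static final int STOPPED = 0x7f;   // low bits value that marks a stopped process
    private static final int CORE_FLAG = 0x80; // set when the process dumped core

    private static int lowBits(int status) {
        return status & 0x7f;
    }

    // Analogue of WIFEXITED: the process terminated normally.
    static boolean exited(int status) {
        return lowBits(status) == 0;
    }

    // Analogue of WEXITSTATUS: the exit code, meaningful only if exited(status).
    static int exitStatus(int status) {
        return (status >> 8) & 0xff;
    }

    // Analogue of WIFSIGNALED: the process was killed by a signal.
    static boolean signaled(int status) {
        int low = lowBits(status);
        return low != 0 && low != STOPPED;
    }

    // Analogue of WTERMSIG: the signal number, meaningful only if signaled(status).
    static int termSignal(int status) {
        return lowBits(status);
    }

    // Analogue of WCOREDUMP: whether a core file was produced.
    static boolean coreDumped(int status) {
        return (status & CORE_FLAG) != 0;
    }

    public static void main(String[] args) {
        int status = 3 << 8; // what a process calling exit(3) would report
        System.out.println(exited(status) + " " + exitStatus(status)); // prints true 3
    }
}
```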

For the XgridDRMAA installer, I’ll just have it put symlinks in /Library/Java/Extensions to both XgridDRMAA.jar (so it’s in the classpath) and XgridDRMAA (as libdrmaa.jnilib, so it’s in the library path). That way, the user of Java DRMAA apps won’t have to do any additional work (besides setting things up in the prefpane) to use Xgrid.

GridSweeper design overview

Friday, July 7th, 2006

Although XgridDRMAA has not quite stabilized yet, it’s time to move on to serious work on GridSweeper. (I’ll use it as a test suite for XgridDRMAA—I can run the code using Grid Engine’s DRMAA and XgridDRMAA, and problems in the latter will no doubt emerge.)

I spent a while at my whiteboard scrawling a mind map; here’s a simplified version in more legible form:

Development priorities for this software:

  1. Parameter control. This is sort of the point: converting compact representations of parameter combinations into big long lists of parameter settings to be run.
  2. Plug-in interface. This is how parameter settings get translated into control parameters for specific classes of models (e.g., Repast models, general command-line parameters, etc.).
  3. Grid control. This is the other part of the point: submitting lists of parameter settings to the grid. Very straightforward, thanks to DRMAA.
  4. CLI. I need some way of interacting with the system (aside from writing new main() methods) as early as possible.
  5. Preferences. Good to be able to save settings to shortcut things for both the CLI and the GUI.
  6. File transfer interface. Unfortunately, you can’t count on having a shared filesystem. (In fact, I don’t have a shared filesystem for my “grid” of two Macs.) So you need a way to transfer output files that aren’t stdin/out/err (which is provided for by DRMAA). I think the simplest solution is to just support FTP servers, my previous ramblings about having a custom file-transfer daemon notwithstanding. Most bang for my coding time, thanks to the Jakarta Commons Net FTP library.
  7. GUI. This is the most open-ended component, so I’ll leave it for the end, and it can be as sophisticated or as simple as I have time for.

C bindings complete

Thursday, July 6th, 2006

The Objective-C DRMAA implementation has now been wrapped in C as per the 1.0 DRMAA C binding spec. Far more code than expected, but all very straightforward code.

All that’s left is some real testing of the C layer, filling in a couple holes in the Objective-C code (most notably supporting file transfer via scp from other hosts), and doing the Java bindings, which will consist essentially of code lifted from the Sun Open Source Grid Engine code base.