GridSweeper Proposal for Google Summer of Code

If you haven’t heard of it, Summer of Code is a program started last year where Google pays students to work on open-source software projects. Last year, being out of school, I wasn’t eligible, but, now that I’m going back, I qualify.

I talked with Rick Riolo about getting UM CSCS signed up as a mentoring organization, so I might have the opportunity to work on some software relevant to the complex systems/agent-based modeling community. The most concrete, straightforward, and useful project we came up with was a general parameter sweep system that integrates with modern grid systems (Sun Grid Engine, Apple Xgrid, etc.).

Google denied the request for CSCS to be a mentoring organization (they used a rather odd first-come, first-served application process, and were out of spots), but it appears that I can still apply with Google itself as the mentoring organization and tell them I’d like someone else (Rick) to be the mentor.

Proposals from students can be submitted starting on May 1, so I have some time to think things through. Furthermore, I’d love any help from potential users out there in the ABM community.

I’ve written a draft outline of my proposal. I’d love your feedback on any and all of the following:

  • would you use this?
  • necessary features
  • existing solutions and their successes/shortcomings
  • comments on the proposal content/organization

Make hearty use of the Comments section!

7 Responses to “GridSweeper Proposal for Google Summer of Code”

  1. Andreas Haas Says:

    Marvelous proposal!

  2. tim Says:

    would be great if they fund you to do this. of value to lots of us (which you should mention in applications): genomics and structural equation modeling spring to mind as users of slow algorithms without an easy current route to parallelization.

    Should define “ABM” in the intro section at its first mention “agent-based models (ABM)”.

    “here I propose one to be funded by Google Summer of Code”: what is “one”? (I know and you know, but maybe say “This Google Summer of Code proposal is to create an integrated ABM system able to run on standard grid systems” or whatever it is you really mean.)

    Could the system be more generic than ABM? What about all the other algorithms and statistics (maximum likelihood, optimisers etc.) that “run many times with different parameter settings”?

  3. Ed Says:

    Hi Tim,

    Thanks for the comment! This can certainly be made more general. However, I’m not familiar with all these different kinds of models and algorithms, so I’m not exactly sure what will be needed. Would it be enough for the system to be able to control arbitrary command-line parameter syntax? Does it need an API so that somebody can write an adapter to generate parameter configuration files?

    Whatever it takes, I’ll at least make it possible to extend the system to support other kinds of models, and reword the proposal in more general terms.

    I’ll define all terms—this is just an outline, not an actual proposal, so rest assured that any fragments with ambiguous antecedents will become real sentences. :)

    –Ed

  4. Ed Says:

    Some comments via email from Bill Rand, with responses…

    Bill: I was wondering if maybe it would help to at least talk about how the tool could be expanded beyond just agent-based modeling to computational models in general? Like mathematic models, matlab models, stella models, etc. These are all things where you need to throw large numbers of parameters to the models. However, the trade-off is that it may make the tool useless since you have all sorts of parameter settings and a user would be better off with a generic grid tool.

    Me: I got similar feedback from someone else. I think I’ll make the proposal more general, but not write something that specifically knows about all these different systems. Perhaps: make it general, but provide an API for adding adapters for specific systems?

    Bill: I am a little confused by your statement that you may need to implement the “NetLogo controlling interface”. NetLogo can be run from the command line in a “headless” option so I’m not sure why it would be any more difficult than RePast or Swarm to control.

    Me: Good to know: I saw Java API documentation for a “controlling interface” and assumed that was the only way.

    Bill: I think the API strategy for adding adapters is a good way of making the system more flexible and gives you the ability to propose this as a more general purpose tool. For more info on running NetLogo from the command line look in the BehaviorSpace section of the documentation under “Advanced usage”.

  5. James Says:

    What is a “general parameter sweep system”?

  6. Ed Says:

    Hi James,

    Many computer models have a number of parameters that can be changed, and often a part of analyzing the model is looking at how system characteristics change with the parameters. For example, say I build a model of disease transmission on a network with a parameter, beta, that determines how likely it is for an infected node to infect a neighboring node. I’d want to run this model a bunch of times with different settings for beta and (if this is a stochastic model) with different seeds for the random number generator.

    The idea here is to automate distributing a bunch of runs with different parameter settings over a grid. So you can just tell the software, run my disease model with beta from 0.1 to 0.9 in increments of 0.1, and run it 10 times for each parameter setting with a different random seed, and the software will ship it off to whatever grid system you’re using.
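
    As a rough illustration of that example, here is a minimal Python sketch of how the runs might be enumerated. It is only a sketch under stated assumptions, not the actual design: the function names, the executable name `./disease_model`, and the `--beta`/`--seed` flags are all hypothetical, and submitting each job to a grid system is left out.

```python
def sweep(beta_start, beta_stop, beta_step, runs_per_setting, base_seed=0):
    """Return one {"beta", "seed"} job per run in the sweep (hypothetical helper)."""
    # Count the settings with integers so floating-point error does not accumulate.
    n_settings = int(round((beta_stop - beta_start) / beta_step)) + 1
    betas = [round(beta_start + i * beta_step, 10) for i in range(n_settings)]

    jobs = []
    for setting_index, beta in enumerate(betas):
        for run in range(runs_per_setting):
            # Give every run its own seed so that no two runs repeat each other.
            seed = base_seed + setting_index * runs_per_setting + run
            jobs.append({"beta": beta, "seed": seed})
    return jobs


def command_line(job, executable="./disease_model"):
    """Build an argument list for one job; the flag names are assumptions."""
    return [executable, "--beta", str(job["beta"]), "--seed", str(job["seed"])]


# beta from 0.1 to 0.9 in increments of 0.1, 10 runs per setting
jobs = sweep(0.1, 0.9, 0.1, 10)
commands = [command_line(job) for job in jobs]
```

    Each entry in `commands` is one independent run, which is what makes the sweep easy to spread across grid nodes: the grid only needs to execute each command somewhere and collect the output.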

    –Ed

  7. Darren Schreiber Says:

    This should be quite useful and I think there is a substantial demand for it in the agent-based modeling world. The methods that I have used to do parameter sweeps in the past seemed slow and inefficient. And, they certainly did not leverage the available computational power lying around. The advantage of ABM tools like RePast, Swarm, MASON, etc. for me is that they present solutions to the technical problems, which allows me to focus on the substantive problems. Having a tool for doing easy parameter sweeps helps to get at some of the validation issues that remain the critical vulnerability of ABMs in my field.

    Best of luck!
