Implementation notes
The last post was heavy on the theory of the Gillespie algorithm and not so much on the details of the software. So here’s what I’ve come up with so far.
Class hierarchy
The root class for all objects created in EcoGillespie is Model. In a network model, Node is a subclass of Model. Network, in addition to containing a bunch of Node objects, is itself a subclass of Node, so Networks can themselves be linked together in a larger Network. In the disease model I’m implementing as a first example, and to help guide the design of the system, the top-level model object is a DiseaseNetwork, and it contains a bunch of DiseaseNode objects.
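Here’s roughly what that hierarchy might look like, assuming Java. This is only a sketch, not the actual EcoGillespie source: the class names come from the description above, but addNode() and getNodes() are hypothetical names I’m using for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Root class for everything created in the framework.
class Model {
}

// A single node in a network model.
class Node extends Model {
}

// A Network holds Nodes and is itself a Node, so networks can be nested.
class Network extends Node {
    private final List<Node> nodes = new ArrayList<>();

    // Hypothetical method name: adds a node (or a whole sub-network).
    void addNode(Node node) {
        nodes.add(node);
    }

    // Hypothetical method name: returns the nodes directly contained here.
    List<Node> getNodes() {
        return nodes;
    }
}

// Model-specific subclasses for the disease example.
class DiseaseNode extends Node {
}

class DiseaseNetwork extends Network {
}
```

Because Network extends Node, calling addNode() with another Network compiles just as happily as calling it with a DiseaseNode, which is the whole point of the arrangement.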

Protecting the modeler from the simulation kernel
A key design goal of this system is to ensure that all the things that exist in pretty much every model get taken care of in the simulation kernel and don’t have to be reinvented. Plus, there should be a simple set of conventions that ensure that the modeler doesn’t accidentally mess up this separation. A couple of examples:
- The actual execution of the Gillespie algorithm will not be seen anywhere in a model’s code. Everything from calculating the next time to selecting the event to be executed will be handled by the simulation engine. A node will simply get called on to perform an event when appropriate.
- Random number generation: the model will never create random number generators itself, but will simply call a standard method, getRNG(), to get the random number generator it should use. If I figure out a good way to parallelize this onto multiple threads, there will probably be a different RNG object per thread, and getRNG() will return the correct one. But the modeler shouldn’t have to know that.
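To make the separation concrete, here’s a minimal sketch of how a kernel could hide the Gillespie step from model code, assuming Java and the standard direct method (exponential waiting time, then an event chosen with probability proportional to its rate). Apart from getRNG(), every name here (EventNode, SimulationKernel, getEventRates(), performEvent(), DecayNode) is hypothetical and single-threaded; it isn’t the real engine.

```java
import java.util.Random;

// Hypothetical stand-in for a node: the only things model code supplies.
abstract class EventNode {
    abstract double[] getEventRates();          // current rate of each possible event
    abstract void performEvent(int eventIndex); // called by the kernel, never by the model
}

// Hypothetical kernel: owns the clock, the RNG, and the Gillespie logic.
class SimulationKernel {
    private final Random rng;
    private double time = 0.0;

    SimulationKernel(long seed) {
        rng = new Random(seed);
    }

    Random getRNG() {
        return rng;
    }

    double getTime() {
        return time;
    }

    // One step of the direct method. Returns false when no event can occur.
    boolean step(EventNode node) {
        double[] rates = node.getEventRates();
        double total = 0.0;
        for (double rate : rates) {
            total += rate;
        }
        if (total <= 0.0) {
            return false;
        }

        // Waiting time is exponential with rate `total`;
        // 1 - u lies in (0, 1], so the logarithm is always finite.
        time += -Math.log(1.0 - getRNG().nextDouble()) / total;

        // Choose an event with probability proportional to its rate.
        double target = getRNG().nextDouble() * total;
        double cumulative = 0.0;
        int chosen = rates.length - 1; // defensive fallback against rounding
        for (int i = 0; i < rates.length; i++) {
            cumulative += rates[i];
            if (target < cumulative) {
                chosen = i;
                break;
            }
        }
        node.performEvent(chosen);
        return true;
    }
}

// Toy model: `count` individuals, each disappearing at a fixed per-capita rate.
class DecayNode extends EventNode {
    private int count;
    private final double ratePerIndividual;

    DecayNode(int count, double ratePerIndividual) {
        this.count = count;
        this.ratePerIndividual = ratePerIndividual;
    }

    int getCount() {
        return count;
    }

    @Override
    double[] getEventRates() {
        return new double[] { ratePerIndividual * count };
    }

    @Override
    void performEvent(int eventIndex) {
        count--; // the only event is "one individual disappears"
    }
}
```

Note that DecayNode contains no timing logic and never touches a random number generator; it just reports rates and reacts when told to.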
Convenience methods
There are a lot of things that are done in most models that aren’t necessarily obvious things to implement, because they’re so simple. But I will implement as many of these as I can directly in the library as they come up, because any place you can shift complexity from the application to the framework is a good thing for code reliability. Super-simple example: say you want something to happen with 40% probability. The standard way is to get a random double between 0.0 and 1.0. If it’s less than 0.4, do it; otherwise, don’t. Because it’s so common, it makes sense to have a method in the RNG library: boolean trueWithProbability(double probability).
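A minimal sketch of that convenience method, assuming it wraps java.util.Random. The method signature is the one given above; the wrapper class name ModelRandom and the argument check are my own assumptions for illustration.

```java
import java.util.Random;

// Hypothetical wrapper class; only trueWithProbability() is named in the text.
class ModelRandom {
    private final Random random;

    ModelRandom(long seed) {
        random = new Random(seed);
    }

    // Returns true with the given probability, which must lie in [0, 1].
    boolean trueWithProbability(double probability) {
        if (Double.isNaN(probability) || probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be in [0, 1]: " + probability);
        }
        // nextDouble() is uniform on [0, 1), so 0.0 never fires and 1.0 always fires.
        return random.nextDouble() < probability;
    }
}
```

Model code then reads as `if (rng.trueWithProbability(0.4)) { ... }`, and the strict-versus-non-strict comparison only has to be gotten right once.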
Use classes, not interfaces
Swarm, Repast, and the Java libraries themselves make heavy use of interfaces (protocols in Objective-C) to define how objects interact with the system. This is all well and good from a theoretical standpoint, because it separates the interface completely from the implementation, and makes it possible for the same object, by implementing several different interfaces, to be both a random number generator and an agent. In practice, however, I find that, when building a framework, you build it with the idea of having users of the framework use a lot of functionality you’re providing: not just the interfaces, but the implementations too! By having both an interface and a standard base class implementation, you’re just unnecessarily cluttering the library and confusing the user.
To see this more concretely, compare the complex forest of Java with the elegant simplicity of Cocoa. In my experience, the latter is far more intuitive and easy to work with, although that’s admittedly colored by the fact that I simply know it better.
If I do all this and find that (a) people actually use this software and (b) they provide good reasons for wanting interfaces, a refactoring can take place at that point. But I don’t anticipate it.