Archive for May, 2006

Grid Engine installation, Episode III

Tuesday, May 30th, 2006

Now that the qmaster is set up and NFS is set up, I can finally set the machines up as execution hosts. I’m doing it in parallel on both machines. cd /usr/local/gridengine, sudo -s, and finally ./install_execd, and we’re on our way to another bulleted list describing lots of screens…

  • Welcome: Why, thank you!
  • Checking directory: Looks good: /usr/local/gridengine
  • Cells: Also good: algore
  • Checking hostname resolving: This worked fine out of the box on the qmaster machine, but on the PowerBook the Bonjour name wasn’t getting resolved properly. So I added an entry to /etc/hosts on astor.local: [local IP address] darwin.local. That fixed things.
  • Local spool directory configuration: No local spools.
  • Creating local configuration: Done!
  • execd startup script: Yes! Done!
  • execution daemon startup: Started up!
  • Adding a queue for this host: Done on both. Looks like the 2-processor G5 detected two processors, and the 1-processor PowerBook detected 1. Smarty smarty. But a problem: “unable to resolve host [‘darwin’ | ‘astor’]”…I hope this doesn’t mean everything breaks.
  • The rest… is just information already shown during the other installation. I hope that name-resolution problem doesn’t bite me in the ass.

Well, it looks like everything’s done. Testing…tomorrow. Time to sleep.

Setting up NFS

Tuesday, May 30th, 2006

Turns out you have to have an NFS share for your SGE_ROOT directory. So I set up NFS.

I followed the GUI instructions from this one. In short, you add an /exports entry to NetInfo with settings for the directory you want to export. I couldn’t get the exports to show up right for a long time, but restarting the machine fixed that problem.

To set up an NFS automount on the PowerBook, there’s some more setup to be done, described here. The gist is to set up a NetInfo entry in /mounts for the server.

End result: my /usr/local/gridengine on darwin.local maps to the same directory on astor.local.

Grid Engine installation installation

Monday, May 29th, 2006

Fresh from a nice Memorial Day picnic lunch in Dolores Park, it feels like time to take a nap. But I’m going to install the Grid Engine instead! Here comes the installation part of the installation process.

Getting the Software

I downloaded the Grid Engine 6.0u8 common files and Mac OS X binaries linked from here and unpacked the contents of each into /usr/local/gridengine on both of my machines.

Then I set the $SGE_ROOT environment variables in the system-wide /etc/bashrc file, and added the binary directory to the standard $PATH:


export SGE_ROOT=/usr/local/gridengine
export PATH=$SGE_ROOT/bin/darwin:$PATH

and did a source /etc/bashrc to update my session’s environment variables.

Setting up the Master Host

Making sure I was in the $SGE_ROOT directory and in a sudo -s session, I ran this on good old Astor:

./install_qmaster

I followed through some screens:

  • Admin user: At the first screen, I said OK to use ebaskerv (my user account) as the admin user.
  • root directory: The root directory was right.
  • TCP/IP services: As requested, I added sge_qmaster to my /etc/services file, and in anticipation added one for sge_execd:
    sge_qmaster 781/tcp
    sge_execd 782/tcp
  • Cells: Named my cell algore, as promised.
  • qmaster spool directory: Default is fine: /usr/local/gridengine/algore/spool/qmaster
  • Windows Execution Host Support: Are you going to install Windows Execution Hosts? Are you kidding me? At least, by Judas, the default is no.
  • File permissions: I said no when asked if I had already verified and set file permissions. My guess is these would need fixing. I said yes at the next screen (please verify and set my permissions) and all looked hunky-dory (“Your file permissions were set”).
  • Hostname resolving method: This asks if all my hosts are in one DNS domain. I’m going to cross my fingers and hope that the zeroconf pseudo-domain local. will work, and answer yes.
  • Making directories: This seemed to go fine. (“Mrs. Crabapple and Principal Skinner were in the closet making directories, and I saw one of the directories, and the directory looked at me!”) RETURN!
  • Setup spooling: I chose classic spooling, because I had this suspicion that BerkeleyDB wasn’t ever installed properly on my machine. I’m looking for simplicity, not performance. The spooling database seemed to be initialized properly on the next screen.
  • Group id range: For some strange reason, the Grid Engine needs a range of UNIX group ids to assign dynamically to jobs. I’m pretty sure the example range 20000-20100 is free and large enough, so I’ll use that.
  • Cluster configuration: First up: execd_spool_dir. The default seems fine. Then, administrator email: I gave it my email, but I don’t think email sending is even set up right on my machine, so it probably won’t work.
  • Creating local configuration: This seemed to work…
  • qmaster/scheduler startup script: Apparently, it knows how to set up a startup script. I’ll let it go ahead and try…wow! It put something in /Library/StartupItems! Clever girl.
  • qmaster and scheduler startup: Started up successfully!
  • hosts: This is easy: just two for now. astor.local. and darwin.local., maybe more later. (They misspelled “separated” in “Please enter a blank seperated list of hosts.”) This seemed to go correctly. I said no to a shadow host, partially because I like to live dangerously, and mostly because my grid consists of two computers. Then, the default queue and hostgroup were added with just astor.local. in them. Maybe I have to add darwin.local. manually later.
  • Scheduler tuning: Went with Normal.
  • Using gridengine: Looks like they provide a nice script to set all the environment variables. So I replaced my old bashrc line with:
    . /usr/local/gridengine/algore/common/settings.sh
  • Messages: FYI, messages logged to:
    /tmp/qmaster_messages
    /tmp/execd_messages
    /usr/local/gridengine/algore/spool/qmaster/messages
    [execd_spool_dir]/[hostname]/messages
    and startup scripts are at:
    /usr/local/gridengine/algore/common/sgemaster (qmaster and scheduler)
    /usr/local/gridengine/algore/common/sgeexecd (execd)
  • Almost done: “Your Grid Engine qmaster installation is now completed” says the friendly screen. Now I get to start the execution host installation. Next post.

Grid Engine installation preparation

Monday, May 29th, 2006

Here goes trying to install the open-source Grid Engine 6.0u8 on Tiger. It would be nice if there were a Mac OS X installer package…if I have extra time (ha) maybe I’ll put one together.

I can already see that Xgrid is an infinitely simpler system. Apple wins on ease-of-use already—just based on the instructions in the Plan the Installation section of the Grid Engine manual.

SGE, on the other hand, looks way more powerful. Sophisticated scheduling, intelligent matching of available resources to job needs, etc., etc. I like.

For my own personal use, Xgrid looks great. But I’m going to slog through, because I think I’d better get some hands-on use of the reference implementation of DRMAA before writing my own new implementation.

First, some preliminary notes on how the Grid Engine works…

Definitions

master host
Runs master daemon and scheduler daemon—basically, controls the system. Equivalent to the Xgrid controller. By default, also an administration host and submit host.
shadow master host
A system that can detect a failure of master and take over. Despite my mission-critical enterprise-grade infrastructure, I won’t bother dealing with these.
execution host
Systems that execute jobs. Equivalent to an Xgrid agent.
administration host
Systems that carry out any “administrative activity.” I guess this means editing jobs, adjusting controller settings, etc.?
submit host
Systems that allow users to submit batch jobs. Like an Xgrid client.
queue
Container for jobs that can run on one or more hosts concurrently. Sort of a sub-grid. Can include any subset of hosts on the system.

Daemons

sge_qmaster
The master daemon. Handles all controller activity except scheduling decisions.
sge_schedd
The scheduling daemon: decides where to send jobs and how to order and prioritize them.
sge_execd
Execution daemon—actually runs jobs. Runs on execution hosts.

With this background, I can actually start thinking about how the hell to set up my own system! Here are the decisions I made for my giant 2-host grid:

Decisions

  • Single cluster: My system will be a single cluster, rather than a collection of sub-clusters. My system consists, at last count, of my personal machines: a G5 and a four-year-old PowerBook. I’ll try to convince my roommates to let me use their machines too. At least they’re all connected via InfiniBand! Ha, just kidding.
  • Hosts: The G5 will be everything: master, administration, submit, and execution. The PowerBook will be everything except a master.
  • Users: “Ensure that all users of the grid engine system have the same user names on all submit and execution hosts.” This isn’t a decision! It’s an order!
  • Software Directories: I guess I’ll put a full directory tree on both machines so I don’t have to think about what to install and what not to install.
  • Queue Structure: One grid, one cluster, one queue; will include all (2) execution hosts. Easy peasy.
  • Network Services: I have no idea what an NIS file is (Solaris thing?), so I guess that means I’ll set things up as “local to each workstation in /etc/services”.
  • Gathering Information: Another command: “Use the information in this chapter to gather the information necessary to complete the installation worksheet.” Decisions my ass.

I guess I’ll fill out their silly little worksheet. It looks like it might be useful…

Necessary Information

Parameter | Value
sge-root directory | /usr/local/gridengine
cell name | George W. Bush! My hero! Er, no, I’ll call it Al Gore.
administrative user | ebaskerv (c’est moi)
sge_qmaster port number | Uh…we’ll see what they use in the default file.
sge_execd port number | Ditto.
master host | astor.local., G5 of my heart
shadow master hosts | Nada.
execution hosts | astor.local. darwin.local.
administration hosts | astor.local. darwin.local.
submit hosts | astor.local. darwin.local.
group ID range for jobs | I have no freaking clue. With one grid, probably doesn’t matter.
spooling mechanism | Classic spooling sounds easier than messing with Berkeley DB.
Berkeley DB server host | NA
Berkeley DB spooling directory | NA
scheduler tuning profile | “Normal” sounds good to me.
installation method | automated?
If you are going to install N1GE 6 on a Windows system, acquire and install Microsoft Services for UNIX. See Appendix A for more information. | What is this Windows you speak of?
If you are going to install N1GE 6 on a Windows system, create the required CSP certificates before installing N1GE. See the section called “How to Install a CSP-secured System” in Chapter 4 for information about CSP certificates. | I see, it must be an operating system for people who want things to be even more complicated.
Check the Other Installations Appendix for applicability. | Aigoo!

This post is getting very long. Ah well, I press on.

Aw, fuck, I just noticed they have a guide to all of those table entries. Let’s see if that changes anything…well, they use 536 and 537 as ports in their example. Maybe those are free. And perhaps interactive installation will be better.

Well, it looks like it’s time to start installing. I’ll cover that in the next post.

GridSweeper preliminaries

Monday, May 29th, 2006

Today I begin work on GridSweeper. (Which, I learned through Google, shares a name with what looks like a MineSweeper clone, detailed about halfway down this page. I’m not particularly worried about confusion.)

I’m not going to write any code of significance this week: rather, I’m going to get really familiar with Sun’s Grid Engine system, DRMAA, and test out doing manual runs of Repast, etc. with Xgrid and the Grid Engine, just to see what it will take.

I was worried about how the Grid Engine DRMAA implementation worked in Java—at first glance, I saw classes in org.ggf.drmaa and worried that Sun’s DRMAA packages were implemented inside that package. In fact, it’s nicely separated: the DRMAA interface is in org.ggf.drmaa, and com.sun.grid.drmaa contains the Sun implementation. So I can just put an Xgrid implementation in com.edbaskerville.xgrid-drmaa or something like that—and you’ll be able to select between the two grid systems *at runtime*!
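Here is a rough sketch of that interface/implementation split. It is not the DRMAA API: GridSession, SgeSession, XgridSession, and GridSessions are made-up names standing in for the interface package and its two implementations. The only point is that the caller codes against the interface, and the implementation class is picked by name at runtime.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-in for the interface layer (the role org.ggf.drmaa plays).
interface GridSession {
    String submit(String command);
}

// Hypothetical stand-in for a Grid Engine-backed implementation.
class SgeSession implements GridSession {
    public String submit(String command) {
        return "sge:" + command;
    }
}

// Hypothetical stand-in for an Xgrid-backed implementation.
class XgridSession implements GridSession {
    public String submit(String command) {
        return "xgrid:" + command;
    }
}

final class GridSessions {
    private GridSessions() {}

    // Loads the implementation by class name, so the choice can come from
    // a config file or a command-line flag instead of being compiled in.
    static GridSession create(String implementationClassName) {
        try {
            Constructor<? extends GridSession> constructor =
                Class.forName(implementationClassName)
                     .asSubclass(GridSession.class)
                     .getDeclaredConstructor();
            constructor.setAccessible(true);
            return constructor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                "Cannot load grid implementation: " + implementationClassName, e);
        }
    }
}
```

Passing SgeSession.class.getName() or XgridSession.class.getName() to GridSessions.create gives back the same interface with a different system behind it, which is the property I want from the real packages.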

To elaborate on the deliverables listed in the proposal, these are the pieces I plan to build:

  • GridSweeper: The actual project. See the proposal.
  • Xgrid DRMAA implementation (C): This will be the meat of my DRMAA work. It will wrap the Objective-C XgridFoundation library in the standard DRMAA C interface, all packaged up in XgridDRMAA.framework.
  • Xgrid DRMAA implementation (Java): Mirroring the SGE Java implementation, this will just be a JNI wrapper for the Xgrid DRMAA implementation in C. Packaged in com.edbaskerville.xgrid-drmaa, class files included in XgridDRMAA.framework.
  • Objective-C DRMAA interface: If I have time, an Objective-C wrapper for the C API. (Not useful for GridSweeper, but just a nice thing to have, and not very much work!) I’ll propose this to the drmaa-wg as a standard interface. Yes, that’s right: this will be an Objective-C wrapper for the DRMAA C interface layer to the Objective-C XgridFoundation API. But I’ll also make this play nice with SGE: in short, mirror the structure of the Java APIs, allowing you to select from different implementations at runtime. This will be included in XgridDRMAA.framework.

The goal with the Xgrid DRMAA stuff is to have Apple roll it into Mac OS X someday, replacing my C wrapper to their Objective-C API with something that connects directly to the Xgrid internals. It would be nice for all this stuff to be in XgridFoundation someday.

MIDI reimplemented

Monday, May 29th, 2006

Last night I reimplemented MIDI support in LilyPad. It turned out to be really easy to use the AudioToolbox framework’s MusicSequence and MusicPlayer opaque C data types and functions. One bug to be fixed: if a new preview happens while you’re playing, the old sequence continues to play. This can be comical, but not useful.

Parameters re-thunk with annotations

Saturday, May 27th, 2006

I currently have code that automatically applies parameter values from a hash map to a model by reflectively looking up the appropriate setter methods.

I mentioned that I’d rewrite this to use the reflection API to directly set fields. In fact, it would be better to fully implement the Key-Value Coding hierarchy of Cocoa: first look for a setter method, and then directly set the field.
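A minimal sketch of that lookup order, in the spirit of Key-Value Coding rather than a port of it. KeyValueSetter and setValue are hypothetical names, and the sketch only handles the two steps named above: a public one-argument setter first, then the field itself.

```java
import java.lang.reflect.Field;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical helper: sets "key" on target via a setter if one exists,
// otherwise by writing the field directly.
final class KeyValueSetter {
    private KeyValueSetter() {}

    static void setValue(Object target, String key, Object value) {
        String setterName =
            "set" + Character.toUpperCase(key.charAt(0)) + key.substring(1);

        // Step 1: look for a public setter taking exactly one argument.
        for (Method method : target.getClass().getMethods()) {
            if (method.getName().equals(setterName)
                    && method.getParameterTypes().length == 1) {
                try {
                    method.setAccessible(true);
                    method.invoke(target, value);
                    return;
                } catch (IllegalAccessException | InvocationTargetException e) {
                    throw new IllegalStateException(e);
                }
            }
        }

        // Step 2: no setter, so walk up the class hierarchy looking for the field.
        for (Class<?> c = target.getClass(); c != null; c = c.getSuperclass()) {
            try {
                Field field = c.getDeclaredField(key);
                field.setAccessible(true);
                field.set(target, value);
                return;
            } catch (NoSuchFieldException e) {
                // Not declared here; try the superclass.
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        throw new IllegalArgumentException("No setter or field for key: " + key);
    }
}

// Hypothetical model class used only to exercise the helper.
class Predator {
    private double attackRate;
    private int litterSize;
    private int setterCalls;

    public void setAttackRate(double attackRate) {
        this.attackRate = attackRate;
        setterCalls++;
    }

    double getAttackRate() { return attackRate; }
    int getLitterSize() { return litterSize; }
    int getSetterCalls() { return setterCalls; }
}
```

With this, setValue(predator, "attackRate", 0.5) goes through setAttackRate, while setValue(predator, "litterSize", 4) has no setter to call and writes the private field.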

Directly setting the field, of course, breaks data hiding. This feature is meant for debuggers, not for applications. So, rather than just letting the standard parameter-setting capabilities set any field, there will be an annotation, @Parameter, to specify that a field should be accessible as a parameter to be set during a model’s MyModel(Map<String, Object> parameters) constructor.

There will be no setParameter method for use at runtime, because the only time parameters should be set is during object creation. Any class can choose to bypass this restriction by providing setter methods or declaring a parameter public.

What about reading parameters at runtime? Just use the standard methods. A model will be able to access its own fields as usual. If the model wants other objects (such as its children) to have access, it should declare an accessor method.

I was hoping it would be possible to modify final fields using reflection in the constructor, but this is not the case (at least not with JDK 5.0; apparently it was possible in 1.3.0). That would let you do things like declaring parameters with @Parameter public final double beta = 0.3;, where the default value of a parameter is set in its declaration, AND it’s publicly accessible (for speed and convenience) without being modifiable, AND it gets automatically set in the constructor. I guess I can live with @Parameter public double beta = 0.3; (dangerous) or @Parameter private double beta = 0.3; + an accessor (safe).

In short, parameters will work like this:

  • Parameter fields are declared with the @Parameter annotation.
  • When a subclass constructor calls the Model parameter-setting constructor, all parameters present in the parameter map get set automatically.
  • Normal access modifiers will apply at runtime.
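The scheme can be sketched roughly as follows. One wrinkle to flag: Java runs a subclass’s field initializers after the superclass constructor returns, so a default written in the declaration would overwrite anything a Model constructor had just set by reflection. The sketch therefore assumes a helper, applyParameters (a hypothetical name), that the subclass constructor calls itself. SirModel and its fields are also made up for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;

// Marks a field as settable from the parameter map during construction.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Parameter {}

abstract class Model {
    // Hypothetical helper: sets every @Parameter field that has an entry
    // in the map and leaves everything else alone.
    protected final void applyParameters(Map<String, Object> parameters) {
        for (Class<?> c = getClass(); c != null && c != Model.class; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                if (!field.isAnnotationPresent(Parameter.class)) continue;
                if (!parameters.containsKey(field.getName())) continue;
                field.setAccessible(true);
                try {
                    field.set(this, parameters.get(field.getName()));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
    }
}

// Hypothetical model with two parameters and one ordinary field.
class SirModel extends Model {
    @Parameter private double beta = 0.3;          // default, overridable
    @Parameter private int populationSize = 100;   // default, overridable
    private int step = 0;                          // not a parameter

    SirModel(Map<String, Object> parameters) {
        // Field initializers above have already run by this point.
        applyParameters(parameters);
    }

    double getBeta() { return beta; }
    int getPopulationSize() { return populationSize; }
    int getStep() { return step; }
}
```

A map containing beta = 0.7 and step = 5 changes beta, keeps the default populationSize, and ignores step, because step carries no @Parameter annotation.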

GridSweeper proposal accepted!

Wednesday, May 24th, 2006

I just checked my Google Summer of Code status page for the four hundred millionth time in the last few weeks. Lo and behold: Application accepted!

More details to come. Let the coding begin!

LilyPad today

Monday, May 22nd, 2006

I made a few fixes in LilyPad today to take care of two bugs:

ID 9

Regarding auto-preview, Henning Ramm had written:

It’s really a nice feature, but doesn’t yet behave “just right”. I’d like to start it with every “save” (or after the delay) and then run through (but stoppable), regardless what I type in the meantime. For it’s not possible to just preview the part I work at, that would show my progress better than if I must stop typing to see anything. Perhaps you can make that behaviour configurable, too.

Done. I didn’t bother to make it configurable—if somebody complains about the default behavior, I can do that.

ID 8

Also a request from Henning:

- increase/decrease indent of marked lines using tab/shift-tab (the actual keys don’t work on every int’l keyboard)

Also done. I was previously of the opinion that doing this would violate standard OS X behavior. But this is how Eclipse does it, and I like it. So I’m going with it.

Events in EcoGillespie

Saturday, May 20th, 2006

Today I figured out an interface for having agents (individuals (models)) specify events in EcoGillespie. Each possible event will be, surprise surprise, an object.

I considered a few ways of structuring this object. The following were rejected for fairly obvious reasons:

  • Specify events as Method objects: Seemed like the right approach at first, because then you can just directly implement events as methods in the class. But it’s ultimately a bad idea. Doing indirect method invocation using Java reflection is way, way slower than direct method calls. Event methods are going to be called all the time. Furthermore, I’d have to wrap Method in an Event object anyway, since you might want to specify arguments to the methods.
  • Use a string or int or enum: The idea here is to just specify different events using some identifier that gets parsed out by a standard method implemented by the model. It’s basically indirect method calling, but faster, and uglier. Still indirect, though.

The structure I adopted is to represent events using inner classes. Each event is an instance of a subclass of Model.Event, which the model implementor will define as, e.g., MyModel.MyEvent. The event includes a rate (for use in the Gillespie algorithm) and one required method, perform(), which actually performs the event. Because Event is an inner class of Model, it can access members and methods of Model directly. So with just a few extra curly braces and so on, you get all the advantages of passing Method objects without the slowdown.

A modeler could decide to have just one Event subclass, and add member data to store the specifics of the event. Or if there are a few distinct types of events, each of which has very different behavior, there could be separate subclasses for each type, with different implementations of perform(). It doesn’t matter—all the framework cares about is the presence of the rate value and the perform() method.
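To make the shape concrete, here is a minimal sketch under some assumptions. Model.Event, the rate, and perform() come from the description above; possibleEvents(), BirthDeath, and GillespieStepper are hypothetical names, and the stepping loop is the textbook Gillespie direct method rather than EcoGillespie’s actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

abstract class Model {
    // Inner (non-static) class: each Event belongs to one Model instance
    // and can touch that model's members directly.
    abstract class Event {
        final double rate;

        Event(double rate) { this.rate = rate; }

        abstract void perform();
    }

    // Hypothetical hook: the events that can happen in the current state.
    abstract List<Event> possibleEvents();
}

// Hypothetical example model: per-capita birth and death in one population.
class BirthDeath extends Model {
    private int population;
    private final double birthRate;
    private final double deathRate;

    BirthDeath(int population, double birthRate, double deathRate) {
        this.population = population;
        this.birthRate = birthRate;
        this.deathRate = deathRate;
    }

    class Birth extends Event {
        Birth(double rate) { super(rate); }
        void perform() { population++; }   // direct access to the model's field
    }

    class Death extends Event {
        Death(double rate) { super(rate); }
        void perform() { population--; }
    }

    List<Event> possibleEvents() {
        List<Event> events = new ArrayList<>();
        events.add(new Birth(birthRate * population));
        events.add(new Death(deathRate * population));
        return events;
    }

    int getPopulation() { return population; }
}

// Hypothetical driver: one Gillespie step picks an event with probability
// proportional to its rate and advances time by an exponential waiting time.
final class GillespieStepper {
    private final Random random;
    private double time = 0.0;

    GillespieStepper(Random random) { this.random = random; }

    double getTime() { return time; }

    // Returns false when no event has a positive rate.
    boolean step(Model model) {
        List<Model.Event> events = model.possibleEvents();
        double totalRate = 0.0;
        for (Model.Event event : events) {
            if (event.rate > 0.0) totalRate += event.rate;
        }
        if (totalRate <= 0.0) return false;

        time += -Math.log(1.0 - random.nextDouble()) / totalRate;

        double threshold = random.nextDouble() * totalRate;
        double cumulative = 0.0;
        Model.Event chosen = null;
        for (Model.Event event : events) {
            if (event.rate <= 0.0) continue;
            chosen = event;                 // also guards against rounding at the end
            cumulative += event.rate;
            if (threshold < cumulative) break;
        }
        chosen.perform();                   // plain virtual call, no reflection
        return true;
    }
}
```

The stepper only ever reads rate and calls perform(), so whether a model uses one Event subclass or several makes no difference to it.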

Pretty slick, I think.