Job status & control
Today: I implemented the job status and control functions in DRMAA. Pretty straightforward: for status, I map the Xgrid job state to the corresponding DRMAA value (I ran this mapping past the xgrid-users list first). For control, I just grab the XGJob object corresponding to the job and send it the right message.
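To make the status half concrete, here is a minimal sketch in plain C of what a state-mapping function can look like. Everything in it is an assumption for illustration: the enum names are stand-ins rather than the real Xgrid or DRMAA constants, and the particular pairings (for example, treating a canceled job as failed) are my guesses, not the mapping that was actually agreed on the list.

```c
/* Hypothetical stand-in for the states an Xgrid job can report.
   These are NOT the real XgridFoundation constants. */
typedef enum {
    SKETCH_XG_PENDING,
    SKETCH_XG_RUNNING,
    SKETCH_XG_SUSPENDED,
    SKETCH_XG_FINISHED,
    SKETCH_XG_FAILED,
    SKETCH_XG_CANCELED,
    SKETCH_XG_UNKNOWN
} sketch_xgrid_state;

/* Hypothetical stand-in for the DRMAA job states (names only, no real values). */
typedef enum {
    SKETCH_PS_UNDETERMINED,
    SKETCH_PS_QUEUED_ACTIVE,
    SKETCH_PS_RUNNING,
    SKETCH_PS_USER_SUSPENDED,
    SKETCH_PS_DONE,
    SKETCH_PS_FAILED
} sketch_drmaa_state;

/* Map one grid-side state to one DRMAA-side state.
   Anything unrecognized falls through to "undetermined". */
sketch_drmaa_state sketch_map_status(sketch_xgrid_state state)
{
    switch (state) {
    case SKETCH_XG_PENDING:   return SKETCH_PS_QUEUED_ACTIVE;
    case SKETCH_XG_RUNNING:   return SKETCH_PS_RUNNING;
    case SKETCH_XG_SUSPENDED: return SKETCH_PS_USER_SUSPENDED;
    case SKETCH_XG_FINISHED:  return SKETCH_PS_DONE;
    case SKETCH_XG_FAILED:    return SKETCH_PS_FAILED;
    case SKETCH_XG_CANCELED:  return SKETCH_PS_FAILED;   /* assumed pairing */
    case SKETCH_XG_UNKNOWN:   break;
    }
    return SKETCH_PS_UNDETERMINED;
}
```

Keeping the mapping in one switch means a disputed pairing is a one-line change.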
I had to leave out two DRMAA control options: hold and release. Apparently “hold” means “leave this job in the queue, but don’t allow it to run yet”; “release” puts it back into a runnable state. Xgrid doesn’t support either operation. (Similarly, I previously had to ignore the job submission state attribute; on other systems, you can have a job start out in the “hold” state.)
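A rough sketch of the control side, again in plain C with made-up names: the supported actions are forwarded to the job, while hold and release are rejected explicitly instead of being silently ignored. The job struct, the action and result enums, and the choice to return an "unsupported" code are all assumptions on my part; the real code sends messages to an XGJob, and this entry doesn't say what it reports for the two missing actions.

```c
#include <stddef.h>

/* Hypothetical action and result codes; not the real DRMAA constants. */
typedef enum {
    SKETCH_CTL_SUSPEND,
    SKETCH_CTL_RESUME,
    SKETCH_CTL_HOLD,
    SKETCH_CTL_RELEASE,
    SKETCH_CTL_TERMINATE
} sketch_control_action;

typedef enum {
    SKETCH_CTL_OK,
    SKETCH_CTL_UNSUPPORTED,
    SKETCH_CTL_INVALID_ARGUMENT
} sketch_control_result;

/* Toy job record standing in for an XGJob object. */
typedef struct {
    int suspended;
    int terminated;
} sketch_job;

/* Apply one control action to a job, refusing the ones the grid cannot do. */
sketch_control_result sketch_control(sketch_job *job, sketch_control_action action)
{
    if (job == NULL) {
        return SKETCH_CTL_INVALID_ARGUMENT;
    }
    switch (action) {
    case SKETCH_CTL_SUSPEND:
        job->suspended = 1;
        return SKETCH_CTL_OK;
    case SKETCH_CTL_RESUME:
        job->suspended = 0;
        return SKETCH_CTL_OK;
    case SKETCH_CTL_TERMINATE:
        job->terminated = 1;
        return SKETCH_CTL_OK;
    case SKETCH_CTL_HOLD:
    case SKETCH_CTL_RELEASE:
        /* No grid-side equivalent, so leave the job untouched. */
        return SKETCH_CTL_UNSUPPORTED;
    }
    return SKETCH_CTL_INVALID_ARGUMENT;
}
```

Returning a distinct code for hold and release lets a caller tell "this grid can't do that" apart from "that job doesn't exist".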
The one tricky part of today’s code is that jobs don’t show up immediately in the grid’s jobs list, even after the XGActionMonitor for the submission reports success. So I added some code in -[XgridDRMAASession submitXgridJobWithSpecification:] that waits for the job to appear in the grid’s list before returning. That way, any subsequent call regarding that job is guaranteed to find it, as long as it hasn’t been removed from the grid.
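One way to picture the wait is a bounded polling loop, sketched below in C. This is only an illustration under stated assumptions: the function names, the callback type, and the attempt/delay parameters are hypothetical, and the real method may well wait on a notification or key-value observation instead of polling. The cap on attempts is my addition so the sketch cannot spin forever if the job never appears.

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep */
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: returns true once the job is in the grid's list. */
typedef bool (*sketch_job_listed_fn)(const char *job_id, void *context);

/* Poll until the job is listed or max_attempts checks have been made.
   Returns true if the job was seen, false if we gave up. */
bool sketch_wait_for_job(const char *job_id,
                         sketch_job_listed_fn is_listed,
                         void *context,
                         int max_attempts,
                         long delay_ms)
{
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (is_listed(job_id, context)) {
            return true;
        }
        if (attempt + 1 < max_attempts) {
            /* Sleep between checks, but not after the final one. */
            struct timespec pause = { delay_ms / 1000,
                                      (delay_ms % 1000) * 1000000L };
            nanosleep(&pause, NULL);
        }
    }
    return false;
}

/* Demo callback: pretends the job appears after *context more checks. */
bool sketch_listed_after_n_checks(const char *job_id, void *context)
{
    int *remaining = context;
    (void)job_id;
    if (*remaining > 0) {
        (*remaining)--;
        return false;
    }
    return true;
}
```

With the demo callback and a counter of 2, the third check succeeds; with a counter larger than the attempt limit, the function returns false and the submit call can report an error rather than hang.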
Side note: as required by DRMAA, all the methods are thread-safe. So far, this just means putting @synchronized(self) { } around the bodies of all the methods; all the Xgrid stuff is handled on a dedicated Xgrid thread.