We have a compute cluster at work that uses IBM Platform Computing's LSF to schedule jobs. It works well on the whole, but given the kind of jobs we mostly run (hundreds to tens of thousands of very similar simulations with differing input conditions), it would work better if LSF were based around running arrays of jobs rather than the single-job submission we have now.
LSF (and others such as the Sun Grid Engine) does have its Job Array feature, which lets me submit an "array" of 25K jobs to a queue in one go, but internally they are administered as 25K single jobs. This leads to problems of scale: if I want to change the parameters of the job array I can use the bmod command on all 25K elements of the array at once, but doing so can slow the whole LSF system down, with others left waiting to submit jobs while LSF internally makes identical changes to 25K separate jobs. If I have five million jobs to run in 300 job arrays over a week, I can't submit them all at once or LSF has a heart attack!
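For anyone who has not met job arrays, the submission side looks roughly like this. This is a minimal sketch driven from Python, with a placeholder run_sim.sh script and a queue name "normal" made up for the example; the array form is one bsub call, and each element finds its index in the LSB_JOBINDEX environment variable at run time.

# Sketch of the two submission styles, driven from Python via subprocess.
# run_sim.sh, the queue name and the array size are placeholders.
import subprocess

N = 25000

# One job array: a single bsub call, one entry for the user to manage.
# run_sim.sh would read $LSB_JOBINDEX itself to pick its input conditions.
subprocess.run(
    ["bsub", "-q", "normal", "-J", f"sims[1-{N}]", "./run_sim.sh"],
    check=True,
)

# The equivalent flood of single-job submissions, which is roughly what
# the scheduler ends up administering internally even for the array above.
for i in range(1, N + 1):
    subprocess.run(
        ["bsub", "-q", "normal", "-J", f"sim_{i}", "./run_sim.sh", str(i)],
        check=True,
    )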
I think it's time that job schedulers were written to work only on job arrays: a single job should be treated as a special-cased one-element array, with extra support from the LSF commands, but internally arrays should be king, allowing better use of multi-core compute clusters. After all, with internally developed tools, open-source tools, and all-you-can-eat licensed proprietary software, together with self-checking / automatic triage of simulations, you can think differently about what it is possible to achieve on your company's compute farm, and at the moment poor implementations of job arrays get in the way. A sketch of what I mean follows.
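To make that concrete, here is a purely hypothetical sketch of an array-first submission interface; nothing like this exists in LSF today, and the JobArray class and the submit/modify functions are invented names, only there to show a single job being the degenerate one-element case.

# Hypothetical array-first scheduler interface; these names are invented
# for illustration, not a real LSF (or any other scheduler) API.
from dataclasses import dataclass, field

@dataclass
class JobArray:
    command: str                # command template, run once per element
    size: int                   # number of elements in the array
    resources: dict = field(default_factory=dict)

    @classmethod
    def single(cls, command, **resources):
        # A "single job" is just the special case of a one-element array.
        return cls(command=command, size=1, resources=resources)

def submit(array: JobArray) -> int:
    # The scheduler records ONE object per array, however many elements
    # it has, so queueing and modification cost does not scale with the
    # element count.
    print(f"submitted array of {array.size} element(s): {array.command}")
    return 42  # pretend array ID

def modify(array_id: int, **new_resources):
    # One metadata update covers every element of the array.
    print(f"array {array_id} updated with {new_resources}")

# 25K simulations held as one object, then one cheap modification.
aid = submit(JobArray("./run_sim.sh", size=25000))
modify(aid, mem_gb=8)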
And whilst I am at it: if EDA companies want to create their own regression runners, then they should support job arrays too. And I mean proper support that actually works when tried, rather than being in the manual but not actually working (I won't name anyone/anytwo)!
I have sometimes used SCOOP (http://code.google.com/p/scoop/) with Grid Engine, and it works fine for bundling Python jobs (like map over an array, or more fancy setups), except that I have had problems integrating it with qdel. It used to work back when they used MPI as the backend; now, with ZeroMQ, the "cleanup workers" functionality no longer seems to work. Otherwise, an excellent tool.
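For readers who have not tried SCOOP, the "map over an array" usage mentioned above looks roughly like this; run_case and the case range are placeholders for the example, and the script is assumed to be launched with "python -m scoop script.py" so SCOOP can start its workers (for example under Grid Engine).

# Minimal SCOOP sketch: map a simulation function over an array of cases.
from scoop import futures

def run_case(case_id):
    # Stand-in for one simulation with its own input conditions.
    return case_id * case_id

if __name__ == "__main__":
    # futures.map distributes the calls across the SCOOP workers.
    results = list(futures.map(run_case, range(100)))
    print(results)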