Job management made easier
Last week we saw how to launch jobs using the submit command. After doing that you usually need to check running jobs, and occasionally modify or cancel them. In this post we will look at a few other commands that allow you to do just that.
The first one, and the simplest, is the qmine command. It prints a list of the jobs running under your username. It is similar to the squeue command, but by default it shows only your own jobs, which is what you normally want. You can also see another user's jobs by specifying a username as the first argument. The output consists of ten columns:
- Job id and job name;
- Account, group, and QOS;
- Time, CPUs, and memory requested (note: these are the limits you asked for at submission, not the resources your job is currently using);
- State, written out in full (e.g., RUNNING, COMPLETING);
- Comment (this is a string that can be set with the -p option in submit).
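For readers who want to see how these ten columns map onto plain squeue, here is a minimal sketch. The function name qmine_sketch is made up for this example, and the real qmine may well be implemented differently; only the squeue options and format codes are standard.

```shell
# Hypothetical stand-in for qmine (assumption: the real script may differ).
# Columns: job id, name, account, group, QOS, time limit, CPUs,
# minimum memory, state written out in full, comment.
qmine_sketch() {
    # First argument is an optional username; default to your own.
    squeue -u "${1:-$USER}" -o "%i %j %a %g %q %l %C %m %T %k"
}
```

Calling qmine_sketch with no arguments lists your own jobs, while qmine_sketch followed by a username lists that user's jobs.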
The module offers two utilities to quickly cancel multiple jobs. The first one is qdelmine, which simply cancels all jobs belonging to you. Use with caution!
The second command is qdelname, which selects jobs to be canceled based on their name. For example, to cancel all your running bowtie jobs, you can do:
$ qdelname bowtie
This command will first print a list of jobs containing the word you specified (a simple case-insensitive grep is used, so you can provide a partial name), and then it will ask you for confirmation that you really want to cancel those jobs. If you reply “y”, the jobs will be canceled. This command can also be invoked as jobdel.
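The list, confirm, cancel workflow described above can be sketched with standard SLURM tools. This is only an illustration under assumptions: qdelname_sketch is a made-up name, and the actual script may filter and prompt differently.

```shell
# Hypothetical sketch of the qdelname workflow (not the actual script).
qdelname_sketch() {
    local pattern="$1" matches answer
    # -h drops the header; print "jobid jobname" for your own jobs and keep
    # the lines matching the pattern, ignoring case. Note that grep sees the
    # whole line, so a numeric pattern could also match a job id.
    matches=$(squeue -h -u "$USER" -o "%i %j" | grep -i -- "$pattern" || true)
    if [ -z "$matches" ]; then
        echo "No matching jobs."
        return 0
    fi
    echo "$matches"
    printf 'Cancel these jobs? [y/N] '
    read -r answer
    if [ "$answer" = "y" ]; then
        # The first field of each line is the job id; word splitting is
        # intentional so that scancel receives one argument per id.
        scancel $(echo "$matches" | awk '{print $1}')
    fi
}
```

Anything other than a plain "y" leaves the jobs untouched, which is the safer default for a command that cancels work.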
The jobupd command allows you to change some parameters of jobs you have already submitted. A typical use is changing a job’s QOS while it is still queued. As with qdelname, you need to specify a (partial) name for the jobs to be modified, and the program will print all matching jobs before proceeding. For example, if you have a set of bowtie jobs in QOS “test”, and you want to switch them to “test-b”, you can do:
$ jobupd bowtie qos=test-b
You can also supply multiple directives to change many parameters at once. This command uses scontrol internally, so every directive you can pass to scontrol is accepted here.
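To make the idea of passing several directives concrete, here is a rough sketch of the kind of scontrol call that could sit underneath. The helper update_jobs_sketch and its argument layout are assumptions made for illustration, not the real jobupd code; QOS and TimeLimit are standard scontrol update keywords.

```shell
# Hypothetical helper, not the real jobupd: apply every directive to each job.
update_jobs_sketch() {
    local ids="$1" id   # $1 is a whitespace-separated list of job ids
    shift               # the remaining arguments are scontrol directives
    for id in $ids; do
        scontrol update JobId="$id" "$@"
    done
}
# Example: update_jobs_sketch "1234567 1234568" QOS=test-b TimeLimit=02:00:00
```

Each job gets a single scontrol update call carrying all the directives, so the changes for one job are submitted together.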
But wait, there’s more!
Jobupd and jobdel internally use the same command to generate the list of jobs matching the name you specify. The name of this command is jobids, and you can also invoke it directly:
$ jobids bowtie
1234567 1234568 1234569 ...
This allows you to write your own commands that operate on a set of jobs. You will probably not need this very often, since everything is pretty much already covered by jobupd, but at least the option is there!
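As an example of such a home-made command, the sketch below puts all matching jobs on hold. It assumes that jobids prints nothing but whitespace-separated job ids; jobhold_sketch is a made-up name, while scontrol hold is a standard SLURM command.

```shell
# Hypothetical custom command built on jobids: hold every matching job.
# Assumption: jobids prints only job ids, separated by whitespace.
jobhold_sketch() {
    local id
    for id in $(jobids "$1"); do
        scontrol hold "$id"
    done
}
```

Running jobhold_sketch bowtie would then hold every job whose name contains "bowtie", and the same loop works for any other per-job command.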