The Torque scheduler provides control over job submission to a cluster. Sometimes a job that you are submitting requires the output of multiple other, already running jobs as its input. You can either wait and manually start the next job once the others have finished, or you can let Torque handle the dependencies for you, as described here.
Say we have three simple scripts, job1.pbs, job2.pbs and job3.pbs (all executable):
The first outputs text to output.txt:

    #!/bin/bash
    #PBS -l nodes=1:ppn=1,walltime=600
    cd /your/directory/here/
    echo job1 output > output.txt
The second appends text to output.txt:

    #!/bin/bash
    #PBS -l nodes=1:ppn=1,walltime=600
    cd /your/directory/here/
    echo job2 output >> output.txt
The third counts the lines in output.txt, then appends that count to the file, followed by a job completion statement:

    #!/bin/bash
    #PBS -l nodes=1:ppn=1,walltime=600
    cd /your/directory/here/
    wc -l output.txt >> output.txt
    echo job3 output: the number above=2 if this job executed after jobs 1 and 2 >> output.txt
We want the output of job1 and job2 to be in output.txt before job3 runs, so that job3 counts two lines in the output file. If we simply submitted all three together, the jobs could run at the same time or in any order, so the wc -l command would not be guaranteed to return 2.
To control the order in which the jobs run, we can write an additional master script (master.pbs) that submits them with dependencies:

    #!/bin/bash

    # Submit job1 with no dependencies; qsub prints the job ID, which we capture
    job1=$(qsub job1.pbs)
    echo "$job1"

    # Submit job2 now, but hold it until job1 has finished successfully (afterok)
    job2=$(qsub -W depend=afterok:"$job1" job2.pbs)
    echo "$job2"

    # Submit job3 now, but hold it until jobs 1 and 2 have both finished successfully
    job3=$(qsub -W depend=afterok:"$job1":"$job2" job3.pbs)
    echo "$job3"
Running the master script with ./master.pbs gives us, once all three jobs have finished, the output that we are looking for:

    $ cat output.txt
    job1 output
    job2 output
    2 output.txt
    job3 output: the number above=2 if this job executed after jobs 1 and 2
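
When a job depends on more than two others, writing out the colon-separated list of job IDs by hand gets tedious. The following is a minimal sketch of a helper that builds the dependency string from any number of job IDs. The function name afterok and the job IDs 101.head and 102.head are made up for illustration; they are not part of Torque.

```shell
#!/bin/bash

# Hypothetical helper: join the given job IDs into a string of the form
# depend=afterok:id1:id2:..., suitable as the argument to qsub -W.
afterok() {
    if [ "$#" -eq 0 ]; then
        echo "afterok: need at least one job ID" >&2
        return 1
    fi
    # Inside double quotes, "$*" joins all arguments with the first character of IFS
    local IFS=:
    echo "depend=afterok:$*"
}

# Example with made-up job IDs
afterok 101.head 102.head   # prints depend=afterok:101.head:102.head

# In a master script this could be used as (assuming job1 and job2 hold real IDs):
#   job3=$(qsub -W "$(afterok "$job1" "$job2")" job3.pbs)
```

Keeping the string construction in one place means the qsub lines stay the same shape however many prerequisite jobs there are.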
This approach is also useful when different parts of your workflow need different amounts of resources, since each job script can request its own nodes, cores and walltime.
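
As a rough illustration of per-stage resources, the sketch below only prints the qsub command for each stage rather than submitting anything, so it can be inspected before use on a real cluster. The helper stage_cmd, the script names prepare.pbs and analyse.pbs, the job ID 101.head and the resource values are all assumptions made for this example. It relies on qsub accepting -l on the command line, where it should take precedence over the #PBS -l directive inside the script.

```shell
#!/bin/bash

# Hypothetical helper: print the qsub command for one stage of a workflow.
#   $1 = resource request for -l
#   $2 = job script
#   $3 = optional job ID that this stage must wait for (afterok)
stage_cmd() {
    local resources=$1 script=$2 after=${3:-}
    if [ -n "$after" ]; then
        echo "qsub -l $resources -W depend=afterok:$after $script"
    else
        echo "qsub -l $resources $script"
    fi
}

# A light preparation stage, then a heavier analysis stage that waits for it.
# 101.head stands in for the job ID that the first qsub would return.
stage_cmd nodes=1:ppn=1,walltime=600 prepare.pbs
stage_cmd nodes=1:ppn=8,walltime=3600 analyse.pbs 101.head
```

Splitting the workflow this way means the larger request is only held for the stage that needs it, rather than for the whole run.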