Launching batch jobs on the VSC

Now that my “developmental phase” on the VSC is slowly turning into a “production phase”, I’m starting to launch bigger and bigger projects and datasets. When running exactly the same scripts for multiple samples, it is not feasible to edit each script for each sample. Therefore I need a way to launch these batches of samples. The VSC supports multiple options for this, but some other, “unsupported” options are possible as well:

#example for 1 node:
#get the number of cores on this node, so that one job runs per core
CORES=$(nproc)
#commands.txt is a file listing the paths of all sh scripts that should be executed, one per line
#{} is replaced by the path of each sh file; the sh file is executed, and a log and an error file are created next to it
cat commands.txt | parallel -j "$CORES" 'sh {} > {}.log 2> {}.err'
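
The snippet above assumes that commands.txt already exists. As a rough sketch of one way to build it, the following creates one small script per sample and records each path. The names samples.txt, jobs/ and the echo "pipeline" are made-up placeholders, not part of my actual setup:

```shell
#!/bin/sh
# Sketch only: samples.txt, jobs/ and the echo "pipeline" are made-up placeholders.
set -eu

# work in a scratch directory so nothing existing is overwritten
cd "$(mktemp -d)"

# one sample name per line
printf '%s\n' sampleA sampleB sampleC > samples.txt

mkdir -p jobs
: > commands.txt
while IFS= read -r sample; do
    script="jobs/${sample}.sh"
    # write a tiny per-sample script; a real one would call the actual pipeline
    printf '#!/bin/sh\necho "processing %s"\n' "$sample" > "$script"
    # record the script path, one per line, as the parallel call reads it
    printf '%s\n' "$script" >> commands.txt
done < samples.txt

cat commands.txt
```

The resulting commands.txt can then be fed to the parallel call as is. Adding a sample only means adding a line to the sample list and regenerating the file.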
On multiple nodes:
#example for multiple nodes, 1 task per node (can be changed to multiple tasks by changing the -j option like above)
#get the list of nodes reserved by this job (each node should be listed only once):
sort -u "$PBS_NODEFILE" > nodefile

#set the parallel environment, so that all modules loaded and all environment variables of the pbs script are also available on the other nodes
export PARALLEL="--workdir . --env PATH --env LD_LIBRARY_PATH --env LOADEDMODULES --env _LMFILES_ --env MODULE_VERSION --env MODULEPATH --env MODULEVERSION_STACK --env MODULESHOME --env OMP_DYNAMIC --env OMP_MAX_ACTIVE_LEVELS --env OMP_NESTED --env OMP_NUM_THREADS --env OMP_SCHEDULE --env OMP_STACKSIZE --env OMP_THREAD_LIMIT --env OMP_WAIT_POLICY"
#end setting parallel environment

cat commands.txt | parallel --sshloginfile nodefile -j 1 'sh {} > {}.log 2> {}.err'
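
The node list has to be deduplicated because, on Torque-style PBS systems, the node file typically repeats a hostname once per reserved core. A minimal sketch of that step, using a simulated node file with made-up hostnames (on the cluster, the scheduler sets $PBS_NODEFILE itself):

```shell
#!/bin/sh
# Sketch only: the hostnames and the simulated node file are made up.
set -eu
cd "$(mktemp -d)"

# simulate the file a PBS job would provide: 2 nodes, 3 cores each
PBS_NODEFILE="$PWD/pbs_nodefile"
printf '%s\n' node02 node01 node02 node01 node01 node02 > "$PBS_NODEFILE"

# keep each hostname once; same result as: cat ... | sort | uniq
sort -u "$PBS_NODEFILE" > nodefile

cat nodefile
echo "nodes: $(wc -l < nodefile | tr -d ' ')"
```

With one line per node in nodefile, the -j option then sets the number of simultaneous tasks on each node rather than across the whole job.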

In conclusion, I think worker is the preferred system for most standard, small-scale users with little HPC and programming experience. However, when you want to run large jobs with large amounts of data and a high number of repeated jobs, worker seems to reach some limitations. Therefore I will keep using worker when teaching new users (see also my VSC NGS workshop). For my own tools and scripts I am switching to the parallel implementation (currently only available in the DEV branch of my pika tool).