Using ‘Parallel’ in Unix

Installation

GNU Parallel lets you run ordinary Unix commands as several processes at the same time. If it is not already installed on your Linux system, you can build it from source with:

cd $HOME/bin/ #cd to where you want it
wget http://mirror0.babylon.network/gnu/parallel/parallel-20160222.tar.bz2 #download
tar xvjf parallel-20160222.tar.bz2 #extract to same location
cd parallel-20160222/ #cd into parallel directory
./configure --prefix=$HOME && make && make install #install under $HOME (no root access needed)
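
To confirm the install worked, a quick check along these lines may help. It is only a sketch, and it assumes that the --prefix=$HOME install put the binary in $HOME/bin and that this directory is not yet on your PATH. The variable name parallel_path is just for illustration.

```shell
# Assumption: the --prefix=$HOME install placed the binary in $HOME/bin.
export PATH="$HOME/bin:$PATH"

if command -v parallel >/dev/null 2>&1; then
    parallel_path=$(command -v parallel)   # illustrative variable name
    echo "found: $parallel_path"
    # The first line of the version output should name GNU parallel.
    parallel --version 2>/dev/null | head -n 1
else
    parallel_path=""
    echo "parallel not found on PATH" >&2
fi
```

Some distributions also ship a different tool called parallel (from the moreutils package), so it is worth checking that the version line really does say GNU parallel.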

Usage

Once installed, you can use it with ordinary Unix commands. Note that disk read/write speed will usually become the bottleneck before the extra processes make a difference, unless you are using an SSD. For example, we would normally count the number of lines in a massive file with

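zcat reads.fastq.gz | wc -l

The file is gzip-compressed, so zcat decompresses it first and wc counts the actual lines rather than newline bytes in the compressed data. We can run the counting in parallel with the following, where awk adds up the per-chunk counts:

zcat reads.fastq.gz | parallel --jobs 8 --pipe wc -l | awk '{s+=$1} END {print s}'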

Note that --jobs 8 tells Parallel to run up to 8 jobs at once. You can remove this option and it will default to one job per CPU core on your system (although on a shared supercomputer you should set an explicit limit).
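
One way to convince yourself that the chunked count is right is to compare it with the serial count on a small generated file. The sketch below is an illustration under assumptions: the temporary file, the 500000 dummy lines and the limit of 2 jobs are all made up for the example, and the parallel step is skipped if GNU Parallel is not on the PATH.

```shell
# Sketch: compare a serial line count with a chunked, parallel one.
tmp=$(mktemp)                      # temporary file standing in for a real .gz input
seq 1 500000 | gzip > "$tmp"       # 500000 dummy lines, a few MB once decompressed

# Serial count: decompress, then count lines in a single process.
serial=$(gzip -dc "$tmp" | wc -l)

# Parallel count: --pipe splits stdin into blocks, each job counts one block,
# and awk sums the per-block counts. Only attempted if GNU Parallel is available.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    par=$(gzip -dc "$tmp" | parallel --jobs 2 --pipe wc -l | awk '{s+=$1} END {print s}')
else
    par="unavailable"
fi

echo "serial=$serial parallel=$par"
rm -f "$tmp"
```

The two numbers should match whenever the parallel step runs. On a shared machine, --jobs also accepts a percentage of the available cores (for example --jobs 50%) instead of a fixed number.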