Checking required software¶
An often occuring theme in bioinformatics is installing software. Here we wil
go over some steps to help you check whether you actually have the right
software installed. There’s an optional excerise on how to install sickle
.
Programs used in this workshop¶
The following programs are used in this workshop:
All programs are already installed, all you have to do is load the virtual environment for this workshop. Once you are logged in to the server run:
source /proj/g2014113/metagenomics/virt-env/mg-workshop/bin/activate
You deactivate the virtual environment with:
deactivate
NOTE: This is a python virtual environment. The binary folder of the virtual environment has symbolic links to all programs used in this workshop so you should be able to run those without problems.
Using which to locate a program¶
An easy way to determine whether you have have a certain program installed is by typing:
which programname
where programname
is the name of the program you want to use. The program
which
searches all directories in $PATH
for the executable file
programname
and returns the path of the first found hit. This is exactly
what happens when you would just type programname
on the command line, but
then programname
is also executed. To see what your $PATH
looks like,
simply echo
it:
echo $PATH
For more information on the $PATH
variable see this link:
http://www.linfo.org/path_env_var.html.
Check all programs in one go with which¶
To check whether you have all programs installed in one go, you can use which
.
bowtie2 bowtie2-build velveth velvetg shuffleSequences_fastq.pl parallel samtools Ray
We will now iterate over all the programs in calling which
on each of them.
First make a variable containing all programs separated by whitespace:
$ req_progs="bowtie2 bowtie2-build velveth velvetg parallel samtools shuffleSequences_fastq.pl Ray"
$ echo $req_progs
bowtie2 bowtie2-build velveth velvetg parallel samtools shuffleSequences_fastq.pl
Now iterate over the variable req_progs
and call which:
$ for p in $req_progs; do which $p || echo $p not in PATH; done
/proj/g2014113/metagenomics/virt-env/mg-workshop/bin/bowtie2
/proj/g2014113/metagenomics/virt-env/mg-workshop/bin/bowtie2-build
/proj/g2014113/metagenomics/virt-env/mg-workshop/bin/velveth
/proj/g2014113/metagenomics/virt-env/mg-workshop/bin/velvetg
/proj/g2014113/metagenomics/virt-env/mg-workshop/bin/parallel
/proj/g2014113/metagenomics/virt-env/mg-workshop/bin/samtools
/proj/g2014113/metagenomics/virt-env/mg-workshop/bin/shuffleSequences_fastq.pl
In Unix-like systems a program that sucessfully completes it tasks should
return a zero exit status. For the program which
that is the case if the
program is found. The ||
character does not mean pipe the output onward as
you are probably familiar with (otherwise see
http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-4.html), but checks whether the
program before it exists succesfully and executes the part behind it if not.
If any of the installed programs is missing, try to install them yourself or ask. If you are having troubles following these examples, try to find some bash tutorials online next time you have some time to kill. Educating yourself on how to use the command line effectively increases your productivity immensely.
Some bash resources:
- Excellent bash tutorial http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html
- Blog post on pipes for NGS http://www.vincebuffalo.com/2013/08/08/the-mighty-named-pipe.html
- Using bash and GNU parallel for NGS http://bit.ly/gwbash
(Optional excercise) Install sickle by yourself¶
Follow these steps only if you want to install sickle
by yourself.
Installation procedures of research software often follow the same pattern.
Download the code, compile it and copy the binary to a location in your
$PATH
. The code for sickle is on https://github.com/najoshi/sickle. I
prefer compiling my programs in ~/src
and then copying the resulting
program to my ~/bin
directory, which is in my $PATH
. This should get
you a long way:
mkdir -p ~/src
# Go to the source directory and clone the sickle repository
cd ~/src
git clone https://github.com/najoshi/sickle
cd sickle
# Compile the program
make
# Create a bin directory
mkdir -p ~/bin
cp sickle ~/bin