
Farm Software Installation

Dianne Velasco edited this page May 5, 2017 · 1 revision

Building C and C++ Software

When using farm, you're probably going to need to install some software. Since farm is a Linux computer, it follows much the same procedure as other Linux computers. The main difference is that you can't install things into system directories (you don't have permission to do this). You can create your own ~/bin/ directory for your own programs. However, if you suspect your program is going to be widely used by other lab members, send a request to CSE Help to have them install it globally.

Below is an example of how to build software to be installed in your ~/bin.

  1. Download software:

     $ wget http://www.best-software-ever.com/thesis-in-a-box.tar.gz
    
  2. Untar it:

     $ tar -xvzf thesis-in-a-box.tar.gz
    
  3. Move to your new directory:

     $ cd thesis-in-a-box
    
  4. Configure the program. configure appends bin/ to the prefix, so give it your home directory to end up with binaries in ~/bin:

     $ ./configure --prefix=/home/<username>
    
  5. Compile the program:

     $ make
    
  6. Install it under the prefix you specified:

     $ make install
    

Note: this only works for software that is packaged to be installed this way. Always read the README or INSTALL files to make sure you're doing it right. Lastly, make install may also create include/ and lib/ directories under the same prefix; if your program needs to install files there, it's best to install these globally by asking CSE Help.
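
As a rough illustration of what --prefix controls: under the usual autoconf convention, make install appends bin/, lib/, and include/ to the prefix rather than copying files into the prefix itself. The helper below, prefix_layout, is a hypothetical name used only for this sketch, and /home/alice is a made-up path. It prints the directories a typical package would populate; a real package may use more or fewer.

```shell
# Sketch: print the directories a typical autoconf-style package installs into.
# prefix_layout is a hypothetical helper, not part of any build tool.
prefix_layout() {
    prefix="${1%/}"                 # drop one trailing slash, if any
    printf '%s\n' "$prefix/bin" "$prefix/lib" "$prefix/include"
}

# Example: with a prefix of /home/alice, programs land in /home/alice/bin.
prefix_layout /home/alice
```

If the first line printed is not the directory you expect your program to end up in, adjust the prefix before running configure.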

Then, to run the program just type in the path:

    $ /home/<username>/bin/thesis_program

Now, that's a lot to type every time you want to run a program. A better way is to add the directory to your $PATH.

Add to $PATH

When you type a command into bash, bash looks for an executable file matching that command in the directories listed in $PATH. The places it looks by default on farm are:

/share/apps/ge-6.2/bin/lx24-amd64
/usr/kerberos/bin
/usr/local/bin
/bin
/usr/bin

We can shorten the command we have to type by adding the directory that contains the binary file to the path. To do this, we set it in .bash_profile.

  1. Make sure you're in the home directory

     $ cd
    
  2. Create .bash_profile if it doesn't already exist (touch leaves the contents of an existing file unchanged)

     $ touch .bash_profile
    
  3. Open it up in your favorite text editor

     $ nano .bash_profile
    
  4. Add to your path by typing this in:

     PATH=$PATH:/home/user/bin/
     export PATH
    
  5. Save the file. In nano this is Ctrl+O (WriteOut), followed by Ctrl+X to exit.

  6. Reload bash by logging out or with this command:

     $ source ~/.bash_profile
    

This adds /home/user/bin/ to your $PATH, so every executable in that directory, including the one we want, thesis_program, can be run by name. To see what is in your path, type this into bash:

$ echo $PATH
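
The colon-separated output can be hard to scan. A small sketch, assuming a POSIX shell with tr available, prints one directory per line and then asks the shell what it would run for a given command name. path_dirs is a hypothetical helper name.

```shell
# Sketch: list the directories in a PATH-style string, one per line.
# With no argument it uses the current $PATH.
path_dirs() {
    printf '%s\n' "${1:-$PATH}" | tr ':' '\n'
}

path_dirs                 # directories are searched in this order
command -v ls             # shows what the shell will run when you type "ls"
```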

To add more folders to your path, append each one to the line, separated from the previous entry by a colon:

PATH=$PATH:/home/user/programs/:/home/user/who-needs-organization/bin/
export PATH
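
Because .bash_profile can be sourced more than once in a session, a plain PATH=$PATH:... line appends the same directory again each time. One possible guard, sketched here with a hypothetical helper named path_append, skips directories that are already listed.

```shell
# Sketch: append a directory to PATH only if it is not already there.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already listed, leave PATH alone
        *) PATH="$PATH:$1" ;;     # otherwise add it at the end
    esac
}

path_append "$HOME/bin"
path_append "$HOME/bin"           # second call changes nothing
export PATH
```

The comparison is a plain string match, so /home/user/bin and /home/user/bin/ count as different entries; pick one spelling and stick with it.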

Case study: msstats

Msstats is a program that uses libsequence. Its installation is a useful example because the libsequence headers and libraries have to be loaded and pointed to when the program is configured.

  1. Download msstats source code:

     $ wget http://molpopgen.org/software/msstats/msstats-0.3.1.tar.gz
    
  2. Untar:

     $ tar -xvzf msstats-0.3.1.tar.gz
    
  3. cd to new directory:

     $ cd msstats-0.3.1
    
  4. Read the README:

     $ less README
    
  5. Load the required modules:

     $ module load gcc
     $ module load libsequence
    
  6. Provide configure with information about libsequence and run configure (notice the prefix option; that is all one line):

     $ CPPFLAGS=-I$LIBSEQUENCE/include LDFLAGS=-L$LIBSEQUENCE/lib ./configure --prefix=/home/[user]/programs/
    
  7. Make:

     $ make
    
  8. Make install:

     $ make install 
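
Step 6 relies on $LIBSEQUENCE being set, presumably by module load libsequence. If the variable is empty, the flags silently become -I/include and -L/lib. The sketch below, using a hypothetical helper named lib_flags and a made-up example path, builds the same two settings but stops when the library root is missing.

```shell
# Sketch: build CPPFLAGS/LDFLAGS for a library root, refusing an empty root.
lib_flags() {
    if [ -z "$1" ]; then
        echo "library root is empty; was the module loaded?" >&2
        return 1
    fi
    printf 'CPPFLAGS=-I%s/include LDFLAGS=-L%s/lib\n' "${1%/}" "${1%/}"
}

# Example with a made-up path; on farm you would pass "$LIBSEQUENCE".
lib_flags /opt/libsequence
```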
    

Java programs

Using Java programs is easy because Java is inherently cross-platform. Java programs run on the Java Virtual Machine (JVM), a layer between the computer and the program. The JVM itself is platform-specific, but any Java archive (.jar) can run on any platform's JVM. This way, software developers only have to distribute one file, which can be used on (almost) any platform. This also means that you don't have to compile the Java program (the software provider already did it for you); you just need to run it. A short example involving BEAGLE follows:

  1. cd to the parent directory:

     $ cd programs
    
  2. Download BEAGLE with wget:

     $ wget http://faculty.washington.edu/browning/beagle/beagle.jar
    
  3. Run BEAGLE according to the instructions provided here: http://faculty.washington.edu/browning/beagle/beagle_3.3.2_31Oct11.pdf For example:

     $ java -Xmx800m -jar beagle.jar data=data.bgl trait=T2D out=example
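
In the command above, -Xmx800m caps the Java heap at 800 MB, and everything after beagle.jar is passed to BEAGLE itself. If you run the same jar often, a small shell function can hide the boilerplate. run_jar is a hypothetical name, and the sketch assumes java is already on your PATH.

```shell
# Sketch: run a jar with a chosen maximum heap size.
# usage: run_jar <heap> <jar> [program arguments...]
run_jar() {
    if [ "$#" -lt 2 ]; then
        echo "usage: run_jar <heap> <jar> [args...]" >&2
        return 2
    fi
    if [ ! -f "$2" ]; then
        echo "jar not found: $2" >&2
        return 1
    fi
    heap="$1"; jar="$2"
    shift 2                        # what remains are the program's own arguments
    java "-Xmx$heap" -jar "$jar" "$@"
}

# Equivalent to the BEAGLE command above:
#   run_jar 800m beagle.jar data=data.bgl trait=T2D out=example
```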
    

As you can see, Java programs don't really need to be "installed"; they just need to exist. If you do need to compile the program, follow the instructions provided. If the source code contains a file called build.xml, it probably uses ant (which is similar to make). In that case, you can often create a .jar file by running the relevant ant targets. The target names are defined in build.xml and vary between projects, but a common pattern is:

    $ ant compile jar

More information about ant is available here: http://ant.apache.org/manual/tutorial-HelloWorldWithAnt.html
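
When it is not obvious how a source tree is meant to be built, the files at its top level are a reasonable first hint. The sketch below uses a hypothetical helper, guess_build_tool, to check for the files mentioned on this page. It is only a heuristic, and the README still takes precedence.

```shell
# Sketch: guess how a source directory is built from the files it contains.
guess_build_tool() {
    dir="${1:-.}"
    if [ -f "$dir/configure" ]; then
        echo "configure"           # ./configure, make, make install
    elif [ -f "$dir/build.xml" ]; then
        echo "ant"                 # Java project built with ant
    elif [ -f "$dir/Makefile" ] || [ -f "$dir/makefile" ]; then
        echo "make"                # plain make, no configure step
    else
        echo "unknown"             # read the README or INSTALL file
    fi
}

guess_build_tool .
```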

Programs from Github

Programs, pipelines, and scripts from GitHub work much the same way as software from other sources. One key difference is in acquiring them. There are two main ways to get software from GitHub: cloning and downloading a zip. The advantage of cloning is that if the software updates, you just need to pull the changes and reinstall to be up to date. With the zip file, you need to remove the folder and redownload it when there is an update. However, some software isn't updated frequently, so you may not need to worry about updates.

Cloning

  1. Go to a scratch directory for compiling/building tools. I use ~/toolbuilds.

     $ cd ~/toolbuilds
    
  2. Visit the GitHub repository. On the far right there is a text box labeled HTTPS clone URL containing a link that looks like this: https://github.com/[user]/[repository].git. Copy that link to the clipboard.

  3. Back in farm, clone the repository:

     $ git clone https://github.com/[user]/[repository].git
    
  4. Change to the directory you just created (the name of the repository) and follow the installation instructions that are (hopefully) provided. Then, copy the binary files to your ~/bin/. Ideally, use cp -i so binaries with the same name won't be silently overwritten.
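
The update path for a cloned tool can be sketched as a single function that clones the first time and pulls afterwards. This assumes git is installed and that you run it from your build directory. The names repo_dir_from_url and clone_or_update are hypothetical, and you still need to rebuild and reinstall after pulling.

```shell
# Sketch: derive the directory name git clone creates from a clone URL.
repo_dir_from_url() {
    basename "${1%/}" .git
}

# Sketch: clone a repository, or pull if it was already cloned here.
clone_or_update() {
    url="$1"
    dir=$(repo_dir_from_url "$url")
    if [ -d "$dir/.git" ]; then
        (cd "$dir" && git pull --ff-only)   # existing clone: fetch updates
    else
        git clone "$url"                    # first time: make a fresh clone
    fi
}

# usage (placeholders as in the steps above):
#   clone_or_update "https://github.com/[user]/[repository].git"
```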

Github Repositories from a Zip

  1. Go to the parent directory of where you want the software installed

     $ cd ~/toolbuilds
    
  2. Visit the Github repository and on the very right there is a button that says Download zip. Right click this link and click copy link.

  3. Back in farm, download the zip with wget:

     $ wget https://github.com/[user]/[repository]/archive/[branch].zip
    
  4. Unzip the archive with unzip (wget saves it as [branch].zip):

     $ unzip [branch].zip
    
  5. Change to the directory you just created (typically named [repository]-[branch]) and follow the installation instructions that are (hopefully) provided.
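
The names involved in a zip download are easy to mix up, so here is a small sketch that prints them for a given user, repository, and branch. archive_names is a hypothetical helper and the example arguments are made up. It assumes a simple branch name without slashes and the archive URL pattern shown above; GitHub archives typically unpack into a directory named [repository]-[branch].

```shell
# Sketch: print the URL, the downloaded file name, and the unpacked directory.
# usage: archive_names <user> <repository> <branch>
archive_names() {
    printf 'https://github.com/%s/%s/archive/%s.zip\n' "$1" "$2" "$3"
    printf '%s.zip\n' "$3"         # file name wget uses by default
    printf '%s-%s\n' "$2" "$3"     # directory unzip typically creates
}

archive_names someuser sometool master
```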

Modules

Farm already has many programs installed, but these may not be loaded. Use module to load these pre-installed programs.

$ module load gcc
Module GCC 4.5.0 Loaded.

To see what modules are available, type:

$ module avail

Any of these modules can be loaded, and if you plan to use a preinstalled program you should load its module. The farm-specific catch is that if you run python, for example, without loading the module, you'll get version 2.4.3, whereas with the python module loaded you'll get 2.7.3. This becomes an issue when you run python from a script but forget to load the module, because the script will silently use the old version. So remember: if you're using a preinstalled program, make sure you load the module.
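
One way to catch a forgotten module load is to have the script check the version it actually got before doing any work. The sketch below compares dotted version numbers with awk. version_at_least is a hypothetical helper, and the commented usage assumes the python module behaves as described above.

```shell
# Sketch: succeed if dotted version HAVE is at least WANT.
# usage: version_at_least <have> <want>
version_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        n = split(have, h, "."); m = split(want, w, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            a = h[i] + 0; b = w[i] + 0    # missing parts count as 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0                            # all parts equal
    }'
}

# Possible use at the top of a script:
#   module load python
#   have=$(python -c 'import sys; print(".".join(map(str, sys.version_info[:3])))')
#   version_at_least "$have" 2.7 || { echo "python too old: $have" >&2; exit 1; }
version_at_least 2.7.3 2.7 && echo "2.7.3 is new enough"
```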
