Saturday, June 15, 2013

Installation from Tarball (Fermi Science Tools and HEAsoft)


(If you are not new to Unix-like operating systems, you probably already understand the installation instructions provided with both pieces of software.)

Once you have a functioning operating system, e.g., Mac OS X that comes along with the computer you bought, or a Linux distribution you installed on your own (Why not Windows? The answer would deserve a separate article), you may want to install your favorite analysis software right away so you can start working. But how? Usually, these packages cannot be found in the repositories and cannot be installed by just clicking Next, Next, and Next -- rather, they come in the form of a tarball (.tar.gz, .tgz).

We'll use the official distribution of HEAsoft (6.13) as the main example*. The software comes in two kinds of distribution: a source distribution and a binary distribution. The source distribution contains the software as source code -- its primitive form. To install the software, we need to
  1. identify the compilers on your computer, which translate the code into something the computer can understand (a binary), as well as the libraries on your computer that the software needs during execution (configuring);
  2. translate the code (building/compiling); and
  3. move the compiled files to appropriate places so your computer knows where to look when you try to use the software ("installing").
    These can be done by executing a few commands, which will be discussed in a second.
The binary distribution contains the software in its compiled form (possibly together with the source code if the software is an open-source one). As the name suggests, step (2) has already been done for you, so only steps (1) (to identify the libraries) and (3) are required.

*The installation of the Fermi Science Tools (FST) (v9r31p1), from the binary distribution only, is covered alongside in the relevant sections. This may seem to conflict with the title of the post, but the reasons are that:
  • building the FST from source involves extra branches that would not fit into this post;
  • installing the FST from binary is strongly recommended by the developers; and
  • the similarities in the procedures can still be seen.

In general, if a piece of software is organized in a way consistent with the GNU standards, the same procedure applies, so let's get down to brass tacks.


Unpack

(Instruction step 2)
Prior to step (1), we have to unpack the tarball we have downloaded in order to look inside. You may want to first move the tarball to the location where you want the software to be installed. A tarball is a compressed archive file. It resembles a zip file, but it is produced in two steps: archiving (tar) and the actual compression (gz/gzip). To unpack the tarball, tar provides a one-step solution (gunzip and tar in a single command):
tar -zxvf heasoft-<version>.tar.gz # for HEAsoft
tar -zxvf ScienceTools-<version>.tar.gz # for FST
The bit with the hyphen, -zxvf, is the option part of the command:
  • -z indicates that gzip is needed to decompress the file
  • -x tells tar to extract (actually!) the content of the tarball
  • -v stands for verbose and tells tar to print the names of the files being extracted; it is a very common option
  • -f tells tar that the next argument is the name of the archive file (which is why f comes last in -zxvf)
(You may be wondering why there is no hyphen in the command that appears in the FST installation instructions -- tar xvzf. For historical reasons, tar also accepts its options bundled together without a leading hyphen. In general, options are signified by hyphens.)
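If you would like to peek inside a tarball before unpacking it, or unpack it somewhere other than the current directory, the following sketch may help. It builds a tiny stand-in tarball (demo-1.0.tar.gz, a made-up name) in a scratch directory so that it can be run anywhere; with the real download you would only need the -t and -x lines.

```shell
# Build a tiny demo tarball in a scratch directory (a stand-in for the real download).
workdir=$(mktemp -d)
mkdir -p "$workdir/demo-1.0/BUILD_DIR"
echo "echo configuring" > "$workdir/demo-1.0/BUILD_DIR/configure"
tar -czf "$workdir/demo-1.0.tar.gz" -C "$workdir" demo-1.0

# List the contents without extracting anything (-t takes the place of -x).
listing=$(tar -tzf "$workdir/demo-1.0.tar.gz")
echo "$listing"

# Extract into a chosen directory with -C instead of moving the tarball there first.
mkdir -p "$workdir/install"
tar -xzf "$workdir/demo-1.0.tar.gz" -C "$workdir/install"
```

Dropping -v keeps the terminal quiet, which can be handy for tarballs with many thousands of files.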

Configure

(Instruction step 3)
We have come to the first step. There is a configure script for this. If you're installing from source, the script is directly under the directory BUILD_DIR, which you should now make your working directory:
cd heasoft-<version>/BUILD_DIR/ # for HEAsoft
cd ScienceTools-<version>/BUILD_DIR/ # for FST
If you're installing from binary, there is one extra directory to go through:
cd heasoft-<version>/<platform>/BUILD_DIR/ # for HEAsoft
cd ScienceTools-<version>/<platform>/BUILD_DIR/ # for FST
You can now execute the configure script and let it check for compilers, dependencies, etc. The command is simple:
./configure
(Sometimes it is useful to write the output to a file for future reference, as suggested in the installation instructions.)
If any error message pops up at this stage, it is most likely caused by missing compilers or libraries; a quick look at the message will tell.

(If you're installing HEAsoft, you might get a Perl mismatch warning.)
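Since missing compilers are a usual culprit, it can save time to check for them before running configure. Below is a minimal sketch: the function name check_tools is made up for this post, and the list of tools is only an assumption about what a typical build from source needs, so go by the installation instructions for the real list.

```shell
# Report whether each named tool can be found on the PATH.
# Returns the number of missing tools (0 means everything was found).
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# The tool list below is an assumption; adjust it to what the instructions ask for.
check_tools gcc g++ gfortran perl make || echo "install the missing tools before running ./configure"
```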


Make

(Instruction steps 4 & 5)
If you are installing from binary and pass the configure step, then the software is ready for use -- except that you will need a better organization of the compiled files, which is accomplished by the final step.

On the other hand, if you are installing from source, this is the crucial step and is the core of the procedure. If you haven't already noticed, the configure script has created (or modified) a file called Makefile in the working directory. This is the recipe for the compilation of the source code, which may take tens of minutes or even an hour (sit back and relax), depending on the size of the program and the speed of your computer. Despite the complexity and depth involved in this step, the command to start building is very simple (you should really consider writing the output to a file):
make
Translation isn't easy -- this is probably the step when most nasty errors come out, such as linker errors and library conflicts. The solution really depends on your machine, and we cannot cover that here. Nevertheless, you should not ignore any errors and skip to the next step before things get sorted; in general, the next step is irreversible -- there is no simple way to revert it if improperly compiled files get scattered across your file system.
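One way to follow both pieces of advice (keep the output, and do not move on after an error) is a small helper that saves everything a command prints to a log file and passes its exit status on. This is only a sketch: run_logged and the log file names are made up for this post and are not part of HEAsoft or the FST.

```shell
# Run a command, save its stdout and stderr to a log file, and report the outcome.
# Usage: run_logged <logfile> <command> [arguments...]
run_logged() {
    log=$1
    shift
    if "$@" > "$log" 2>&1; then
        echo "OK: $* (log in $log)"
    else
        status=$?
        echo "FAILED with status $status: $* (see $log)"
        return "$status"
    fi
}

# Harmless demonstration that runs anywhere:
demo_log="${TMPDIR:-/tmp}/run_logged_demo.log"
run_logged "$demo_log" echo "hello from the build"

# Typical use inside BUILD_DIR; && stops the chain at the first failure:
# run_logged build.log make && run_logged install.log make install
```

Because of the &&, make install is never reached if make fails, and the log is there to read (or to send to someone who can help).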


Make install

(Instruction step 6)
If the previous step succeeded without any error, congratulations! The software is ready for use -- but the compiled files have to be moved to the right place so that they can be easily accessed when needed in the future. This is done by the command
make install
In general, software obtainable via the repositories and popular, community-maintained packages (very loosely speaking) have the default installation location set to a system directory (e.g. /usr/local), for which you will need root privileges. Once the files are in place, they are accessible from virtually anywhere.


Initialization (Setting up the environment)

(Instruction step 8)
In the case of HEAsoft and FST, the default installation is within the directory created when the tarball was unpacked (although you can specify a custom directory during the configure step), which is not as accessible as system directories from the point of view of your computer (the shell), so an extra step is required to set up the environment. In simpler words, it tells your computer there are actually some other places to look when you call the HEAsoft or FST tools (if you can't find the key in your pocket, don't panic -- it's probably somewhere on the floor).

The corresponding commands are (assuming bash):
export HEADAS=/path/to/installed/heasoft-<version>/<platform> # for HEAsoft
. $HEADAS/headas-init.sh
or for the FST:
export FERMI_DIR=/path/to/installed/ScienceTools-<version>/<platform> # for FST
. $FERMI_DIR/fermi-init.sh
However, these are for one-time use only: after you exit the current shell (closing the Terminal window), the environment set-up is lost, and when you start a new one you have to enter the same commands again. There is a way to let the shell execute the commands automatically upon start: open the file ~/.bashrc with your favorite text editor (again, assuming bash) (or ~/.profile on Mac OS), and add the two lines of commands -- it is advisable to give the second line a tweak:
export HEADAS=/path/to/installed/heasoft-<version>/<platform> # for HEAsoft
alias heainit=". $HEADAS/headas-init.sh" # don't forget the quotes
export FERMI_DIR=/path/to/installed/ScienceTools-<version>/<platform> # for FST
alias fermi=". $FERMI_DIR/fermi-init.sh"
The name alias may have already given you some hint -- every time you start a new shell, the environment set-up will not be complete until you explicitly demand it via the "shortcut" commands:
heainit # for HEAsoft
fermi # for FST
or whatever commands you are comfortable with that are unambiguous.
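If you prefer, shell functions can play the same role as the aliases, with the bonus of a friendly message when the path is wrong. The sketch below is one possible variant, not something taken from the official instructions; init_package is a name made up for this post. Use it in place of the alias lines rather than alongside them, since an alias and a function of the same name will clash.

```shell
# Source a package's init script only if the directory is set and the script exists.
# Usage: init_package <installation directory> <init script name>
init_package() {
    dir=$1      # e.g. "$HEADAS"
    script=$2   # e.g. headas-init.sh
    if [ -z "$dir" ] || [ ! -f "$dir/$script" ]; then
        echo "cannot initialize: $dir/$script not found" >&2
        return 1
    fi
    . "$dir/$script"
}

# Same "shortcut" commands as before; the variables are read when the command is called.
heainit() { init_package "$HEADAS" headas-init.sh; }
fermi()   { init_package "$FERMI_DIR" fermi-init.sh; }
```

As with the aliases, nothing is set up until you type heainit or fermi.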

The reason for this is to avoid setting up all kinds of environment variables at each startup without your consent, when many of them are useless for the specific task at hand. When you have more software installed, letting every environment initialization run automatically will pour an excessive amount of information into the shell without checking for consistency. For example, two packages compiled with different versions of a compiler would conflict when the libraries are suitable for one but not the other.

Congratulations

The HEAsoft or the FST is ready on your computer -- but you've achieved more -- you won't be as afraid when you install the next piece of software.

Wednesday, June 12, 2013

Greetings


Hello, World! This blog is targeted at professional astronomers. It aims to discuss and share useful tips and tricks in the context of data analysis, ranging from data reduction with pipelining scripts to dealing with various errors that pop up during the compilation of a software package, and from basic plain-text file manipulation to making fancy plots. As an astronomer who frequently deals with data and statistics, you may often find yourself entangled with a mess of numbers, figures and text, trying to rearrange and reformat them into something sensible, just to get prepared for the next steps of your calculation. Some posts will also be tailored to open some doors for beginners using Unix-like systems.

Although the author tries to present methods accurately and in detail when necessary, most discussion and comments are based on the author's personal experiences. This does not serve as a formal introduction to a particular subject, but is intended to be a place where small tutorials are given focusing on the caveats and special points to note, in the hope of rescuing souls from a pool of frustrating errors and obstacles, in a friendly and relaxing manner (hopefully, limited by the author's English skill). This is also a place to gather these souls, who will eventually find comfort in realizing that the suffering is not only personal.

Brace yourself.