Automating R Package Builds Summer 2009

Guide Overview

The program rbuild.pl propagates R package updates to a web site for public consumption. This guide describes rbuild in detail, including how it works and how to use the following features:

  • Paths
  • Configuration
  • Notification
  • Temporary files
  • Lockfiles and errors
  • Updates and forced rebuilds
  • Bugfix builds
  • Version numbers
  • Options
  • Text replacement

For questions, problems, or feature requests, contact us.

Script Process

The program rbuild.pl runs regularly as a cron job, checking out the package from its CVS repository to compare the version number with the last update. If the version number has not changed, nothing more happens. Otherwise, rbuild does the following:

  1. Create directories in the destination path if necessary.
  2. Run R CMD build to generate the Linux-version tarball and copy it to the web site.
  3. Write new package description files.
  4. Generate new documentation (PDF and HTML) from TeX source.
  5. Install library on local machine.
  6. Notify concerned parties through email.

Paths

Paths relevant to use of rbuild include:

  • Executable - /usr/local/HMDC/bin/rbuild.pl
  • Global config file - /nfs/projects/r/rbuild/rbuild.conf
  • Log files with standard and error output - /nfs/projects/r/rbuild/log/*
    Note: Log files are created only for projects with autobuild enabled.
  • Lock files - /nfs/projects/r/rbuild/lock/*

Configuration

The script first reads a main configuration file, rbuild.conf, located in the rbuild project directory. This file tells it which packages to work on. It then reads a second configuration file, also called rbuild.conf, in the top level of each R package's CVS repository. This local configuration file contains the version number and other pieces of information, such as paths and parts of the package description file.

The rbuild.conf file is in XML format. It consists of nested information units called elements. An element looks like this:

<path role="tempdir">/tmp/R_src</path>

The values in angle brackets (the characters < and >) are tags, which separate portions of data. The start tag has a name (also known as a type), which is path in the example above, and any number of attributes, which are name-value pairs connected by equal signs. There is one attribute in the above example, role="tempdir". The end tag consists of only a slash character (/) and the element name. Between these tags is the content, which can be either text or other elements. Be careful when editing the configuration file to observe the XML well-formedness rules:

  • Every element must have both a start tag and an end tag.
  • Names in start and end tags must match exactly.
  • Element names are case-sensitive.
  • The characters <, >, and &, when they appear in text, must be entered as the escape codes &lt;, &gt;, and &amp; respectively.
    Note: Be sure to match up your escaped characters.
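
As a minimal sketch of these rules, the following Python reads the example element shown earlier and escapes special characters before placing text inside an element. It uses only the Python standard library and is not part of rbuild itself; the description text is a made-up value for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The example element from this guide.
element = ET.fromstring('<path role="tempdir">/tmp/R_src</path>')
name = element.tag            # element name (type)
role = element.get("role")    # value of the role attribute
content = element.text        # text between the start and end tags

# Hypothetical text containing all three special characters.
raw_description = "Models for x < y & y > z"

# escape() rewrites &, <, and > as &amp;, &lt;, and &gt;.
safe = escape(raw_description)

# The escaped text can now sit inside an element without breaking the XML.
description = ET.fromstring("<Description>" + safe + "</Description>")
```

Parsing the escaped element gives back the original text, which is a quick way to check that a hand-edited value is well formed.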

A local configuration file contains these element types (and attributes):

  • package (name), identifies the package you are configuring and contains the other elements in this configuration
  • option (role), sets an option for building, identified by the role attribute. For example: <option role="notify">foo@bar.edu</option> tells the program to send email notification of a build update to a user at the email address foo@bar.edu. Some additional options are:
    • CRANupload - Whether or not to upload package to CRAN (yes or no). Defaults to no.
    • bibtex - Whether or not to run bibtex against package documentation (yes or no). Defaults to yes.
    • latex2html - Whether or not to run latex2html to generate HTML documentation (yes or no). Defaults to yes.
    • versiodocs - Whether or not to create a subdirectory in docsdir for each version's documentation (yes or no). Defaults to no.
  • path (role), sets a path to be used by the system, identified by the role attribute. Some paths are:
    • buildsrc - Subdirectory in the CVS repository containing source files
    • linuxdir - Where to install Linux files
    • docsdir - Where to install documentation files
    • docsdir - Location of tex/Rnw files (relative to CVS root)
    • replace - Files to replace text in
    • movefromN - File (or directory) you want to move to a special location, with path relative to the top level of the module (N is a number)
    • movetoN - Destination for the above, which is the absolute path on the system (for example, /nfs/www/gking/zelig/index.shtml)
  • Author, used in the R package description file
  • Depends, used in the package description file
  • Description, used in the package description file
    Note: This field seems to need TABS after line breaks.
  • License, used in the package description file
  • Maintainer, used in the package description file
  • Title, used in the R package description file
  • URL, used in the package description file
  • Version, used in the R package description file, and used to trigger a new build
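
The element types above can be illustrated with a short Python sketch that reads a local configuration file. This is not how rbuild.pl itself parses the file. The sample configuration is hypothetical: it assumes that package is the outermost element, and the docsdir path and Title text are invented values.

```python
import xml.etree.ElementTree as ET

# Hypothetical local rbuild.conf; all values are made up for illustration.
SAMPLE_CONF = """\
<package name="zelig">
  <option role="notify">foo@bar.edu</option>
  <option role="CRANupload">no</option>
  <path role="docsdir">/nfs/www/example/docs</path>
  <Title>Example Title</Title>
  <Version>1.0.1</Version>
</package>
"""


def read_local_conf(xml_text):
    """Return (package name, options, paths, description fields)."""
    root = ET.fromstring(xml_text)
    # option and path elements are keyed by their role attribute.
    options = {e.get("role"): (e.text or "").strip() for e in root.findall("option")}
    paths = {e.get("role"): (e.text or "").strip() for e in root.findall("path")}
    # Every other child is treated as a package description field.
    fields = {
        e.tag: (e.text or "").strip()
        for e in root
        if e.tag not in ("option", "path")
    }
    return root.get("name"), options, paths, fields


name, options, paths, fields = read_local_conf(SAMPLE_CONF)
```

Here fields["Version"] holds the value that triggers a new build when it changes.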

Notification

Rbuild uses email notification to let interested parties know about new versions of packages as well as any problems with builds. Notification is generated based on elements in the package's rbuild.conf:

  • <option role="notify">foo@bar.edu</option>
    This will set an address (or a comma-separated list of addresses) that will receive updates of new versions, bug fixes, and some build errors.

In addition to notification of developers, rbuild also now supports automatic upload and notification to CRAN. If the package being built by rbuild is available on CRAN, developers can set the following option in the package's rbuild.conf:

<option role="CRANupload">yes</option>

This option (disabled by default) uploads the resulting package to the CRAN FTP server and notifies the CRAN project that there is a new version, saving the developer the trouble of doing so manually.

Temporary Files

All of the output from rbuild.pl goes into a temporary location, usually /tmp/rbuild_PID, where PID is the process identification number. After doing all the processing of a project, if there have been no errors, these files are then copied to their final locations.

After the files are copied, the temporary files are removed unless you use the -K (--Keep) option. Temporary files are cleaned up even in the case of a crash, because they may interfere with a fresh build.

In debug mode, which is used to test functionality, the files are never copied out of the temporary location. This allows you to go and inspect the output without having to overwrite the files on the web site.

If you suspect that the temporary files are still there and are interfering with your build, use the -C (--Clean) option. This performs a cleanup operation before doing anything else. It also cleans up the temporary files in debug mode, overriding the tendency to keep them there.

Lockfiles and Errors

As rbuild builds each package, it checks /nfs/projects/r/rbuild/lock for a lock file named PKG or building-PKG, where PKG is the package name (e.g. "zelig"). If it finds the lock file, it skips building that package. The build lockfile is removed at the end of every build, successful or otherwise.

In the event of a crash, the lockfile will not be removed. This prevents rbuild from running again and again, only to fail on the same error, seriously overloading the server. You will need to remove the lockfile to run it again.
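
The lock check described above can be sketched in Python as follows. This is an illustration of the naming rule only, not the code rbuild.pl runs; the function name is_locked is hypothetical.

```python
from pathlib import Path


def is_locked(lock_dir, package):
    """Return True if a lock file named PKG or building-PKG exists."""
    lock_dir = Path(lock_dir)
    candidates = (package, "building-" + package)
    return any((lock_dir / name).exists() for name in candidates)
```

For example, is_locked("/nfs/projects/r/rbuild/lock", "zelig") would report whether a leftover lock is the reason a package is being skipped.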

Updates and Forced Rebuilds

When rbuild runs, one of the first things it does is determine if the version number in the local config file has changed. Under normal circumstances, the build will only continue if it has changed.

There are two ways to force an update if the version has not changed. The first is to use the -f (or --force) option. This requires a tag name to be supplied as in:

rbuild.pl -f zelig_1-0-1 zelig

This command forces a build of Zelig version 1.0.1. If you want the latest version, substitute the word LAST for the version number:

rbuild.pl -f zelig_LAST zelig

The second way to trigger an update is to create a file in /nfs/projects/r/rbuild/lock named force-PACKAGE, where PACKAGE is the package name. If the file is empty, then the latest version is used. Otherwise, the content of the file is used as the CVS tag. For example, to force an update of the latest Zelig release:

touch /nfs/projects/r/rbuild/lock/force-zelig

To force an update of version 1.0.1 of Zelig:

echo "zelig_1-0-1" > /nfs/projects/r/rbuild/lock/force-zelig

Note: Creating a force file only starts the build if the project is configured for autobuild. If the project is not configured for autobuild, you must run rbuild from the command line.
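
The force-file rule can be sketched in Python as shown here. The function name forced_tag is hypothetical, and returning PACKAGE_LAST for an empty file is an assumption that mirrors the command-line form above; rbuild.pl may represent the latest version differently internally.

```python
from pathlib import Path


def forced_tag(lock_dir, package):
    """Return the CVS tag requested by force-PACKAGE, or None if no file exists."""
    force_file = Path(lock_dir) / ("force-" + package)
    if not force_file.exists():
        return None  # no forced build requested
    content = force_file.read_text().strip()
    # An empty file means "use the latest version" (assumed to map to PACKAGE_LAST).
    return content if content else package + "_LAST"
```

With the touch example above this returns zelig_LAST, and with the echo example it returns zelig_1-0-1.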

Bugfix Builds

Rbuild supports bugfix builds in addition to forced builds. A bugfix build also ignores the project lock files in /nfs/projects/r/rbuild/lock. Bugfix builds (like forced builds) are based on a CVS tag. In most cases, this tag is a CVS branch.

There are also two ways to initiate a bugfix build. The first is with the -b (or --bugfix) option. This requires a tag name to be supplied, as in the following:

rbuild.pl -b zelig_1-0-1b zelig

This command builds the CVS branch zelig_1-0-1b (presumably a bugfix of zelig_1-0-1). Unlike forced builds, you must use a valid CVS tag because the LAST keyword has no meaning for bugfix builds.

The second way to initiate a bugfix build is to create a file in /nfs/projects/r/rbuild/lock named bugfix-PKG, where PKG is the name of the package you want to build. This file must contain a valid CVS tag. The following example causes rbuild to build from the zelig_1-0-1b branch on its next run:

echo "zelig_1-0-1b" > /nfs/projects/r/rbuild/lock/bugfix-zelig

Note: Creating a bugfix file only starts the build if the project is configured for autobuild. If the project is not configured for autobuild, you must run rbuild from the command line.

Version Numbers

The version number used in the rbuild.conf file can be anything you like. Traditionally, we use a three-tiered number to indicate major, minor, and tiny changes, which might look like 5.6.1 or 5.6-1 and possibly includes letters to indicate very small changes.

CVS tags cannot contain dot characters ("."), so replace them with dashes ("-"). Version 5.6.1 is represented as a CVS tag by the string 5-6-1. We recommend the following pattern:

PACKAGE_MAJOR-MINOR-TINY

In this pattern, PACKAGE is the package name with initial cap, MAJOR is the major version number, MINOR is the minor version number, and TINY is the third tier version number.
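
The conversion from a version number to a tag in this pattern is mechanical, as the following Python sketch shows. The function name cvs_tag is hypothetical and the package name is used exactly as passed in.

```python
def cvs_tag(package, version):
    """Build a PACKAGE_MAJOR-MINOR-TINY tag, replacing dots with dashes."""
    return package + "_" + version.replace(".", "-")


tag = cvs_tag("zelig", "1.0.1")
```

The resulting tag matches the one used in the forced-build examples, zelig_1-0-1.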

Options

You can use the following command line options with rbuild.pl:

  • -c USER, --cvsuser USER
    Check out files as USER (default is you).
  • -C, --Clean
    Clean out temporary files before and after running.
  • -D, --Debug
    Debug mode, equivalent to combination of -f, -K, and -v.
  • -h, --help
    Print this usage message and quit.
  • -f TAG, --force TAG
    Force build of package. TAG is a CVS tag consisting of package name and version. If TAG contains LAST, then the latest version is used.
  • -b TAG, --bugfix TAG
    Initiate a bugfix build with CVS tag TAG.
  • -K, --Keep
    Keep build files in the temporary directory when done.
  • -s HOST, --srchost HOST
    Check out files from HOST (default is where).
  • -v, --verbose
    Output verbose messages.
  • -V, --Versions
    Print version numbers and quit.
  • -R, --ReplaceConfig
    Use specified config file.

Text Replacement

The rbuild.pl program can perform text replacement in selected files for you to update certain information, such as version number. The following element in the local config file tells rbuild to update the files zelig.tex and other.tex in the docs directory of the zelig project:

<path role="replace">docs/zelig.tex, docs/other.tex</path>

Inside these files, add a commented line just before the place to make the change. The line must contain the following pattern:

rbuild: replace 'A' 'B' C

In this pattern, A is the left delimiter, B is the right delimiter, and C is the type of information to insert. Everything between the left delimiter and the right delimiter (not inclusive) is thrown away and replaced with new information. Right now, the only valid value of C is version.

For example, consider this line from a TeX file:

\dateVersion 1.0\\ \today

To replace 1.0 with whatever the current version number is, add this line right before it:

% rbuild: replace 'Version ' '\\' version
\dateVersion 1.0\\ \today

Now rbuild automatically alters the file with that information before it builds the documentation.

Note: Rbuild does not commit the new file to the CVS repository. Thus, the line always appears as it does above.
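
The replacement rule can be sketched in Python as follows. This is an illustration of the delimiter logic described in this section, not the code in rbuild.pl; the function name apply_replacements and the regular expression are assumptions, and the sketch assumes the line to change directly follows the marker line.

```python
import re

# Matches: rbuild: replace 'A' 'B' C
MARKER = re.compile(r"rbuild:\s*replace\s+'([^']*)'\s+'([^']*)'\s+(\w+)")


def apply_replacements(text, values):
    """Replace the text between delimiters on the line after each marker."""
    lines = text.split("\n")
    for i in range(len(lines) - 1):
        match = MARKER.search(lines[i])
        if not match:
            continue
        left, right, kind = match.groups()
        if kind not in values:
            continue  # unknown information type; leave the line alone
        target = lines[i + 1]
        start = target.find(left)
        if start == -1:
            continue
        start += len(left)                # keep the left delimiter
        end = target.find(right, start)   # right delimiter is kept too
        if end == -1:
            continue
        lines[i + 1] = target[:start] + values[kind] + target[end:]
    return "\n".join(lines)


sample = "\n".join([
    r"% rbuild: replace 'Version ' '\\' version",
    r"\dateVersion 1.0\\ \today",
])
updated = apply_replacements(sample, {"version": "1.0.1"})
```

Applied to the example from this section with version 1.0.1, the marker line is left unchanged and only the 1.0 between the two delimiters is rewritten.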

IQSS