Friday, 9 July 2010

Building src.rpm from git/svn

Here are some scripts which I use to produce a src.rpm from a git tree.


In accordance with rpm philosophy, a pristine source tar.gz is produced, with a series of patches up to the current release. The svn backend is less developed, as I am not as familiar with svn as I am with git, but it works well enough for my use with the openchange project.

The general usage is:
make-srpm [options] project.spec [...]
and this should be invoked from within or below the git/svn checkout directory.


Note: make-srpm will cd to the top-level git/svn directory before working. Multiple .spec files can be specified, and these can be absolute paths or relative to the current directory (not the top level directory).


make-srpm will generate a new .spec file based on the one you provide, and will then use this spec file, along with the specified environment, to build the binary RPMS.


make-srpm creates a folder called dist-export in the top level git directory.


Into this folder it will export the pristine source as specified by the RELEASE variable, along with any other files in the $EXTRA_SOURCE directory (relative to the spec file).


Numbered patches are then produced, starting after the highest patch already in the spec file.
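As a rough illustration of the patch numbering (a sketch, not the actual make-srpm code), the next free patch number could be found by scanning the spec file for PatchN: tags. The helper name next_patch_number is hypothetical, and starting at 0 when the spec has no patches is an assumption.

```shell
# Hypothetical helper: print the number the next generated patch could use,
# i.e. one more than the highest "PatchN:" tag in the spec file.
# Assumption: numbering starts at 0 when the spec has no Patch tags.
next_patch_number() {
    awk '
        BEGIN { highest = -1 }
        /^Patch[0-9]*:/ {
            n = $1
            sub(/^Patch/, "", n)   # drop the tag name
            sub(/:.*/, "", n)      # drop the colon and anything after it
            # A bare "Patch:" leaves n empty, which counts as 0.
            if (n + 0 > highest) highest = n + 0
        }
        END { print highest + 1 }
    ' "$1"
}
```

Called as next_patch_number project.spec, this prints a single number that the first generated patch could take.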


dist-export then contains everything required, and the src.rpm is built using rpmbuild --nodeps

Options and Macros

The operation of make-srpm is generally controlled by shell variables, which may be set explicitly in the environment before calling make-srpm, or specified as command arguments in makefile fashion, e.g.
make-srpm ORIGIN_PATTERN="*alpha*" project.spec
or from within the specfile from specially formed comments.
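To show what makefile-fashion arguments amount to, here is a minimal sketch of splitting the command line into NAME=value settings and spec files. The function name parse_args and the SPEC_FILES variable are hypothetical; the real script may well do this differently.

```shell
# Hypothetical helper: export NAME=value arguments into the environment and
# collect everything else (the spec files) in SPEC_FILES, one per line.
parse_args() {
    SPEC_FILES=
    for arg in "$@"; do
        case "$arg" in
            *=*) name=${arg%%=*} ;;   # text before the first "="
            *)   name= ;;
        esac
        case "$name" in
            ''|[0-9]*|*[!A-Za-z0-9_]*)
                # Not a valid shell variable name: treat the argument
                # as a spec file path.
                SPEC_FILES="${SPEC_FILES}${arg}
"
                ;;
            *)
                export "$name=${arg#*=}"
                ;;
        esac
    done
}
```

After parse_args ORIGIN_PATTERN="*alpha*" project.spec, the variable ORIGIN_PATTERN holds the glob and SPEC_FILES holds project.spec.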


The Name: and Version: fields are always extracted and stored in the environment variables Name and Version.

The following comments are also searched:

#make-srpm-SOURCE_PATHS:
#make-srpm-ORIGIN:
#make-srpm-ORIGIN_PATTERN:
#make-srpm-RELEASE:
#make-srpm-PATCH_LEVEL:


to set environment variables named without the lower case prefix. The meaning of these is explained below:

SOURCE_PATHS - this is a space-separated list of directories relative to the top level git folder, and specifies the folders to be exported. Some git trees contain source for more than one project and so it can be convenient to reduce the size of the source tar.gz by putting a line like this in the sub-project spec file:
#make-srpm-SOURCE_PATHS: sub_project_dir includes
ORIGIN - this is a git reference to the commit that represents the pristine source and should probably be the most recent tag before the git rebase point. This can be a tag or a commit hash.

ORIGIN_PATTERN - more useful than ORIGIN, this can be a git tag glob so that make-srpm can select the current ORIGIN automatically. It can be convenient to have a line in a spec file like:
#make-srpm-ORIGIN_PATTERN=*release*
RELEASE - is a git reference to the release that you want to build; thus patches will be emitted from ORIGIN to RELEASE. This can be a tag or a commit hash, or the value HEAD (meaning whatever is currently checked out), or the special value LOCAL which means that uncommitted changes are also included.

PATCH_LEVEL - indicates the value of the -p argument that will need passing to patch and depends on what level of subdirectory your source sits at from the top level git directory. As the value is not likely to change for a project, it can be convenient to add a line like this to a spec file:
#make-srpm-PATCH_LEVEL=2
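The special comments could be pulled out of a spec file with something like the following sketch. The helper name spec_settings is hypothetical, and it accepts both the "NAME: value" and "NAME=value" spellings only because both appear in the examples above; I am not claiming this is how make-srpm itself parses them.

```shell
# Hypothetical helper: print NAME=value for every "#make-srpm-NAME" comment
# found in the given spec file. The separator may be ":" or "=".
spec_settings() {
    sed -n 's/^#make-srpm-\([A-Z_][A-Z_]*\)[:=][[:space:]]*\(.*\)$/\1=\2/p' "$1"
}
```

Running spec_settings project.spec on a file containing the PATCH_LEVEL line above would print PATCH_LEVEL=2, which a caller could then export.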
The new spec file will also contain some convenient macro definitions at the top of the file, like:
%define makesrpm_tarname samba-release-4-0-0alpha7.tar.gz
%define makesrpm_tarprefix samba-release-4-0-0alpha7
which you can use as a basis for your own macro definitions or package fields.


make-srpm will also allow you to pass your own spec file macro definitions. Macros may be literal macros, or interpolated macros which are defined in terms of values calculated during execution. This is similar to defining environment variables, except that define_ or _define_ is prefixed to the name.

e.g.
make-srpm define_project_builder=sam@liddicott.com \
          _define_project_version='$VERSION' \
          CFLAGS="-O2" \
          project.spec
will replace the macro definitions project_version and project_builder if they exist, or define them at the top of the spec file, like this:
%define project_version 4
%define project_builder sam@liddicott.com
Note that project_version was specified as an interpolated macro by being prefixed with _define_; literal macros are prefixed with define_, without the leading underscore.
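The replace-or-define behaviour can be sketched in a few lines of shell and awk. The names set_spec_macro and expand_value are hypothetical, and using eval for interpolation is an assumption about how a value such as '$VERSION' might be expanded; it is only safe with trusted input.

```shell
# Hypothetical helper: read a spec on stdin and write it to stdout with
# "%define NAME VALUE" replacing any existing definition of NAME, or added
# at the top of the file if NAME was not defined.
set_spec_macro() {
    MACRO_NAME=$1 MACRO_VALUE=$2 awk '
        BEGIN { name = ENVIRON["MACRO_NAME"]; value = ENVIRON["MACRO_VALUE"] }
        $1 == "%define" && $2 == name {
            lines[++n] = "%define " name " " value
            found = 1
            next
        }
        { lines[++n] = $0 }
        END {
            if (!found) print "%define " name " " value
            for (i = 1; i <= n; i++) print lines[i]
        }
    '
}

# Hypothetical helper for interpolated macros: expand shell variables in the
# value at run time. Assumes trusted input without embedded double quotes.
expand_value() {
    eval "printf '%s\n' \"$1\""
}
```

A literal macro would pass its value straight to set_spec_macro, while an interpolated one would pass it through expand_value first, so that '$VERSION' becomes the calculated version.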

Variables suitable for interpolated definitions are:

LOCAL - set to 1 if un-committed changes are included

NAME - specified name, defaults to that found in spec file

VERSION - specified version, defaults to that found in spec file

RELEASE - user supplied reference to git commit to build

RELEASE_COMMIT - git hash of RELEASE

RELEASE_TAG - latest tag leading up to RELEASE commit

SPEC_RELEASE - part of the output of git describe or VERSION_INFO, useful for the Release: field in the spec file

ORIGIN_COMMIT - git hash of commit being exported to tar.gz

VERSION_INFO - output of: git show --pretty=format:"%h %ct %H %cd" $RELEASE_COMMIT

TAR_NAME - name of tar file

TAR_PREFIX - path prefix to files in tar.gz; probably cd %TAR_PREFIX is needed in .spec file


So the action of make-srpm may be configured in whichever way is convenient: either by special comments in the spec file, or with makefile-style name=value arguments from a makefile or other build script.

Implementation Notes

git rev-parse is used to sanitize external git references in this manner:
RELEASE_COMMIT="$(git rev-parse "$RELEASE^{commit}")"

awk

awk is used in some cases where it gives an enormous speed improvement.

git_linearize is implemented in awk and solves the problem that the output of git format-patch A..B will often fail when applied to A if there have been merges. git_linearize will instead take the longest path from A..B and then perform git diff between successive points in the path.
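The longest-path idea can be sketched as follows. This is not the real git_linearize, just an illustration under some assumptions: the input is "commit parent..." lines, oldest first, such as git rev-list --parents --topo-order --reverse A..B produces, and A and B are given as full hashes. The function name longest_path is hypothetical.

```shell
# Hypothetical sketch: read "commit parent..." lines on stdin (parents listed
# before their children) and print the longest chain of commits from the
# origin ($1) to the release ($2), one per line. Fails if no chain exists.
longest_path() {
    awk -v origin="$1" -v release="$2" '
        BEGIN { dist[origin] = 0 }
        {
            # Longest distance from origin to this commit is one more than
            # the best distance among parents that are reachable from origin.
            best = -1
            for (i = 2; i <= NF; i++) {
                if (($i in dist) && dist[$i] > best) {
                    best = dist[$i]
                    prev[$1] = $i
                }
            }
            if (best >= 0) dist[$1] = best + 1
        }
        END {
            if (!(release in dist)) exit 1
            # Walk back from release to origin, then print oldest first.
            n = 0
            for (c = release; c != origin; c = prev[c]) path[n++] = c
            print origin
            for (i = n - 1; i >= 0; i--) print path[i]
        }
    '
}
```

Each adjacent pair of lines in the output could then be passed to git diff to produce one patch per step.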

git_fix_empty_files is implemented in awk. It processes all the outputs of git diff in order to cope with patches generated by git diff that patch cannot handle; including addition or removal of empty files and meta-data changes. Some of the metadata changes probably ought to generate shell commands to be executed once the patches have been applied.

git_check_empty_patch used to be grep '^' which returns non-zero on empty files while passing through the entire file. Because some commits only have meta-data changes, we need to detect files without any hunks while still passing the entire output, so awk is used.
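A minimal sketch of that check, assuming unified diffs whose hunk headers begin with "@@ " (the function name check_empty_patch is hypothetical and this is not the actual git_check_empty_patch):

```shell
# Hypothetical sketch: copy the patch from stdin to stdout unchanged, and
# return non-zero if it contained no hunks (e.g. only mode changes).
check_empty_patch() {
    awk '
        /^@@ / { hunks++ }
        { print }
        END { exit (hunks ? 0 : 1) }
    '
}
```

Unlike grep '^', this still fails for a patch that has header lines but no hunks, while the full text is passed on either way.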