Welcome to the Free Software contributions diary of Loïc Dachary. Although the posts look like blog entries, they really are technical reports about the work done during the day. They are meant to be used as a reference by co-developers and managers.

An example of controlled technical debt

When I started helping with Ceph backports, I was not familiar with the workflow (who does what, when and why) or the conventions (referencing commits from redmine issues, the redmine backport field, …). I felt the need for scripts to cross-reference information from git, GitHub and redmine and consolidate it into an inventory that I could use as a central point to measure progress and find what needed to be done. I was not able to formulate this in so many words, and at the beginning it was little more than a vague feeling that I would quickly be lost if I did not write down my findings. I chose to write a script, with no tests and no structure, to do things like matching a pull request with a redmine issue when the only clue was a Fixes: #XXX embedded in the comment of one of the commits.
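The kind of matching described above can be sketched in a few lines. This is a minimal illustration and not the original script: the function name issues_fixed_by and the exact pattern are assumptions, and it only looks at commit messages that are already available as strings.

```python
import re

# Matches lines such as "Fixes: #1234" anywhere in a commit message.
# The exact convention (case, spacing) is an assumption.
FIXES_RE = re.compile(r'^\s*Fixes:\s*#(\d+)', re.MULTILINE | re.IGNORECASE)

def issues_fixed_by(commit_messages):
    """Return the set of redmine issue numbers referenced by the
    commit messages of a pull request."""
    issues = set()
    for message in commit_messages:
        issues.update(int(number) for number in FIXES_RE.findall(message))
    return issues
```

Given the messages of all the commits in a pull request, the result can then be compared with the issue numbers known to redmine.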

After a few weeks the script had grown into a 500-line monstrosity, extremely useful and quite impossible to maintain in the long run. My excuse was that I had no clue what I needed to begin with and that I could not have understood the backport workflow without this script. After the first backport release was declared ready, I stopped adding features and restarted from scratch, which led to what became the ceph-workbench backport subcommand.

This refactor was done without modifying the behavior of the original script (there were only a few occurrences where it was impossible to preserve). The architecture, however, is completely new: the original script was a nearly linear sequence of operations using only global variables. In short, the new script pulls information from a few sources (one class for redmine, one for gitlab, one for git), cross-references it with ad hoc methods and renders the result as rdoc pages to be displayed in the wiki.

Writing unit tests helped to proceed incrementally, pulling one code snippet after the other and checking that it was not broken by the refactor. Instead of unit testing the top-level command, integration tests were written and run via tox, using real gitlab and redmine instances running in docker containers as fixtures. This will help when adding new use cases such as scraping the ceph-qa mailing list to match teuthology job failures with the corresponding redmine issue, or interpreting the Backport: field in commit messages.

Posted in Uncategorized | Leave a comment

How was a cherry-pick conflict resolved?

When a git cherry-pick fails because of a conflict, it can be resolved and committed. The reviewer is reminded that a conflict had to be resolved by the Conflicts section at the end of the message body:

commit 7b8e5c99a4a40ae788ad29e36b0d714f529b12eb
Author: John Spray 
Date:   Tue May 20 16:25:19 2014 +0100
...
    Signed-off-by: John Spray 
    (cherry picked from commit 1d9e4ac2e2bedfd40ee2d91a4a6098150af9b5df)
    Conflicts:
    	src/crush/CrushWrapper.h

The difference between the original commit and the cherry-picked commit including the conflict resolution can be displayed with:

commit=7b8e5c99a4a40ae788ad29e36b0d714f529b12eb
picked_from=$(git show --no-patch --pretty=%b $commit  |
  perl -ne 'print if(s/.*cherry picked from commit (\w+).*/$1/)')
diff -u --ignore-matching-lines '^[^+-]' \
   <(git show $picked_from) <(git show $commit)
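For scripts that need the same information, the extraction done by the perl one-liner can be sketched in Python. The function name cherry_pick_info is hypothetical, and the sketch assumes the message body follows the layout shown above (a "cherry picked from commit" line, optionally followed by a Conflicts: section listing one file per line).

```python
import re

def cherry_pick_info(body):
    """Return (picked_from, conflicts) parsed from a commit message body.

    picked_from is the hash of the original commit or None, and
    conflicts is the list of files found in the Conflicts: section.
    """
    match = re.search(r'\(cherry picked from commit ([0-9a-f]{7,40})\)', body)
    picked_from = match.group(1) if match else None
    conflicts = []
    in_conflicts = False
    for line in body.splitlines():
        stripped = line.strip()
        if stripped == 'Conflicts:':
            in_conflicts = True
        elif in_conflicts and stripped:
            conflicts.append(stripped)
        elif in_conflicts:
            break  # a blank line ends the Conflicts: section
    return picked_from, conflicts
```

The body would typically come from git show --no-patch --pretty=%b, as in the shell version.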

Posted in git | Leave a comment

Script to enable redmine REST API

When redmine is installed in a container (as a test fixture for instance) with

$ docker run --name=redmine -d -p 10080:80 \
  -v $(pwd)/data/redmine:/home/redmine/data \
  -v /var/run/docker.sock:/run/docker.sock \
  -v $(which docker):/bin/docker  sameersbn/redmine:2.6.1

the following script enables the REST API when run as redmine-enable-rest-api.py http://localhost:10080 admin admin:

import sys
import re
import requests

def params(page):
    (csrf_token,)=re.findall(r'meta content="(.*?)" name="csrf-token"', page.text)
    (csrf_param,)=re.findall(r'meta content="(.*?)" name="csrf-param"', page.text)
    return {csrf_param:csrf_token}

def enable_rest_api(url, user, password):
    s = requests.Session()

    p = params(s.get(url + '/login'))
    p.update({"username":user, "password":password})
    s.post(url + '/login', params=p)

    p = params(s.get(url + '/settings?tab=authentication'))
    p.update({'settings[rest_api_enabled]':'1'})
    s.post(url + '/settings/edit?tab=authentication', params=p)

enable_rest_api(*sys.argv[1:])
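
The regular expressions in params assume that the content attribute comes before name in each meta tag. If that ordering cannot be relied upon, the tokens could be collected with the standard library HTML parser instead. The following is only a sketch: the names csrf_params and _MetaCollector are hypothetical, and it takes the page text rather than the response object.

```python
from html.parser import HTMLParser

class _MetaCollector(HTMLParser):
    """Collect the content of every <meta name=... content=...> tag."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == 'meta':
            attrs = dict(attrs)
            if 'name' in attrs and 'content' in attrs:
                self.meta[attrs['name']] = attrs['content']

def csrf_params(html):
    """Return {csrf_param: csrf_token} regardless of attribute order."""
    collector = _MetaCollector()
    collector.feed(html)
    return {collector.meta['csrf-param']: collector.meta['csrf-token']}
```

With this helper, params(page) could be replaced by csrf_params(page.text).
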
Posted in redmine | Leave a comment

How to display the commits from a merged branch?

A branch gmock was proposed as pull request 483 on GitHub, accepted, merged into master and deleted. It had two commits:

  • bf05ec1 tests: replace existing gtest 1.5.0
  • 5cbe0c5 gmock: use Google C++ Mocking

In GitHub, the reference pull/483/head is preserved and points to the last commit of the branch that no longer exists.

    master +
           |
    849afe + merged and deleted branch gmock
           |\
    d5cb91 + \
           |  + bf05ec1 tests: replace existing gtest 1.5.0 (pull/483/head)
    5301b2 +  |
           |  + 5cbe0c5 gmock: use Google C++ Mocking
           | /
           |/
    5a6549 +
           |
           .
           .
           .

To list the commits that were merged, we need to find the commit in master that immediately follows the point where the former branch started.

base=$(git rev-list --topo-order master ^pull/483/head | tail -1)

The base variable contains 5301b2 which is the first commit of master that is not reachable from pull/483/head, in topological order instead of the default chronological order.

$ git rev-list --oneline $base^..pull/483/head
bf05ec1 tests: replace existing gtest 1.5.0
5cbe0c5 gmock: use Google C++ Mocking

This displays the commits of the former gmock branch. Note the ^ after $base, which means the first parent of $base. If the graph is as shown above, it makes no difference. But if it is as follows:

    master +
           |
    849afe + merged and deleted branch gmock
           |\
           | \
           |  + bf05ec1 tests: replace existing gtest 1.5.0 (pull/483/head)
           |  |
           |  + 5cbe0c5 gmock: use Google C++ Mocking
           | /
           |/
    5a6549 +
           |
           .
           .
           .

Then $base would be 849afe and git rev-list --oneline $base..pull/483/head (without the ^) would display nothing because pull/483/head is reachable from $base. Since $base^ is the first parent (i.e. the left parent on the graph above), it is 5a6549 and we get the desired result.
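
The reasoning can be checked on a toy model of the second graph. The sketch below does not call git: it represents the history as a dictionary of parents and mimics what A..B means (commits reachable from B but not from A). The helper names reachable and rev_range are made up for the illustration.

```python
def reachable(parents, tip):
    """Return the set of commits reachable from tip, tip included."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

def rev_range(parents, exclude, include):
    """Mimic exclude..include: reachable from include but not from exclude."""
    return reachable(parents, include) - reachable(parents, exclude)

# The second graph: the merge commit 849afe directly follows 5a6549.
# The first parent of a merge is listed first.
parents = {
    '849afe': ['5a6549', 'bf05ec1'],
    'bf05ec1': ['5cbe0c5'],
    '5cbe0c5': ['5a6549'],
    '5a6549': [],
}
base = '849afe'
first_parent = parents[base][0]

without_caret = rev_range(parents, base, 'bf05ec1')
with_caret = rev_range(parents, first_parent, 'bf05ec1')
```

Here without_caret is empty because the head of the branch is reachable from the merge commit, while with_caret contains the two commits of the former branch.
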

Posted in git | Leave a comment

retrieve github pull requests in JSON

The following Python function returns a map associating each pull request number to its JSON description for the given repo. The OAuth token is needed so that GitHub allows more requests to be processed during a given time frame. The result is cached in a file and refreshed every 24 hours.

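import json
import os
import re
import time
import urllib.request

def get_pull_requests(repo, token):
    # https://developer.github.com/v3/pulls/#list-pull-requests
    pulls_file = "/tmp/pulls.json"
    if (not os.path.exists(pulls_file) or
            time.time() - os.stat(pulls_file).st_mtime > 24 * 60 * 60):
        pulls = {}
        url = "https://api.github.com/repos/" + repo + "/pulls?state=all"
        while url:
            # the token is sent in a header: GitHub no longer accepts
            # the access_token query parameter
            github = urllib.request.Request(
                url=url, headers={"Authorization": "token " + token})
            with urllib.request.urlopen(github) as f:
                for pull in json.loads(f.read().decode("utf-8")):
                    pulls[pull['number']] = pull
                # the Link header is absent when there is a single page
                links = f.headers.get('Link', '')
            # follow the rel="next" link, if any, to get the next page
            url = None
            for link in links.split(','):
                if 'rel="next"' in link:
                    m = re.search('<(.*)>', link)
                    if m:
                        url = m.group(1)
        with open(pulls_file, 'w') as f:
            json.dump(pulls, f)
    else:
        with open(pulls_file, 'r') as f:
            # JSON object keys are strings: convert them back to numbers
            pulls = {int(number): pull
                     for number, pull in json.load(f).items()}
    return pulls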

For instance

pulls = get_pull_requests('ceph/ceph', '64933d355fda984108b4aad2c5cd4c4a224aad')

The same pagination logic applies to all API calls (see Web Linking RFC 5988 for more information) and parsing could use the LinkHeader module instead of rudimentary regexp parsing.
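
As an intermediate step between the regexp above and a dedicated module, the whole header can be turned into a map from relation to URL using only the standard library. The helper parse_link_header is hypothetical and is not the LinkHeader module. It assumes that rel is the first parameter after each URL, which is the layout GitHub uses, and it does not cover every case allowed by the RFC.

```python
import re

# One link looks like: <https://...>; rel="next"
LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')

def parse_link_header(header):
    """Return a dict mapping each rel value to its URL."""
    return {rel: url for url, rel in LINK_RE.findall(header or '')}
```

The next page, if any, is then parse_link_header(header).get('next').
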

Posted in Uncategorized | Leave a comment

gf-complete test coverage report

The make coverage target is added to gf-complete to create an lcov report while running the tests with make check.

The full report is archived.

Posted in jerasure | Leave a comment

Teuthology docker targets hack (4/4)

The teuthology container hack is completed by adding a flag to retrieve packages from a user-specified repository instead of gitbuilder.ceph.com. The user can build packages from sources and run a job, which will implicitly save a docker image with the packages installed. The second time the same job is run, it goes faster because it reuses the image. For instance the following job:

machine_type: container
os_type: ubuntu
os_version: "14.04"
suite_path: /home/loic/software/ceph/ceph-qa-suite
roles:
- - mon.a
  - osd.0
  - osd.1
  - client.0
overrides:
  install:
    ceph:
      branch: master
  ceph:
    wait-for-scrub: false
tasks:
- install:
    repository_url: http://172.17.42.1/trusty
- ceph:

runs in under one minute:

{duration: 47.98, flavor: basic, owner: loic@dachary.org, success: true}
Posted in ceph, docker | 3 Comments

Building Ceph Debian GNU/Linux packages

The following script explains how to create Debian GNU/Linux packages for Ceph from a clone of the sources.

releasedir=/tmp/release
rm -fr $releasedir
mkdir -p $releasedir
#
# remove all files not under git so they are not
# included in the distribution.
#
git clean -dxf
#
# git describe provides a version that is
# a) human readable
# b) unique for each commit
# c) compares higher than any previous commit
# d) contains the short hash of the commit
#
vers=`git describe --match "v*" | sed s/^v//`
#
# creating the distribution tarball requires some configure
# options (otherwise parts of the source tree will be left out).
#
./autogen.sh
./configure --with-rocksdb --with-ocf --with-rest-bench \
    --with-nss --with-debug --enable-cephfs-java \
    --with-lttng --with-babeltrace
#
# use distdir= to set the name of the top level directory of the
# tarball to match the desired version
#
make distdir=ceph-$vers dist
#
# rename the tarball to match debian conventions and extract it
#
mv ceph-$vers.tar.gz $releasedir/ceph_$vers.orig.tar.gz
tar -C $releasedir -zxf $releasedir/ceph_$vers.orig.tar.gz
#
# copy the debian directory over and remove -dbg packages
# because they are large and take time to build
#
cp -a debian $releasedir/ceph-$vers/debian
cd $releasedir
perl -ni -e 'print if(!(/^Package: .*-dbg$/../^$/))' ceph-$vers/debian/control
perl -pi -e 's/--dbg-package.*//' ceph-$vers/debian/rules
#
# always set the debian version to 1 which is ok because the debian
# directory is included in the sources and the upstream version will
# change each time it is modified.
#
dvers="$vers-1"
#
# update the changelog to match the desired version
#
cd ceph-$vers
chvers=`head -1 debian/changelog | perl -ne 's/.*\(//; s/\).*//; print'`
if [ "$chvers" != "$dvers" ]; then
   DEBEMAIL="contact@ceph.com" dch -b -v "$dvers" "new version"
fi
#
# create the packages
# a) with ccache to speed things up when building repeatedly
# b) do not sign the packages
# c) use half of the available processors
#
PATH=/usr/lib/ccache:$PATH dpkg-buildpackage -j$(($(nproc) / 2)) -uc -us
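
Three of the text manipulations in the script (deriving the version from git describe, reading the version from the first changelog line and dropping the -dbg stanzas from debian/control) can be sketched in Python to make their intent explicit. The function names and the sample version string used in the comments are illustrative assumptions, and the functions work on strings instead of modifying files in place.

```python
import re

def upstream_version(describe):
    """Strip the leading v from git describe output,
    e.g. v0.87-73-g70a5569 becomes 0.87-73-g70a5569."""
    return re.sub(r'^v', '', describe.strip())

def changelog_version(first_line):
    """Return the version between parentheses in the first line of
    debian/changelog, or None if there is none."""
    match = re.search(r'\(([^)]*)\)', first_line)
    return match.group(1) if match else None

def drop_dbg_stanzas(control):
    """Remove the stanzas of debian/control describing -dbg packages.
    Stanzas are assumed to be separated by exactly one empty line."""
    stanzas = control.split('\n\n')
    kept = [stanza for stanza in stanzas
            if not re.search(r'^Package: .*-dbg$', stanza, re.MULTILINE)]
    return '\n\n'.join(kept)
```

The Debian version used by the script is then upstream_version(...) + "-1", and the changelog only needs updating when it differs from changelog_version(...).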

Posted in ceph, debian | 1 Comment

jerasure.org installation notes

The jerasure.org site is set up to host the upstream repositories for the GF-complete and jerasure libraries. Contributors may sign up or reuse their existing GitHub account. A companion continuous integration server runs make check on each merge request.

Posted in gitlab, jerasure | Leave a comment

Customizing the gitlab home page

The customization of the GitLab home page is a proprietary extension that is not available in the Free Software version. When running GitLab from docker containers, the home page template needs to be moved to a file that won’t go away with the container:

$ layouts=/home/git/gitlab/app/views/layouts/
$ docker exec gitlab mkdir -p /home/git/data/$layouts
$ docker exec gitlab mv $layouts/devise.html.haml /home/git/data/$layouts
$ docker exec gitlab ln -s /home/git/data/$layouts/devise.html.haml \
   $layouts/devise.html.haml

The template can now be modified in /opt/gitlab/data/home/git/gitlab/app/views/layouts/ from the host running the container. It is a HAML template which can contain raw HTML as long as proper indentation is respected.

Posted in gitlab | 1 Comment