HOWTO extract a stack trace from teuthology (take 1)

When a teuthology test suite fails on Ceph, the failure shows up in pulpito. For instance, there is one failure in the monthrash test suite, listed with details and a link to the logs. Removing the teuthology.log part of that link gives a directory listing of everything that was archived for this run.
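The link manipulation can be sketched in shell as follows. The function name archive_url_of and the example URL are placeholders introduced for illustration; they are not taken from pulpito.

```shell
# Strip the trailing "teuthology.log" from a log link to get the
# directory that holds everything archived for the run.
# archive_url_of is a hypothetical helper name.
archive_url_of() {
    printf '%s\n' "${1%teuthology.log}"
}

# Usage, with a placeholder URL:
#   archive_url_of "http://logs.example.com/some-run/12345/teuthology.log"
```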
For the monthrash failure, the logs show:

client.0.plana34.stderr:+ ceph_test_rados_api_io
client.0.plana34.stdout:Running main() from gtest_main.cc
client.0.plana34.stdout:[==========] Running 43 tests from 4 test cases.
client.0.plana34.stdout:[----------] Global test environment set-up.
client.0.plana34.stdout:[----------] 11 tests from LibRadosIo
client.0.plana34.stdout:[ RUN      ] LibRadosIo.SimpleWrite
client.0.plana34.stdout:[       OK ] LibRadosIo.SimpleWrite (1509 ms)
client.0.plana34.stdout:[ RUN      ] LibRadosIo.ReadTimeout
client.0.plana34.stderr:Segmentation fault (core dumped)

This shows that ceph_test_rados_api_io was running on the plana34 machine when it hit a segmentation fault and dumped core. The remote/plana34/coredump subdirectory of the archive contains the corresponding core dump.
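When the log is long, the machine name can be pulled out mechanically. The sketch below assumes the log has been saved locally and that crash lines carry a prefix ending in host.stderr: as in the excerpt above. The function name crashed_hosts is made up for this example, and a fully qualified host name would be reduced to its last component.

```shell
# Print the host of every "<role>.<n>.<host>.stderr:" line that
# reports a core dump, one host per line, without duplicates.
# $1: path to a local copy of teuthology.log
crashed_hosts() {
    sed -n 's/^.*\.\([^.:]*\)\.stderr:.*core dumped.*$/\1/p' "$1" | sort -u
}

# Usage:
#   crashed_hosts teuthology.log
```

Each host printed this way should have a matching remote/HOST/coredump subdirectory in the archive, as plana34 does here.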
The teuthology logs show the repository from which the binary was downloaded (it was produced by gitbuilder).

echo deb http://gitbuilder.ceph.com/ceph-deb-precise-x86_64-basic/sha1/f5c1d3b6988bae5ffb914d2ac0b2858caeffe12c precise main | sudo tee /etc/apt/sources.list.d/ceph.list

Running this line on a 64-bit Ubuntu precise 12.04 machine, as suggested by the precise-x86_64 part of the repository name, makes the corresponding binary packages available once the package index has been refreshed with apt-get update. It is also possible to download them directly from the pool/main/c/ceph subdirectory. The packages suffixed with -dbg retain the debug symbols that gdb needs to display an informative stack trace.
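For the direct download, the pool location can be derived from the repository URL. This is a small sketch that assumes the pool/main/c/ceph subdirectory sits directly under the repository URL shown in the log. The helper name pool_url is hypothetical.

```shell
# Build the URL of the package pool from the repository URL
# found in teuthology.log (a trailing slash is tolerated).
# pool_url is a hypothetical helper name.
pool_url() {
    printf '%s/pool/main/c/ceph/\n' "${1%/}"
}

# Usage, assuming the packages are still published at that location:
#   repo="http://gitbuilder.ceph.com/ceph-deb-precise-x86_64-basic/sha1/f5c1d3b6988bae5ffb914d2ac0b2858caeffe12c"
#   wget "$(pool_url "$repo")ceph-test_0.85-726-gf5c1d3b-1precise_amd64.deb"
```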
The ceph_test_rados_api_io binary is part of the ceph-test package and can be extracted with

$ dpkg --fsys-tarfile ceph-test_0.85-726-gf5c1d3b-1precise_amd64.deb | \
  tar xOf -  ./usr/bin/ceph_test_rados_api_io \
  > ceph_test_rados_api_io

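and the stack trace is displayed with gdb. The session below loads the binary from /usr/bin, where the ceph-test package installs it; the path of the extracted copy can be given instead.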

$ gdb /usr/bin/ceph_test_rados_api_io 1411176209.8835.core
(gdb) bt
#0  0x00007f541b95750a in pthread_rwlock_wrlock () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f541bd41341 in RWLock::get_write(bool) () from /usr/lib/librados.so.2
#2  0x00007f541bd2bbc9 in Objecter::op_cancel(Objecter::OSDSession*, unsigned long, int) () from /usr/lib/librados.so.2
#3  0x00007f541bcf1349 in Context::complete(int) () from /usr/lib/librados.so.2
#4  0x00007f541bdad5ea in RWTimer::timer_thread() () from /usr/lib/librados.so.2
#5  0x00007f541bdb149d in RWTimerThread::entry() () from /usr/lib/librados.so.2
#6  0x00007f541b953e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#7  0x00007f541b16a3fd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#8  0x0000000000000000 in ?? ()
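
The same backtrace can be obtained without the interactive prompt, which is convenient when several core files have to be examined. The sketch below relies on the -batch and -ex options of gdb. The function name core_backtrace and the GDB override variable are introduced here for illustration only.

```shell
# Print a backtrace of every thread and exit, without entering
# the interactive gdb prompt.
# $1: path to the binary, $2: path to the core file.
# GDB (hypothetical override) selects another gdb executable.
core_backtrace() {
    "${GDB:-gdb}" -batch -ex 'thread apply all bt' "$1" "$2"
}

# Usage:
#   core_backtrace ./ceph_test_rados_api_io 1411176209.8835.core
```

Using thread apply all bt instead of a plain bt prints the stack of every thread, not only the one that crashed, which may help when the faulting thread was waiting on a lock held elsewhere.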