Monthly Archives: July 2013

How does Ceph backfilling push objects to replicas?

When a placement group starts backfilling, it asks the OSD to queue it for recovery. It will eventually be processed and the OSD will ask it to start the recovery operations. Since it is backfilling (this is the original reason … Continue reading

Posted in Code path, ceph | Leave a comment

Ceph RBD live resize with krbd

The Ubuntu precise 3.8 Linux kernel is rebuilt with Laurent Barbe’s patch:

    apt-get source linux-image-3.8.0-27-generic
    cd linux-lts-raring-3.8.0/
    curl | patch -p1
    dpkg-buildpackage -uc -us

and installed with

    dpkg -i ../linux-image-3.8.0-27-generic_3.8.0-27.40~precise3_amd64.deb

Live resize of a mounted ext4 file system can … Continue reading

Posted in ceph | 2 Comments

Ceph release name: Kraken

A few steps away from Galerie Deborah Zafman, after turning onto rue des Gravilliers in Paris, France, I looked up and saw a painting of a Kraken and thought it could be the name of a Ceph release in 2015, … Continue reading

Posted in ceph | Leave a comment

Ceph replication vs erasure coding

Ceph implements resilience through replication. An erasure coded backend is being worked on. The following diagram compares the two and is hopefully somewhat self-explanatory. It was created in the context of the Ceph BOF at OSCON and is … Continue reading

Posted in ceph | 1 Comment

Threads and unit tests in Ceph

To assert that a tested method calls Cond::Wait it is run in a separate Thread. The calling googletest function uses the same Mutex to assert that the child thread is waiting as expected. For instance, the SharedPtrRegistry::lookup method will Cond::Wait … Continue reading

Posted in ceph | Leave a comment

Ceph early adopter: Université de Nantes

The Université de Nantes started using Ceph for backups in early 2012, before Bobtail was released or Inktank founded. The IRTS department, under the lead of Yann Dupont, created a twelve-node Ceph cluster to store backups. It contains the … Continue reading

Posted in ceph, debian, openstack | 8 Comments

Anatomy of ObjectContext, the Ceph in core representation of an object

An ObjectContext is created when a ReplicatedPG applies operations on an object. Read/write mutual exclusion: the C_OSD_OndiskWriteUnlock callback is registered to be called after a transaction (read in this case) completes. It will signal the writes and reads waiting if … Continue reading

Posted in Code path, ceph | Leave a comment

How does AccessMode control read/write processing in Ceph?

An operation (read, write, etc.) may be added to the mode.waiting queue if the ReplicatedPG::AccessMode does not allow it yet. For instance, if an operation may_write but AccessMode::try_write finds the current state to be RMW_FLUSHING, it will … Continue reading

Posted in Code path, ceph | Leave a comment

How does a Ceph OSD handle a write message? (up to Emperor)

When an OSD handles an operation, it is queued to a PG: it is added to the op_wq work queue (or to the waiting_for_map list if the queue_op method of PG finds that it must wait for an OSDMap) … Continue reading

Posted in Code path, ceph | 1 Comment

How do Ceph placement groups use Watch?

ReplicatedPG::prepare_transaction (which is called when handling a message) will call ReplicatedPG::do_osd_op_effects if the message is handled successfully. do_osd_op_effects iterates over ReplicatedPG::watch_connects, which is set with a watcher including a cookie, a timeout (hard-coded to 30 seconds … Continue reading

Posted in Code path, ceph | Leave a comment