Ceph relies on Paxos to maintain a quorum among monitor services so that they agree on cluster state. In some cases Ceph can lose quorum, such as when hosts are added to and removed from the cluster in quick succession without removing the old hosts from Ceph (see Adding/Removing Hosts).
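As a rough illustration of why stale hosts matter: Paxos needs a strict majority of the monitors listed in the monitor map, so entries for hosts that no longer exist still count toward the total. The sketch below is plain shell arithmetic; the function names quorum_size and has_quorum are hypothetical and are not part of Ceph or Deis.

```shell
# Smallest number of monitors that forms a strict majority of the monitor map.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Succeeds if `alive` monitors out of `total` listed monitors can form a quorum.
# Usage: has_quorum <total> <alive>
has_quorum() {
  local total="$1" alive="$2"
  [ "$alive" -ge "$(quorum_size "$total")" ]
}
```

For example, if the monitor map lists six monitors but only three are still alive, has_quorum 6 3 fails, because quorum_size 6 is four.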
A telltale sign of quorum loss is that querying cluster health with ceph -s times out with monitor faults on every host in the cluster.
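A check along these lines can make that symptom easier to spot. This is a minimal sketch, assuming it runs somewhere the ceph CLI is available (for example inside deis-store-admin) and that the installed ceph client supports the --connect-timeout option; check_quorum is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if the monitors answer `ceph -s` in time.
# Usage: check_quorum [timeout-seconds]
check_quorum() {
  local secs="${1:-10}"
  # --connect-timeout makes the client give up rather than wait indefinitely.
  if ceph --connect-timeout "$secs" -s >/dev/null 2>&1; then
    echo "monitors responded"
  else
    echo "no response within ${secs}s; quorum may be lost"
    return 1
  fi
}
```

A failure here is only a hint: run the check on several hosts before concluding that quorum is lost, since a single failure can also come from a local configuration or network problem.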
Important
Ceph refusing to do anything when it has lost quorum is a safety precaution to prevent you from losing data. Attempting to recover from this situation requires knowledge about the state of your cluster, and should only be attempted if data loss is not considered catastrophic (such as when a recent backup is available). When in doubt, consult the Ceph and Deis communities for assistance. Deis recommends regular backups to minimize impact should an issue like this occur. For more information, see Backing Up and Restoring Data.
The instructions below are intentionally vague, as each recovery scenario will be unique. They are intended only to point users in the right direction for recovery.
To recover from Ceph quorum loss:
1. Confirm that quorum is lost: ceph -s shows nothing but timeouts and/or monitor faults.
2. Run deisctl stop platform so components stop trying to write data to store. (Alternatively, manually stopping all components except router will allow application containers to remain up, unaffected.)
3. Remove the dead monitors from /deis/store/hosts so that they are not written out to clients.
4. Update /deis/store/monSetupLock to point to the healthy monitor. This isn't strictly necessary, as this value is only used when wiping clean and starting a fresh cluster from scratch with no data, but it's good cleanup.
5. Prepare a monmap that no longer contains the dead monitors (monmaptool --rm mon.<hostname> --clobber /etc/ceph/monmap).
6. Use deis-store-admin to inject the prepared monmap into the monitor with ceph-mon -i <hostname> --inject-monmap /etc/ceph/monmap.
7. Confirm that the monitor has recovered (ceph -s and/or query mon_status on the admin socket).
8. Start the OSDs with deisctl start store-daemon.
9. Run ceph osd dump. For each OSD that no longer exists, follow Removing an OSD, taking care to ensure that the data is relocated (watch the health with ceph -w) before marking another OSD as out.
10. Start the metadata and gateway services with deisctl start store-metadata and deisctl start store-gateway.
11. Start store-volume with deisctl start store-volume.
12. Run deisctl start platform.
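The monmap steps above (removing the dead monitors, then injecting the result into the surviving monitor) can be sketched as a small shell helper. This is only an illustration under stated assumptions: repair_monmap, run, DRY_RUN, and MONMAP are hypothetical names, the commands are the ones quoted in the steps, and the helper is assumed to run where monmaptool and ceph-mon are available (for example inside deis-store-admin) while the monitor itself is stopped.

```shell
# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: drop each dead monitor from the monmap, then inject
# the result into the surviving monitor.
# Usage: repair_monmap <healthy-hostname> <dead-hostname>...
repair_monmap() {
  local monmap="${MONMAP:-/etc/ceph/monmap}"
  if [ "$#" -lt 2 ]; then
    echo "usage: repair_monmap <healthy-hostname> <dead-hostname>..." >&2
    return 2
  fi
  local healthy="$1"
  shift
  local dead
  for dead in "$@"; do
    # Stop at the first failure so a half-edited monmap is never injected.
    run monmaptool --rm "mon.${dead}" --clobber "$monmap" || return 1
  done
  run ceph-mon -i "$healthy" --inject-monmap "$monmap"
}
```

Running it first as DRY_RUN=1 repair_monmap node1 node2 node3 (the hostnames are placeholders) prints the commands without touching the monmap, which is a cheap way to review the plan before committing to it.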