In this scenario I needed to reinstall a Proxmox node that had a Ceph OSD, and I did not want to rebuild the SSD. In this example: Proxmox 6.3, Ceph Octopus (15.2).
- Remove the node from Ceph monitors and managers
pveceph mon destroy pvetest3
pveceph mgr destroy pvetest3
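If you want to double-check that pvetest3 is really gone from the monitor and manager maps before re-installing, the standard status commands (run from another node of the cluster) should be enough:
ceph mon stat    # pvetest3 should no longer appear in the monmap
ceph -s          # the mon/mgr summary should not mention pvetest3 anymore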
- Now re-install the node
After the installation finishes, add the node back to the cluster (the old membership needs to be removed first).
Remove it first (from another node in the cluster):
pvecm delnode pvetest3
Now add it back (run this from the node itself, pointing at an existing node in the cluster):
pvecm add 10.255.202.65
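To verify the node rejoined the cluster, the usual pvecm commands can be used (purely a sanity check, not part of the recovery itself):
pvecm status    # quorum info; the reinstalled node should be counted again
pvecm nodes     # pvetest3 should be back in the member list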
- Install Ceph
pveceph install -version octopus
Reboot the node just in case. Then I needed to make a link to the Ceph configuration file:
ln -s /etc/pve/ceph.conf /etc/ceph/ceph.conf
Now ‘ceph -s’ should show the cluster information.
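A quick sanity check at this point (assuming the symlink above is in place and the node can reach the monitors):
ls -l /etc/ceph/ceph.conf    # should be a symlink to /etc/pve/ceph.conf
ceph -s                      # should print the cluster status without errors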
- Activate the OSD (automatic option)
The OSD is still down; the existing OSD needs to be activated:
ceph-volume lvm activate --all
If the OSD was created by an earlier version of Ceph, it could be a simple (non-LVM) volume. In that case the activation looks like this:
ceph-volume simple activate --all
That should register the OSD in the Ceph inventory, and you should be able to start it using the systemctl start ceph-osd@<id> command.
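After activation it is worth confirming that the OSD service is actually running and that Ceph reports the OSD as up (replace <id> with your OSD id, 2 in the example below):
systemctl status ceph-osd@<id>    # should be active (running)
ceph osd tree                     # the OSD should be listed as up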
- Activate the OSD (manual option)
The procedure below is the one I used before I learned about the --all option. Kept here just for reference.
Find out the required information:
root@pvetest3:/var/log/ceph# ceph-volume inventory
Device Path               Size         rotates available Model name
/dev/sda                  12.00 GB     True    False     QEMU HARDDISK
/dev/sdb                  10.00 GB     True    False     QEMU HARDDISK
root@pvetest3:/var/log/ceph# ceph-volume inventory /dev/sdb
====== Device report /dev/sdb ======

     path                      /dev/sdb
     available                 False
     rejected reasons          Insufficient space (<5GB) on vgs, LVM detected, locked
     device id                 QEMU_HARDDISK_drive-scsi1
     removable                 0
     ro                        0
     vendor                    QEMU
     model                     QEMU HARDDISK
     sas address
     rotational                1
     scheduler mode            mq-deadline
     human readable size       10.00 GB
    --- Logical Volume ---
     name                      osd-block-a1d45662-bacd-40b5-8731-c240dec53ebd
     osd id                    2
     cluster name              ceph
     type                      block
     osd fsid                  a1d45662-bacd-40b5-8731-c240dec53ebd
     cluster fsid              2bcb5b34-973f-4496-919c-9f3bdd02da4b
     osdspec affinity
     block uuid                n1xad2-ZMCr-EC0C-st8f-NaE1-wd8k-pHmvgx
Use the osd id (2 in this example) and the osd fsid to re-create the OSD:
root@pvetest3:/var/log/ceph# ceph-volume lvm activate --bluestore 2 a1d45662-bacd-40b5-8731-c240dec53ebd
Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-2
--> Executable selinuxenabled not in PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-2
Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-388d0ef5-3a4f-487e-8348-1eff6361a4bb/osd-block-a1d45662-bacd-40b5-8731-c240dec53ebd --path /var/lib/ceph/osd/ceph-2 --no-mon-config
Running command: /usr/bin/ln -snf /dev/ceph-388d0ef5-3a4f-487e-8348-1eff6361a4bb/osd-block-a1d45662-bacd-40b5-8731-c240dec53ebd /var/lib/ceph/osd/ceph-2/block
Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-2/block
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-0
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-2
Running command: /usr/bin/systemctl enable ceph-volume@lvm-2-a1d45662-bacd-40b5-8731-c240dec53ebd
stderr: Created symlink /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-2-a1d45662-bacd-40b5-8731-c240dec53ebd.service → /lib/systemd/system/ceph-volume@.service.
Running command: /usr/bin/systemctl enable --runtime ceph-osd@2
stderr: Created symlink /run/systemd/system/ceph-osd.target.wants/ceph-osd@2.service → /lib/systemd/system/ceph-osd@.service.
Running command: /usr/bin/systemctl start ceph-osd@2
--> ceph-volume lvm activate successful for osd ID: 2
In my case that pretty much recovered the OSD.
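As a last check (just the usual status commands), confirm that the recovered OSD is up and the cluster is healthy again:
ceph osd tree    # osd.2 should be back under pvetest3 and marked up
ceph -s          # health should return to HEALTH_OK once recovery finishes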