Among the core functions of my homelab is a storage environment based on Ceph. For months I’ve been looking for, buying, and preparing new hardware and a server rack for an update to my lab. For the last week, I’ve been moving data from the old nodes to the new nodes. Today enough data had moved that I could completely shut down one old node and transfer its hard drives into the new machines. These are my notes on cleaning the drive partitions, preparing the flash device partitions, and adding the OSDs to the new cluster.
Wipe The Drives
I shut down the old node and pulled the hardware without removing any data from the old drives – just in case there was a need to restore something to the old cluster. Luckily that was not the case, and I moved forward with wiping the drives using the following command.
root@titan:~# wipefs -a /dev/sdc
/dev/sdc: 8 bytes were erased at offset 0x00000218 (LVM2_member): 4c 56 4d 32 20 30 30 31
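If you are wiping several drives pulled from the same node, a small loop saves some typing. This is only a sketch – the device names below (/dev/sdc through /dev/sde) are examples, so confirm each drive against its size, model, and serial number before erasing anything.
# List the candidate drives with model and serial so the right disks get wiped
lsblk -d -o NAME,SIZE,MODEL,SERIAL
# Example device list -- replace with the drives pulled from the old node
for dev in /dev/sdc /dev/sdd /dev/sde; do
    wipefs -a "$dev"
done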
Check For LVM Related Data
Some of my old drives were already using LVM and BlueStore. If you try to prepare an old drive that still has PV (physical volume) or LV (logical volume) data on it, the ceph-volume prepare command will fail with something similar to this:
root@europa:~# ceph-volume lvm prepare --bluestore --dmcrypt --data /dev/sdc --block.wal /dev/fioa3 --block.db /dev/fioa4
...
stderr: Physical volume '/dev/sdc' is already in volume group 'ceph-eebc4ef5-712b-4924-b70c-1df6269fc9a4'
Unable to add physical volume '/dev/sdc' to volume group 'ceph-eebc4ef5-712b-4924-b70c-1df6269fc9a4'
/dev/sdc: physical volume not initialized.
...
--> RuntimeError: command returned non-zero exit status: 5
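Before running prepare on a recycled drive, it is worth checking whether LVM still claims it. These are standard LVM and util-linux commands; /dev/sdc is just the device from the example above.
# Show which VG the drive still belongs to (reports an error if the drive is not a PV)
pvs --noheadings -o pv_name,vg_name /dev/sdc
# Any leftover filesystem or LVM signatures show up here as well
lsblk -o NAME,FSTYPE,TYPE /dev/sdc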
Remove LVM Related Data
When you need to remove LVM data from a drive, pvdisplay (to get the VG name) and vgremove are the easiest way to solve the problem. Make sure you are looking at the correct device; I shortened the output below.
root@europa:~# pvdisplay
...
PV Name /dev/sdc
VG Name ceph-eebc4ef5-712b-4924-b70c-1df6269fc9a4
PV Size <7.28 TiB / not usable <1.34 MiB
Allocatable yes (but full)
PE Size 4.00 MiB
Total PE 1907721
Free PE 0
Allocated PE 1907721
PV UUID LIe071-C7gV-q1tq-iAAb-3V3p-ZA3i-3VEJZX
Then remove the volume group (and the logical volume it contains) with vgremove, confirming that you want to remove both the logical volume and the volume group.
root@europa:~# vgremove ceph-eebc4ef5-712b-4924-b70c-1df6269fc9a4
Do you really want to remove volume group "ceph-eebc4ef5-712b-4924-b70c-1df6269fc9a4" containing 1 logical volumes? [y/n]: y
Do you really want to remove and DISCARD active logical volume ceph-eebc4ef5-712b-4924-b70c-1df6269fc9a4/osd-block-404a4208-0d30-4b9a-a7a1-87a1898e924b? [y/n]: y
Logical volume "osd-block-404a4208-0d30-4b9a-a7a1-87a1898e924b" successfully removed
Volume group "ceph-eebc4ef5-712b-4924-b70c-1df6269fc9a4" successfully removed
Prepare the WAL and DB Devices
I was lucky enough to get my hands on some cheap IOFusion devices (these are EOL (End of Life), so using them in a production cluster would not be recommended). That warning aside, these drives are awesome and are sized just about right for my cluster. I used gdisk to prepare the new partitions: 1GB for the WAL portion of the device and 80GB for the DB (metadata) portion (roughly 10% of the storage device).
root@ganymede:~# gdisk /dev/fioa
GPT fdisk (gdisk) version 1.0.3
Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present
Found valid GPT with protective MBR; using GPT.
Command (? for help): n
Partition number (3-128, default 3):
First sector (6-244140619, default = 20971776) or {+-}size{KMGTP}:
Last sector (20971776-244140619, default = 244140619) or {+-}size{KMGTP}: +1G
Current type is 'Linux filesystem'
Hex code or GUID (L to show codes, Enter = 8300):
Changed type of partition to 'Linux filesystem'
Command (? for help): n
Partition number (4-128, default 4):
First sector (6-244140619, default = 21233920) or {+-}size{KMGTP}:
Last sector (21233920-244140619, default = 244140619) or {+-}size{KMGTP}: +80G
Current type is 'Linux filesystem'
Hex code or GUID (L to show codes, Enter = 8300):
Changed type of partition to 'Linux filesystem'
Command (? for help): x
Expert command (? for help): c
Partition number (1-4): 3
Enter the partition's new unique GUID ('R' to randomize): R
New GUID is 302BDE02-F625-4B33-80F5-5EE0254AADB9
Expert command (? for help): c
Partition number (1-4): 4
Enter the partition's new unique GUID ('R' to randomize): R
New GUID is 2F4EF305-A7BA-42E0-B690-3D3CDCF28B29
Expert command (? for help): w
Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
PARTITIONS!!
Do you want to proceed? (Y/N): y
OK; writing new GUID partition table (GPT) to /dev/fioa.
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot or after you
run partprobe(8) or kpartx(8)
The operation has completed successfully.
root@ganymede:~# partprobe /dev/fioa
A quick note about the above: notice that I dropped into expert (x) mode and set a random GUID (c, then R) on each of the new partitions. Be sure to run partprobe after you finish adding the new partitions and their new GUIDs.
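The same layout can be created non-interactively with sgdisk, which is convenient when several flash devices need identical partitions. This is a sketch under the same assumptions as above (a 1GB WAL partition and an 80GB DB partition on /dev/fioa, landing as partitions 3 and 4); adjust the sizes, partition numbers, and device name for your hardware.
# Create the WAL and DB partitions (0 = next free partition number, default start sector)
sgdisk --new=0:0:+1G /dev/fioa
sgdisk --new=0:0:+80G /dev/fioa
# Give each new partition a random unique GUID (the equivalent of gdisk's x, c, R)
sgdisk --partition-guid=3:R /dev/fioa
sgdisk --partition-guid=4:R /dev/fioa
# Tell the kernel about the new partition table
partprobe /dev/fioa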
Prepare and Activate the OSD
At this point all you should have to do is prepare the OSD.
root@ganymede:~# ceph-volume lvm prepare --bluestore --dmcrypt --data /dev/sdc --block.wal /dev/fioa3 --block.db /dev/fioa4
...
--> ceph-volume lvm prepare successful for: /dev/sdc
Then activate the OSD.
root@ganymede:~# ceph-volume lvm activate --all
...
--> ceph-volume lvm activate successful for osd ID: 4
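Once activated, the new OSD should join the cluster within a few moments. A couple of standard checks to confirm everything landed where expected; the OSD ID (4 here) comes from the activate output above.
# Confirm the new OSD is up and sits in the right place in the CRUSH tree
ceph osd tree
# Watch the cluster rebalance data onto the new OSD
ceph -s
# Show the LVM layout ceph-volume created, including the WAL and DB devices
ceph-volume lvm list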