Wainlux K6: Windows Driver and Application Download

Let me start by saying just how much I dislike the software and website behind this machine: not only are the driver and software hard to find, but the company's drivers fail to enable the USB-to-UART bridge and it does not redirect to working ones. With that said, here is what you need to make the device work on Windows 10.

  1. You’ll want the Silicon Labs USB to UART bridge driver.
  2. The jar file for their application and a copy of java which is bundled (!?): http://www.wainlux.com/download/K6Windows.rar

To run the application, open cmd as administrator: I press Windows+R, type cmd, then press Ctrl+Shift+Enter to launch it elevated. Switch directories with cd C:\diao\bin\ and run the command java -jar diao.jar.

The application will start at this point and you should be able to work with the device properly. Good luck.

Activating Logical Volume on IO-Memory Device at Startup (Ubuntu 18.04 Bionic)

For a while I’ve had a problem with OSDs living on an fio device failing to start automatically after a reboot. The logical volume sitting on the device was not being activated, and eventually I traced the problem to udev.

The line to modify lives in /lib/udev/rules.d/60-persistent-storage.rules.

KERNEL!="loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|sd*|sr*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*|scm*|pmem*|nbd*", GOTO="persistent_storage_end"

Add fio* to the end of the list:

KERNEL!="loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|sd*|sr*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*|scm*|pmem*|nbd*|fio*", GOTO="persistent_storage_end"
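If you'd rather script the change, here is a sketch: the sed pattern anchors on the nbd* entry shown above, and the udevadm reload needs root. The function name is my own.

```shell
# Append fio* to the device whitelist in the persistent-storage rules,
# keeping a .bak copy, then ask udev to reload and re-trigger.
add_fio_to_rules() {
    rules=$1
    sed -i.bak 's/|nbd\*"/|nbd*|fio*"/' "$rules"
    udevadm control --reload-rules
    udevadm trigger
}

# e.g. add_fio_to_rules /lib/udev/rules.d/60-persistent-storage.rules
```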

The Henderson Homelab

My Ceph cluster, and everything it runs, exists on three Cisco UCS C240 M3s.

Each node has 10 to 12 WD Red HDDs and two NVMe PCIe cards (one Sandisk Fusion IOMemory SX350 and one Sandisk Fusion IOMemory PX600). Each node includes 128GB of RAM and is connected to two 10Gbps networks (on physically independent switches) for the public and cluster networks. The cluster hosts the following pools:

  • rbd: stored on HDD devices
  • rbd-ssd: stored on SSD devices
  • lxd: stored on SSD devices
  • k8s-rbd: stored on SSD devices

Libvirtd Unable to Connect when Using RBD Storage Pools

I ran across a problem recently where attempting to list virtual machines through virsh and virt-manager was taking ~45 minutes; it turns out that the problem was actually due to this patch in libvirt for using RBD fast-diff. In my case the ‘default’ storage pool is actually a link to my RBD storage pool, and that patch checks for the enabled feature but does not check the flags to see whether the object-map and fast-diff data are invalid.

Good News Everyone!

There has been a recent patch that solves this. Unfortunately, some distributions have not caught up with it yet (looking at you, Ubuntu Bionic). Anyhow, this will hopefully make its way down the various streams that package libvirtd and the problem will be sorted.
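Until that fix reaches your distribution, one possible workaround follows from the bug itself: rebuilding an image's object map clears the invalid object-map/fast-diff flags that send libvirt down the slow path. A sketch only; the pool name is an example, and you should confirm rebuilding is appropriate for your images.

```shell
# Rebuild the object map for any image in a pool whose object-map or
# fast-diff flags are marked invalid in `rbd info` output.
rebuild_invalid_maps() {
    pool=$1
    for img in $(rbd ls "$pool"); do
        if rbd info "$pool/$img" | grep -q 'invalid'; then
            rbd object-map rebuild "$pool/$img"
        fi
    done
}

# e.g. rebuild_invalid_maps rbd
```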

Creating Ceph Bluestore OSDs with Spinning Drives and SSDs for DB/WAL

As a consultant I work with a downstream version of Ceph, so once in a while I like to catch up on new features and functions that have not yet hit the downstream, supported version of the product. That process has led me to setting up my homelab (again) and using Ceph Nautilus as a base for storage.

Using ceph-volume

Ceph comes with a deployment and inspection tool called ceph-volume. Much like the older ceph-deploy tool, ceph-volume will allow you to inspect, prepare, and activate object storage daemons (OSDs). The advantages of ceph-volume include support for LVM and dm-cache, and it no longer relies on or interacts with udev rules.

For my use case I have installed a single Fusion IOMemory card into each of my nodes in order to deploy OSDs with faster storage for the DB and WAL devices. It’s a very good idea to read the BlueStore configuration reference, as BlueStore is the default for new OSD deployments. Take careful note of the recommendations for the use of a DB and WAL device.

If there is only a small amount of fast storage available (e.g., less than a gigabyte), we recommend using it as a WAL device. If there is more, provisioning a DB device makes more sense. The BlueStore journal will always be placed on the fastest device available, so using a DB device will provide the same benefit that the WAL device would while also allowing additional metadata to be stored there (if it will fit).

Bluestore Configuration Reference

In my case, given the Fusion IOMemory card, I want to create enough partitions to support 11 OSDs and make them as large as possible for the DB device (which will put the WAL on the same partition). My fast media offers 931 GB of usable storage; split evenly across all eleven OSDs, that works out to partitions of roughly 84 GB. I like round numbers, so those partitions are now 80 GB in size, and the deployment command looks something like this.

root@ganymede:~# ceph-volume lvm prepare --bluestore --dmcrypt --data /dev/sdd --block.db /dev/fioa5

Be sure to replace the --data argument with your storage device, and point the --block.db argument at the partition on the fast storage you wish to use for the given OSD. After that I run the activation command for all OSDs on the node.

root@ganymede:~# ceph-volume lvm activate --all

Assuming everything has gone as expected, the OSDs will start up and join the cluster, and you’ll get all the speedy goodness of an SSD for the write-ahead log and RocksDB.
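The sizing arithmetic above is easy to sanity-check in the shell. This only prints a plan (the partition names are hypothetical); the numbers are from my setup.

```shell
# 931 GB of fast storage split across eleven OSDs, rounded down
# to neat 80 GB DB partitions (WAL colocated on the same partition).
FAST_GB=931
OSD_COUNT=11
echo "even split: $((FAST_GB / OSD_COUNT)) GB per OSD"
for n in $(seq 1 "$OSD_COUNT"); do
    echo "DB partition for OSD $n: 80 GB"
done
```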

Moving Drives From an Old Ceph Cluster to a New Ceph Cluster

Among the core functions of my homelab is a storage environment based on Ceph. For months I’ve been looking for, buying, and preparing new hardware and a server rack for an update to my lab. For the last week, I’ve been moving data from the old nodes to the new nodes. Today there was enough data moved to completely shut down one old node and transfer the hard drives into the new machines. These are my notes on cleaning the drive partitions, preparing the flash device partitions, and adding the OSDs to the new cluster.

Wipe The Drives

I shut down the old node and pulled the hardware without removing any data from the old drives, just in case there was a need to restore something to the old cluster; luckily that was not the case, and I moved forward with wiping the drives using the following commands.

root@titan:~# wipefs -a /dev/sdc
/dev/sdc: 8 bytes were erased at offset 0x00000218 (LVM2_member): 4c 56 4d 32 20 30 30 31
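With several transplanted drives to clean, the same wipe can be looped. A sketch only; the device list is an example, and wipefs -a is destructive, so double-check it first.

```shell
# Wipe filesystem/LVM signatures from each listed drive in one pass.
wipe_drives() {
    for dev in "$@"; do
        wipefs -a "$dev"
    done
}

# e.g. wipe_drives /dev/sdc /dev/sdd /dev/sde
```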

Check For LVM Related Data

Some of my old drives were already using LVM and BlueStore; if you try to prepare an old drive that has any PV (physical volume) or LV (logical volume) data on it, the ceph-volume prepare command will fail with something similar to this:

root@europa:~# ceph-volume lvm prepare --bluestore --dmcrypt --data /dev/sdc --block.wal /dev/fioa3 --block.db /dev/fioa4
 stderr: Physical volume '/dev/sdc' is already in volume group 'ceph-eebc4ef5-712b-4924-b70c-1df6269fc9a4'
  Unable to add physical volume '/dev/sdc' to volume group 'ceph-eebc4ef5-712b-4924-b70c-1df6269fc9a4'
  /dev/sdc: physical volume not initialized.
-->  RuntimeError: command returned non-zero exit status: 5

Remove LVM Related Data

When you need to remove LVM data from the drive, you’ll find pvdisplay (to get the VG name) and vgremove are the easiest way to solve the problem. Make sure you are looking at the correct device; I’ve shortened the output below.

root@europa:~# pvdisplay
  PV Name               /dev/sdc
  VG Name               ceph-eebc4ef5-712b-4924-b70c-1df6269fc9a4
  PV Size               <7.28 TiB / not usable <1.34 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              1907721
  Free PE               0
  Allocated PE          1907721
  PV UUID               LIe071-C7gV-q1tq-iAAb-3V3p-ZA3i-3VEJZX

Then remove the volume group (which takes the logical volume and PV assignment with it) using the following, confirming that you want to remove the logical volume.

root@europa:~# vgremove ceph-eebc4ef5-712b-4924-b70c-1df6269fc9a4
Do you really want to remove volume group "ceph-eebc4ef5-712b-4924-b70c-1df6269fc9a4" containing 1 logical volumes? [y/n]: y
Do you really want to remove and DISCARD active logical volume ceph-eebc4ef5-712b-4924-b70c-1df6269fc9a4/osd-block-404a4208-0d30-4b9a-a7a1-87a1898e924b? [y/n]: y
  Logical volume "osd-block-404a4208-0d30-4b9a-a7a1-87a1898e924b" successfully removed
  Volume group "ceph-eebc4ef5-712b-4924-b70c-1df6269fc9a4" successfully removed
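The two steps above can be scripted by looking up whichever VG claims the device and removing it. A sketch only; the device path and function name are examples, and since vgremove -y destroys the volume group without prompting, check the pvs output first.

```shell
# Find the VG on a device via pvs and remove it non-interactively.
remove_vg_on() {
    dev=$1
    vg=$(pvs --noheadings -o vg_name "$dev" | tr -d ' ')
    if [ -n "$vg" ]; then
        vgremove -y "$vg"   # -y answers the confirmation prompts
    fi
}

# e.g. remove_vg_on /dev/sdc
```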

Prepare the WAL and DB Devices

I was lucky enough to get my hands on some cheap IOFusion devices (these are EOL (End of Life), so using them in a production cluster would not be recommended). That warning aside, these drives are awesome and are sized just about right for my cluster. I used gdisk to prepare new partitions: 1GB for the WAL portion of the device and 80GB for the DB (metadata) portion (roughly 10% of the storage device).

root@ganymede:~# gdisk /dev/fioa
GPT fdisk (gdisk) version 1.0.3

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.

Command (? for help): n
Partition number (3-128, default 3):
First sector (6-244140619, default = 20971776) or {+-}size{KMGTP}:
Last sector (20971776-244140619, default = 244140619) or {+-}size{KMGTP}: +1G
Current type is 'Linux filesystem'
Hex code or GUID (L to show codes, Enter = 8300):
Changed type of partition to 'Linux filesystem'

Command (? for help): n
Partition number (4-128, default 4):
First sector (6-244140619, default = 21233920) or {+-}size{KMGTP}:
Last sector (21233920-244140619, default = 244140619) or {+-}size{KMGTP}: +80G
Current type is 'Linux filesystem'
Hex code or GUID (L to show codes, Enter = 8300):
Changed type of partition to 'Linux filesystem'

Command (? for help): x

Expert command (? for help): c
Partition number (1-4): 3
Enter the partition's new unique GUID ('R' to randomize): R
New GUID is 302BDE02-F625-4B33-80F5-5EE0254AADB9

Expert command (? for help): c
Partition number (1-4): 4
Enter the partition's new unique GUID ('R' to randomize): R
New GUID is 2F4EF305-A7BA-42E0-B690-3D3CDCF28B29

Expert command (? for help): w

Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING

Do you want to proceed? (Y/N): y
OK; writing new GUID partition table (GPT) to /dev/fioa.
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot or after you
run partprobe(8) or kpartx(8)
The operation has completed successfully.
root@ganymede:~# partprobe /dev/fioa

A quick note about the above: notice that I dropped into expert (x) mode and set a random GUID (c, then R) on each of the new partitions. Be sure to run partprobe after you finish adding the new partitions and their new GUIDs.
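The interactive session above can also be done non-interactively with sgdisk, which is handy when repeating this on several nodes. A sketch only; same partition numbers and sizes as the gdisk session, and the device path is an example.

```shell
# Create the 1 GB (partition 3) and 80 GB (partition 4) partitions,
# randomize their unique GUIDs (the expert-mode c/R steps), and make
# the kernel re-read the partition table.
make_fast_parts() {
    dev=$1
    sgdisk --new=3:0:+1G "$dev"          # partition 3: 1 GB (used as WAL)
    sgdisk --new=4:0:+80G "$dev"         # partition 4: 80 GB (used as DB)
    sgdisk --partition-guid=3:R "$dev"   # R = random unique GUID
    sgdisk --partition-guid=4:R "$dev"
    partprobe "$dev"
}

# e.g. make_fast_parts /dev/fioa
```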

Prepare and Activate the OSD

At this point all you should have to do is prepare the OSD.

root@ganymede:~# ceph-volume lvm prepare --bluestore --dmcrypt --data /dev/sdc --block.wal /dev/fioa3 --block.db /dev/fioa4
--> ceph-volume lvm prepare successful for: /dev/sdc

Then activate the OSD.

root@ganymede:~# ceph-volume lvm activate --all
--> ceph-volume lvm activate successful for osd ID: 4

Mounting CephFS From Multiple Clusters to a Single Machine using FUSE

For my new homelab cluster I’ve built up a fresh Ceph filesystem to store certain chunks of my data, and found the need to mount the filesystems from both the new and old clusters on one of my nodes. Normally I mount with ceph-fuse through /etc/fstab, so I simply modified it with the following.

root@storage:~# grep fuse /etc/fstab
none	/mnt/storage/ceph	fuse.ceph	ceph.id=admin,ceph.conf=/etc/ceph/ceph.conf,_netdev,defaults  0 0
none	/mnt/storage/ceph-old	fuse.ceph	ceph.id=admin,ceph.conf=/etc/ceph-old/ceph.conf,_netdev,defaults  0 0

The /etc/ceph-old/ directory is a copy of my config files from the older cluster. In the /etc/ceph-old/ceph.conf file I added the following, since the keyring for that cluster is not in the default path.

keyring = /etc/ceph-old/ceph.client.admin.keyring

Any time the ceph.conf from the old cluster is used, so is the old keyring, and the cluster mounts up just fine.

Filesystem     Type            Size  Used Avail Use% Mounted on
ceph-fuse      fuse.ceph-fuse  100T   91T  9.4T  91% /mnt/storage/ceph-old

FreeIPA Certificates Displays CertificateOperationError

A fresh install of FreeIPA from the Ubuntu Bionic package displays an error on the ‘Certificates’ page which reads:

IPA Error 4301: CertificateOperationError
Certificate operation cannot be completed: Unable to communicate with CMS (Start tag expected, '<' not found, line 1, column 1)

After doing some research on the problem, it seems to have already been resolved upstream and in the Ubuntu Cosmic distribution; however, the backport has not yet hit Ubuntu Bionic. I was able to safely apply this commit to the dogtag.py file at /usr/lib/python2.7/dist-packages/ipapython, then restart FreeIPA, and all was well.

root@ipa:~# ipactl restart
Stopping pki-tomcatd Service
Restarting Directory Service
Restarting krb5kdc Service
Restarting kadmin Service
Restarting named Service
Restarting httpd Service
Restarting ipa-custodia Service
Restarting pki-tomcatd Service
Restarting ipa-otpd Service
Restarting ipa-dnskeysyncd Service
ipa: INFO: The ipactl command was successful

Ubuntu Bionic (actually cloud-init) Reverting Hostname on Reboot

If you’ve changed the hostname on an Ubuntu Bionic install, restarted the node, then found that the hostname has reverted you may be wondering why this has happened. The problem actually stems from the cloud-init scripts and the ‘preserve_hostname’ option.

root@ipa:~# grep -H -n preserve /etc/cloud/cloud.cfg
/etc/cloud/cloud.cfg:15:preserve_hostname: false

Change the variable to true; the next time you change the hostname and reboot, it will be left intact.
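The flip can be done with a one-line sed. A sketch only; the function name is mine, and the pattern assumes the stock "preserve_hostname: false" line from the grep above.

```shell
# Rewrite preserve_hostname from false to true in place, keeping a
# .bak copy of the original config.
flip_preserve_hostname() {
    cfg=$1
    sed -i.bak 's/^preserve_hostname: false$/preserve_hostname: true/' "$cfg"
}

# e.g. flip_preserve_hostname /etc/cloud/cloud.cfg
```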

FreeIPA WebUI Login Fails with “Login failed due to an unknown reason.”

I’ve been working on a fresh install of my homelab and have been trying to get FreeIPA to work on Ubuntu Bionic. If you happen to see the “Login failed due to an unknown reason.” error while trying to log in through the web UI, try adding execute permissions for all users to the “/var/lib/krb5kdc/” directory.

root@ipa:~# chmod a+x /var/lib/krb5kdc

Try to log in after that and, if the problem was the same as my own, you’ll find it working now.